Year : 2017  |  Volume : 6  |  Issue : 4  |  Page : 87-93

Explorative Analysis of Motorcyclists' Injury Severity Pattern at a National Level in Iran

1 Department of Transportation Engineering, School of Civil Engineering; Road Safety Research Center, University of Science and Technology, Tehran, Iran
2 Department of Transportation Engineering, School of Civil Engineering, University of Science and Technology, Tehran, Iran

Date of Web Publication: 20-Feb-2018

Correspondence Address:
Dr. Ali Tavakoli Kashani
School of Civil Engineering, Iran University of Science and Technology, Tehran

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/atr.atr_38_17


Objectives: This study aimed to examine the hidden patterns of motorcycle crashes and riders' injury severity at the national level in Iran. Methods: Hierarchical clustering (HC) and latent class clustering (LCC) techniques were used in combination to analyze riders' injury patterns in 6638 motorcycle crashes that occurred in Iran during 2009–2012. First, HC was performed to classify the provinces into homogeneous groups based on the percentage of different crash factors in each province, and a new variable called “province group” was added to the crash database as the output of the HC analysis. Next, LCC was conducted to cluster the crash data and to investigate the riders' injury pattern across the country. Results: Among the six crash clusters identified by the LCC, Clusters 1 and 5, in which 91% and 84% of the riders, respectively, were under 30 years old, and Cluster 2, in which 65% of the riders were above 30 years old, had the highest percentages of injured motorcyclists (86%, 84%, and 88%, respectively). Cluster 5 also had the lowest percentages of helmet usage (about 5%) and licensed riders (5%). Moreover, Cluster 6 had the highest fatality rate among the six clusters. In this cluster, 73% of the crashes occurred in nonresidential/agricultural land uses, and 94% occurred in rural areas. Conclusions: Since a significant share of the crashes in Cluster 5 occurred in province Groups C and E, this might indicate weak enforcement of helmet and licensure laws in these provinces. In addition, as the pattern of helmet usage differed among province groups, future studies might examine motorcyclists' helmet-wearing intentions across provinces. Moreover, crashes that occurred on rural roads, particularly in the vicinity of nonresidential or agricultural land uses, were more severe and need special attention in the future.

Keywords: Cluster analysis, data mining, injuries and trauma, motorcycle accidents

How to cite this article:
Kashani AT, Mohammadian A, Besharati MM. Explorative Analysis of Motorcyclists' Injury Severity Pattern at a National Level in Iran. Arch Trauma Res 2017;6:87-93

How to cite this URL:
Kashani AT, Mohammadian A, Besharati MM. Explorative Analysis of Motorcyclists' Injury Severity Pattern at a National Level in Iran. Arch Trauma Res [serial online] 2017 [cited 2024 Mar 5];6:87-93. Available from: https://www.archtrauma.com/text.asp?2017/6/4/87/225892

Introduction

Motorcycle riders experience high rates of injury in traffic crashes in developing countries.[1] The motorcycle is a popular transportation mode in Iran and is, unfortunately, involved in a significant share of fatal crashes.[2] According to the Iran Forensic Medicine Organization report, 30,901 motorcyclists were killed in traffic crashes that occurred during 2006–2010 in Iran, which is about 25.7% of the total traffic crash fatalities during this period. This demonstrates the worrying state of motorcycle safety in Iran. Therefore, the current study focused on investigating the pattern of crashes and injury severity of motorcyclists at a national level in Iran.

Several previous studies have investigated the effect of different contributory factors on motorcyclists' injury severity. Vlahogianni et al.[3] have provided a review of the literature on motorcycle crashes. In this regard, previous studies have reported that rider age,[4] helmet usage,[5],[6] having a pillion passenger,[5],[7] being an unlicensed male rider, and alcohol consumption [4],[8] might influence motorcycle crash severity.

Other factors such as the engine capacity,[9] and production year of the motorcycle [4] as well as roadway and environmental factors such as geometric design, road type, pavement condition, weather condition, area type (i.e., urban or rural), and illumination [3] have also been found to affect the severity of motorcycle crashes.

From the viewpoint of analytical tools, data mining techniques have been widely used by traffic safety researchers in recent years to examine the factors that might contribute to traffic crash frequency and the injury severity of different road users.[6],[10] In this regard, several clustering methods have been used as a common preliminary tool to mine large crash databases and group the data into more homogeneous clusters.[11],[12],[13]

Overall, a review of the literature indicates that very few previous studies have focused on the analysis of motorcyclists' injury severity at a national level. Therefore, the current study explores the pattern of motorcyclists' injury severity in traffic crashes that occurred in the 31 provinces of Iran, using a two-step analysis framework.

Procedure

Crash data

The data pertaining to motorcycle crashes that occurred in Iran during 2009–2012 were used in the current study. These data were collected by traffic police officers at the crash scene using a Traffic Crash Record form, called KAM114, which contains important information about several features of crashes. Since the aim of the present study was to identify the factors influencing motorcyclists' injury severity, the motorcycle crash data were extracted from the original database.

After cleaning the database, 6638 data records were prepared for analysis, and fourteen variables were considered in the current study. [Table 1] presents the study variables and the subcategories of each variable. Regarding the two variables “crash severity” and “rider's injury severity,” it should be noted that crash severity refers to the severity of the crash for all the involved parties: the crash was considered fatal (or injury) if at least one of the involved parties died (or was injured) in the crash. In contrast, the variable “rider's injury severity” reports only the injury severity of the rider. For example, if a motorcycle-pedestrian crash caused an injury to the pedestrian but no injury to the motorcyclist, the crash severity would be labeled as “injury” and the rider's injury severity would be labeled as “no injury.”
Table 1: Variable description
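
The distinction between the two severity variables can be illustrated with a minimal sketch. The category labels and the function name `crash_severity` are assumptions made for illustration; the actual KAM114 coding is not reproduced in this article.

```python
# Hypothetical severity labels, ordered from least to most severe.
SEVERITY_ORDER = ["no injury", "injury", "fatal"]


def crash_severity(party_severities):
    """Crash severity is the worst outcome among all involved parties."""
    return max(party_severities, key=SEVERITY_ORDER.index)


# Motorcycle-pedestrian crash: the pedestrian is injured, the rider is not.
parties = {"rider": "no injury", "pedestrian": "injury"}

crash_label = crash_severity(parties.values())  # severity over all parties
rider_label = parties["rider"]                  # severity of the rider only

print(crash_label)  # injury
print(rider_label)  # no injury
```

This mirrors the example in the text: the crash is labeled “injury” while the rider's injury severity is “no injury.”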


Analysis procedure

Since the provinces under study had significantly unequal crash frequencies and fatalities (due to unequal populations), it was not possible to simply group the provinces according to the raw crash frequencies in each province. For example, more than 50% of the crashes occurred in Tehran Province.

To overcome this problem, a new analysis framework was adopted, which is presented in [Figure 1]. As shown in this figure, the percentages of crashes in the subcategories of each variable were first calculated for each of the provinces. The provinces were then clustered into homogeneous groups using hierarchical clustering (HC) analysis, and a new variable called “province group” was introduced as the analysis output. This variable, which represents the group of provinces to which each data record belongs, was then added to the crash database for further analysis. In the next step, a latent class clustering (LCC) analysis was performed using the newly introduced variable as well as the other variables in [Table 1], aiming to group the motorcycle crashes into homogeneous clusters and explore hidden patterns of motorcyclists' injury severity among the clusters.
Figure 1: Analytical framework of this study
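
The first step of this framework, converting raw crash counts into within-province percentages so that large and small provinces become comparable, might be sketched as follows. The field names, province labels, and records are hypothetical and serve only to show the calculation.

```python
from collections import Counter, defaultdict


def province_profiles(records, variables):
    """Return {province: {(variable, category): percent of that province's crashes}}."""
    counts = defaultdict(Counter)  # province -> counts of (variable, category)
    totals = Counter()             # province -> number of crash records
    for rec in records:
        prov = rec["province"]
        totals[prov] += 1
        for var in variables:
            counts[prov][(var, rec[var])] += 1
    return {
        prov: {key: 100.0 * n / totals[prov] for key, n in counts[prov].items()}
        for prov in totals
    }


# Hypothetical toy records; real data would come from the crash database.
records = [
    {"province": "P1", "helmet": "yes", "area": "urban"},
    {"province": "P1", "helmet": "no", "area": "urban"},
    {"province": "P1", "helmet": "no", "area": "rural"},
    {"province": "P1", "helmet": "no", "area": "urban"},
    {"province": "P2", "helmet": "yes", "area": "rural"},
    {"province": "P2", "helmet": "yes", "area": "rural"},
]

profiles = province_profiles(records, ["helmet", "area"])
print(profiles["P1"][("helmet", "no")])  # 75.0
```

Because each profile is expressed in percentages, a province with many crashes does not dominate the grouping simply because of its size.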


Cluster analysis

Hierarchical clustering

HC follows one of two strategies: divisive or agglomerative. Divisive methods are “top-down” approaches in which all records start in one cluster, and splits are performed recursively as one moves down the hierarchy. Agglomerative methods are “bottom-up” approaches in which each record starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. The Ward linkage algorithm, which is an agglomerative method and has been widely used in similar studies,[14] was employed in the current study to group the provinces.
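
To make the agglomerative idea concrete, the following is a minimal pure-Python sketch of Ward's method, in which the pair of clusters whose merger least increases the within-cluster sum of squares is merged at each step. It is an illustration under assumed toy data, not the software used in the study; in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be preferred.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering using Ward's merge cost.

    points: list of equal-length numeric vectors (e.g. province percentage profiles).
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        na, nb = len(a[0]), len(b[0])
        d2 = sum((x - y) ** 2 for x, y in zip(a[1], b[1]))
        return na * nb / (na + nb) * d2

    while len(clusters) > n_clusters:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(a[1], b[1])]
        merged = (sorted(a[0] + b[0]), centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    return sorted(c[0] for c in clusters)


# Hypothetical two-feature profiles: % helmeted riders, % rural crashes.
profiles = [[10.0, 80.0], [12.0, 78.0], [60.0, 20.0], [62.0, 25.0], [30.0, 60.0]]
groups = ward_clusters(profiles, 2)
print(groups)  # [[0, 1, 4], [2, 3]]
```

Stopping the merging at a chosen number of clusters corresponds to cutting the dendrogram at a given height.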

Latent class clustering

LCC classifies similar objects into K latent classes, where class membership is uncertain, meaning that each record can belong to several clusters at the same time, with a different probability for each.[15]
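
The probabilistic membership described above can be written out explicitly. The following is the standard latent class cluster model for categorical indicators under the local independence assumption; the exact specification used in this study is not stated, so the notation (class sizes $\pi_k$, indicators $y_{ij}$ for record $i$ and variable $j$, and $J$ variables) is an assumption for illustration.

```latex
P(\mathbf{y}_i) = \sum_{k=1}^{K} \pi_k \prod_{j=1}^{J} P(y_{ij} \mid k),
\qquad
P(k \mid \mathbf{y}_i) = \frac{\pi_k \prod_{j=1}^{J} P(y_{ij} \mid k)}{P(\mathbf{y}_i)},
\qquad
\sum_{k=1}^{K} \pi_k = 1
```

The second expression is the posterior probability that record $i$ belongs to class $k$; a record is typically assigned to the class with the highest posterior probability.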

Several goodness-of-fit criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the corrected AIC (AICc), have been employed in previous studies to select the most appropriate number of clusters in LCC analysis. Because of its superiority,[16] the BIC was used for this purpose in the current study. In addition, the entropy criterion was used as another measure to find the best clustering model. The entropy criterion takes values between 0 and 1, where 1 indicates the highest certainty in the classification and 0 indicates the worst clustering quality.[13] Furthermore, the R² was calculated for each of the variables used in the clustering analysis, which indicates how well one can predict class memberships based on that variable. The closer the R² value is to 1, the better the prediction.[17] In other words, the R² can be considered a measure of the importance of each variable in determining the cluster membership of each record.
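
The two model-selection measures can be sketched in a few lines, assuming the commonly used definitions: BIC = -2 ln L + p ln n, and a relative entropy that rescales the classification entropy of the posterior membership probabilities to the range 0 to 1. The log-likelihoods and parameter counts below are made-up numbers used only to show the selection step; they are not results from this study.

```python
import math


def bic(log_likelihood, n_params, n_records):
    """Bayesian information criterion; lower values indicate a better model."""
    return -2.0 * log_likelihood + n_params * math.log(n_records)


def relative_entropy(posteriors):
    """Entropy criterion in [0, 1]; 1 means every record is classified with certainty.

    posteriors: list of rows, each row holding one record's class membership
    probabilities (rows sum to 1, at least two classes).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    raw = -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)
    return 1.0 - raw / (n * math.log(k))


# Hypothetical candidates: number of clusters -> (log-likelihood, parameter count).
candidates = {5: (-41250.0, 180), 6: (-40900.0, 216), 7: (-40800.0, 252)}
n_records = 6638

scores = {k: bic(ll, p, n_records) for k, (ll, p) in candidates.items()}
best_k = min(scores, key=scores.get)  # number of clusters with the lowest BIC

crisp = [[1.0, 0.0], [0.0, 1.0]]  # fully certain assignments
fuzzy = [[0.5, 0.5], [0.5, 0.5]]  # completely uncertain assignments
print(best_k, relative_entropy(crisp), relative_entropy(fuzzy))
```

The BIC penalizes additional parameters more heavily as the sample grows, which discourages adding clusters that improve the fit only marginally.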

Results

In the first step, the study variables were used to cluster the provinces (see [Figure 1] for the analysis procedure). More specifically, the percentages of crashes in the subcategories of each variable were calculated and used to cluster the provinces. For brevity, the percentages of the subcategories of all the study variables (shown in [Table 1]) in each province are not presented here. As an example only, the percentages of the two subcategories of the “Helmet usage” variable in each province are presented in [Table 2].
Table 2: Univariate distribution of helmet usage across the provinces


Provinces were finally clustered based on the distribution of study variables in crashes of each province using the Ward clustering algorithm. [Figure 2] shows the dendrogram of the provinces as a result of clustering by the Ward algorithm. As presented in this figure, the provinces were divided into five groups based on the results obtained in the HC output. These clusters are as follows:
Figure 2: Hierarchical clustering output by the Ward method


  • Group A: Ardabil, Hormozgan, Tehran, Bushehr, Mazandaran, Qazvin
  • Group B: Chaharmahal and Bakhtiari, Khorasan-Razavi, Gilan, Semnan, Lorestan, Golestan, Fars, Tehran, Kordestan, Hamedan
  • Group C: Ilam, Kohkilouye and Boyerahmad, West Azerbaijan, Khuzestan, Kermanshah, Sistan and Balouchestan
  • Group D: Zanjan, Yazd, Qom
  • Group E: North Khorasan, South Khorasan, Markazi, Isfahan, Kerman, East Azerbaijan.

After clustering the provinces into homogeneous groups, the variable of “province group” was added to the database to represent the group of provinces that each record belonged to.

In the next step, the variable “province group” and the other variables were entered into the LCC analysis. [Table 3] shows the corresponding R² for each of the variables used in the LCC analysis. The R² indicates how well an indicator is explained by the model.[17]
Table 3: R² of variables used in the latent class clustering step


Several models with 3–9 clusters were built, and the 6-cluster solution was selected as the best model based on the BIC. As shown in [Figure 3], the model with 6 clusters had the lowest BIC among the alternative models. In addition, the entropy criterion of the model was 0.77, which indicates reasonably high certainty in the classification.
Figure 3: The Bayesian information criterion for several clustering models


The next step was to characterize the clusters obtained from the LCC analysis based on the proportion of each subcategory of the variables in each cluster. [Table 4] describes the univariate distributions of the variables in each of the six clusters.

Similar to the previous studies,[13],[18] the clusters were analyzed and named based on their variable distributions. The variables that were selected to characterize the clusters are shown in [Table 4]. This table also shows the proportion of each subcategory of the variables in each one of the 6 clusters. Note that only significant subcategories of each variable are presented in this table.
Table 4: Summary of univariate distributions for the variables in each latent class cluster


In this regard, the six clusters were named as follows:

  1. Crashes that occurred in urban areas – mostly riders under 30 years old holding a valid license
  2. Crashes that occurred in urban areas – mostly riders above 30 years old holding a valid license
  3. Pedestrian-motorcycle crashes that mostly occurred in residential/commercial/office land uses, with the motorcyclist at fault
  4. Province Groups A and B; motorcyclists wore a helmet and held a valid license
  5. Province Groups C and E; mostly unlicensed riders under 30 years old who did not wear a helmet
  6. Crashes that occurred in rural areas, mostly in nonresidential/agricultural land uses.

More than 80% of the crash records fell into Clusters 1, 2, and 3. Clusters 1 and 2 overlap in most of their prevalent features, namely, motorcycle crashes that occurred in urban areas with the riders mostly being not-at-fault and not using a helmet. The main difference between these two clusters is the distribution of riders' age groups. About 91% of the riders in Cluster 1 were under 30 years old, and 86% of the riders involved in crashes of this cluster were injured. In contrast, about 65% of the riders in Cluster 2 were above 30 years old; however, similar to Cluster 1, about 88% of the riders in this cluster were injured. Thus, both clusters were associated with high rates of motorcyclist injury.

The third critical cluster is Cluster 5, in which about 84% of the riders were injured in crashes. About 74% of the crashes in this cluster occurred in Province Groups C and E. Furthermore, 84% of the riders in this cluster were under 30 years old. This cluster also had the lowest percentage of helmet usage (about 95% of the riders did not wear a helmet). In addition, about 95% of the riders in this cluster did not hold a driving license at the time of the crash.

Cluster 3 is the only cluster dominated by “pedestrian-motorcycle collisions” and “commercial/office” land uses. About 93% of the crashes in this cluster were pedestrian-motorcycle collisions, and interestingly, in 95% of the crashes in this cluster, the motorcyclist was identified as the at-fault party.

About 86% of the crashes in Cluster 4 occurred in Province Groups A and B. In addition, in 62% of the cases in this cluster, motorcyclists wore a helmet, which is a considerably high rate compared with the helmet usage rates in the other clusters. Although 3% of the crashes in this cluster were fatal [Table 4], only 1% of the riders were killed in crashes of this cluster [Table 5].
Table 5: Distribution of motorcyclists' injury severity among the 6 clusters


Cluster 6 had the highest percentage of motorcyclist fatalities among the six clusters. It was also dominated by Province Groups E and A. About 73% of the crashes in this cluster occurred in nonresidential/agricultural land uses. This cluster was also the only cluster dominated by crashes that occurred in rural areas (86% of all crashes in this cluster). Furthermore, about 15% of the crashes in this cluster were “overturn” crashes. In addition, about 3%, 1%, and 2% of the crashes in Clusters 4, 5, and 6, respectively, were fatal [Table 4].

The in-cluster and inter-cluster distributions of the riders' injury severity across the six clusters are presented in [Table 5]. According to this table, Clusters 1, 2, and 5 have the highest, and Clusters 3 and 4 the lowest, shares of injured riders.

In addition, crashes in Cluster 6 were the most dangerous crashes for motorcyclists because 1.8% of the riders involved in crashes of this cluster died. Furthermore, 0.5% of the riders involved in crashes of Cluster 5 were killed, and another 83.7% were injured.

Conclusions

The current study contributes to the literature of motorcycle crashes by providing a holistic view over the injury severity pattern of motorcyclists at the national level. For this purpose, two clustering techniques were used in combination. First, the HC technique was employed to group the provinces according to the distribution of crashes. In the second step, the LCC was conducted to investigate the crash and injury severity patterns of motorcyclists.

The LCC analysis produced six crash clusters with different crash patterns, and provincial grouping was found to be a significant factor in the final crash clusters. According to the results of the LCC analysis, Clusters 1, 2, and 5 had the highest percentages of injured riders among the six clusters. In addition, Cluster 2 mostly consisted of older riders and Clusters 1 and 5 had a predominance of riders under 30 years old, while Clusters 3 and 4 had an approximately normal distribution of riders' ages. Therefore, the higher shares of injured riders in Clusters 1, 2, and 5 could be attributed to the effect of the riders' age. This is in line with the results of previous studies, which reported that increasing motorcyclist age is associated with more severe injuries.[4] However, our results indicate that both younger and older riders are associated with more crash injuries.

Among the six crash clusters, Cluster 5 mostly comprised crashes that occurred in Province Groups C and E. In addition, about 84% of the motorcyclists in this cluster were under 30 years old. Furthermore, a large proportion of the motorcyclists (95%) did not wear a helmet and were unlicensed at the time of the crash. Since numerous previous studies from around the world have shown that not holding a driving license and not wearing a helmet might be associated with more severe motorcycle crashes,[1] stricter law enforcement against unlicensed riding and mandatory helmet use in these provinces could help reduce riders' injury severity.

Moreover, Cluster 3 was the only cluster dominated by pedestrian-motorcycle collisions and was also one of the two clusters with the lowest percentage of injured riders. Comparing the “crash severity” (presented in [Table 4]) and the “injury severity of motorcyclists” (presented in [Table 5]) for Cluster 3 shows that although 100% of the crashes in this cluster were injury crashes, only 9.7% of the motorcyclists involved in them were injured ([Table 5], Cluster 3). Cluster 4 mostly comprised crashes that occurred in Province Groups A and B. An interesting feature of this cluster is the share of motorcyclists who wore a helmet (62%), which is considerably higher than in the other clusters.

About 2% of the motorcyclists involved in crashes of Cluster 6 were killed. This is the highest fatality rate among the six clusters, which might be attributed to specific features of the crashes in this cluster. These crashes mostly occurred on rural roads and in the vicinity of nonresidential or agricultural land uses. In addition, the proportion of overturn crashes in this cluster was higher than the share of such crashes in the other clusters. Therefore, further attention might be given to this group of crashes.

The results also confirmed that the combined use of HC and LCC can help reveal the motorcyclists' injury severity pattern at a national level. Since the unbalanced nature of crash data across subnational regions is not peculiar to Iran, the framework adopted in the current study could be used in similar research for macroanalysis of crash patterns in other countries or states. However, this approach should not be regarded as an alternative to predictive methods but as a preliminary analysis tool that can provide a holistic view of crash patterns at the national level. This, in turn, might facilitate decision-making about road safety issues.

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.

References

1. Lin MR, Kraus JF. A review of risk factors and patterns of motorcycle injuries. Accid Anal Prev 2009;41:710-22.
2. Mahdian M, Sehat M, Fazel MR, Moraveji A, Mohammadzadeh M. Epidemiology of urban traffic accident victims hospitalized more than 24 hours in a level III trauma center, Kashan County, Iran, during 2012-2013. Arch Trauma Res 2015;4:e28465.
3. Vlahogianni EI, Yannis G, Golias JC. Overview of critical risk factors in power-two-wheeler safety. Accid Anal Prev 2012;49:12-22.
4. Savolainen P, Mannering F. Probabilistic models of motorcyclists' injury severities in single- and multi-vehicle crashes. Accid Anal Prev 2007;39:955-63.
5. Sadeghi-Bazargani H, Ayubi E, Azami-Aghdash S, Abedi L, Zemestani A, Amanati L, et al. Epidemiological patterns of road traffic crashes during the last two decades in Iran: A Review of the literature from 1996 to 2014. Arch Trauma Res 2016;5:e32985.
6. Tavakoli Kashani A, Rabieyan R, Besharati MM. A data mining approach to investigate the factors influencing the crash severity of motorcycle pillion passengers. J Safety Res 2014;51:93-8.
7. Tavakoli Kashani A, Rabieyan R, Besharati MM. Modeling the effect of operator and passenger characteristics on the fatality risk of motorcycle crashes. J Inj Violence Res 2016;8:35-42.
8. Kasantikul V, Ouellet JV, Smith T, Sirathranont J, Panichabhongse V. The role of alcohol in Thailand motorcycle crashes. Accid Anal Prev 2005;37:357-66.
9. Quddus MA, Noland RB, Chin HC. An analysis of motorcycle injury and vehicle damage severity using ordered probit models. J Safety Res 2002;33:445-62.
10. Tavakoli Kashani A, Besharati MM. An analysis of vehicle occupants' injury severity in crashes occurred on rural freeways and multilane highways in Iran. Int J Transp Eng 2016;4:137-46.
11. Kashani AT, Besharati MM. Fatality rate of pedestrians and fatal crash involvement rate of drivers in pedestrian crashes: A case study of Iran. Int J Inj Contr Saf Promot 2017;24:222-31.
12. Besharati MM, Tavakoli Kashani A. Which set of factors contribute to increase the likelihood of pedestrian fatality in road crashes? Int J Inj Contr Saf Promot 2017;1-0. doi: 10.1080/17457300.2017.1363781.
13. Depaire B, Wets G, Vanhoof K. Traffic accident segmentation by means of latent class clustering. Accid Anal Prev 2008;40:1257-66.
14. O'Brien O, Cheshire J, Batty M. Mining bicycle sharing data for generating insights into sustainable transport systems. J Transp Geogr 2014;34:262-73.
15. Höppner F. Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition. Hoboken, New Jersey: John Wiley & Sons; 1999.
16. Nylund KL, Asparouhov T, Muthén BO. Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Struct Equ Model 2007;14:535-69.
17. Vermunt JK, Magidson J. Latent GOLD 4.0 User's Guide; 2005.
18. de Oña J, López G, Mujalli R, Calvo FJ. Analysis of traffic accidents on rural highways using Latent Class Clustering and Bayesian Networks. Accid Anal Prev 2013;51:1-0.

