Vehicle Utilization Classification

It can be useful to automatically classify driving behavior to determine whether a vehicle is used primarily for commuting or is used consistently throughout the day as a vehicle for hire.

The code below summarizes how often vehicles are used and when they are used, by day of week and hour of day in local time.

import numpy as np
import pandas as pd

# Attach a time zone name to each point via a spatial join with the time zone polygons
gdf_tz = gdf.sjoin(tz[['geometry','tz_name1st']])
# Convert each timestamp to its local time zone (timestamps must already be tz-aware)
gdf_tz['timestamp_local'] = gdf_tz.apply(lambda row: row['timestamp'].tz_convert(row['tz_name1st']), axis=1)
gdf_tz['day_of_week'] = gdf_tz['timestamp_local'].map(lambda x: x.dayofweek)  # 0 = Monday
gdf_tz['hour_of_day'] = gdf_tz['timestamp_local'].map(lambda x: x.hour)

# Total and mean distance and time per (day of week, hour of day), using only rows where the vehicle moved
gb = gdf_tz.loc[gdf_tz['haversine_dist_shift']>0].groupby(['day_of_week','hour_of_day']).agg({'haversine_dist_shift':['sum','mean'],'timestamp_diff_second':['sum','mean']})

# Flatten the two-level columns, e.g. ('haversine_dist_shift', 'sum') -> 'haversine_dist_shift_sum'
gb = gb.reset_index(drop = False)
gb.columns = [i[0] + '_' + i[1] if i[1] != '' else i[0] for i in gb.columns]

temp_lst = []
for day in range(7):
    temp = gb.loc[gb['day_of_week'] == day]
    # One row per hour of the day, with zeros used to fill hours that have no driving
    all_time = pd.DataFrame(index = np.arange(0,24,1))
    all_time['odometer_diff_fill'] = 0
    all_time['time_fill'] = 0
    # The right merge keeps all 24 hours; hours without data get NaN in the aggregate columns
    temp = temp.merge(all_time, left_on = 'hour_of_day', right_index = True, how = 'right')
    temp['day_of_week'] = day
    temp['weekday'] = day < 5  # Monday through Friday

    temp_lst.append(temp)

weekly_dst = pd.concat(temp_lst)

# Replace NaN (no driving in that hour) with the zero fill values
weekly_dst['filled_dist_sum'] = weekly_dst['haversine_dist_shift_sum'].combine_first(weekly_dst['odometer_diff_fill'])
weekly_dst['filled_time_sum'] = weekly_dst['timestamp_diff_second_sum'].combine_first(weekly_dst['time_fill'])

weekly_dst['filled_dist_mean'] = weekly_dst['haversine_dist_shift_mean'].combine_first(weekly_dst['odometer_diff_fill'])
weekly_dst['filled_time_mean'] = weekly_dst['timestamp_diff_second_mean'].combine_first(weekly_dst['time_fill'])

With this setup, the percentage of weekday driving distance and time that takes place during commuting hours (7am to 9am and 4pm to 7pm) can be computed. If that fraction is sufficiently high, the vehicle is probably used primarily for commuting.
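A minimal sketch of that calculation, assuming weekly_dst has the columns built above (hour_of_day, weekday, filled_dist_sum, filled_time_sum). The function names, the list of commuting hours, and the 0.6 threshold are illustrative assumptions; what counts as "sufficiently high" would need to be tuned on real data.

```python
import pandas as pd

# Hours whose start falls inside the commuting windows from the text:
# 7am to 9am -> hours 7, 8; 4pm to 7pm -> hours 16, 17, 18
COMMUTE_HOURS = [7, 8, 16, 17, 18]


def commute_fraction(weekly_dst, value_col="filled_dist_sum"):
    """Share of the weekday total of value_col that falls in COMMUTE_HOURS."""
    weekdays = weekly_dst.loc[weekly_dst["weekday"]]
    total = weekdays[value_col].sum()
    if total == 0:
        return 0.0  # no weekday driving recorded
    in_commute = weekdays["hour_of_day"].isin(COMMUTE_HOURS)
    return float(weekdays.loc[in_commute, value_col].sum() / total)


def is_commuter(weekly_dst, threshold=0.6):
    """True if both the distance share and the time share reach the threshold.

    The 0.6 default is an illustrative assumption, not a calibrated value.
    """
    dist_share = commute_fraction(weekly_dst, "filled_dist_sum")
    time_share = commute_fraction(weekly_dst, "filled_time_sum")
    return dist_share >= threshold and time_share >= threshold
```

Requiring both shares to pass is one possible design choice; using distance alone or time alone would be equally easy to express with commute_fraction.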

Using the above code, the averaged distribution of driving during a week can be plotted as a heatmap or as a bar graph. An example of each follows.
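One way to produce the heatmap is sketched here, assuming weekly_dst has the day_of_week, hour_of_day and filled_dist_sum columns built above. The function names are hypothetical, and matplotlib is an assumed plotting library; the original figures may have been made with a different tool.

```python
import pandas as pd


def activity_matrix(weekly_dst, value_col="filled_dist_sum"):
    """Return a 24 x 7 table: rows are hours of the day, columns are days of the week."""
    matrix = weekly_dst.pivot_table(
        index="hour_of_day",
        columns="day_of_week",
        values=value_col,
        aggfunc="sum",
        fill_value=0,
    )
    # Make sure every hour and every day is present, even if missing from the input
    return matrix.reindex(index=range(24), columns=range(7), fill_value=0)


def plot_activity_heatmap(matrix, title="Vehicle activity"):
    """Draw the matrix as a heatmap and return the figure.

    matplotlib is imported here so that the table can be built
    without a plotting library installed.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 8))
    image = ax.imshow(matrix.to_numpy(), aspect="auto", cmap="viridis")
    ax.set_xticks(range(7))
    ax.set_xticklabels(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])  # 0 = Monday
    ax.set_yticks(range(0, 24, 2))
    ax.set_xlabel("Day of week")
    ax.set_ylabel("Hour of day (local time)")
    ax.set_title(title)
    fig.colorbar(image, ax=ax)
    return fig
```

Calling plot_activity_heatmap(activity_matrix(weekly_dst)) returns a figure that can be shown or saved with fig.savefig.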

Here is an example heatmap for a single vehicle:

Example vehicle heatmap

Note that this type of heatmap (and the below bar graph) can easily be created for fleet-level statistics as well. If weekly_dst contains data for multiple vehicles, the data to be plotted can be obtained using a groupby as in:
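A minimal sketch of such a groupby, assuming weekly_dst is the concatenation of the per-vehicle tables built above, so that every vehicle contributes exactly one zero-filled row per (day_of_week, hour_of_day) cell. The function name is hypothetical, and taking the mean across vehicles is an assumption; a sum would give fleet totals instead.

```python
import pandas as pd


def fleet_activity(weekly_dst, value_col="filled_dist_sum"):
    """Average value_col across vehicles for each (day_of_week, hour_of_day) cell."""
    return (
        weekly_dst.groupby(["day_of_week", "hour_of_day"])[value_col]
        .mean()
        .reset_index()
    )
```

Because idle hours were filled with zeros for each vehicle, the mean counts vehicles that did not drive in a given hour instead of silently dropping them.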

For 100 example vehicles, the activity heatmap looks like this:

Example Fleet-Wide Vehicle Activity

And here is an example bar graph of the distribution of driving kilometers, averaged over each weekday:

Example vehicle bar graph
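A bar graph like this could be produced along the following lines. This is a sketch that reads "averaged over each weekday" as the mean across Monday to Friday for each hour of the day, which is an assumption; the function names are hypothetical and matplotlib is an assumed plotting library.

```python
import pandas as pd


def weekday_hourly_profile(weekly_dst, value_col="filled_dist_sum"):
    """Mean of value_col per hour of day, averaged over Monday to Friday."""
    weekdays = weekly_dst.loc[weekly_dst["weekday"]]
    profile = weekdays.groupby("hour_of_day")[value_col].mean()
    # Make sure all 24 hours are present, even if missing from the input
    return profile.reindex(range(24), fill_value=0)


def plot_weekday_profile(profile, ylabel="Distance"):
    """Draw the hourly profile as a bar graph and return the figure.

    matplotlib is imported here so that the profile can be computed
    without a plotting library installed.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 3))
    ax.bar(list(profile.index), profile.to_numpy())
    ax.set_xticks(range(0, 24, 2))
    ax.set_xlabel("Hour of day (local time)")
    ax.set_ylabel(ylabel)
    ax.set_title("Average weekday driving by hour")
    return fig
```

The ylabel argument is left generic because the unit depends on how haversine_dist_shift was computed.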
