pandas - Python: recursively find the distance between points in a group
I can apply vincenty from geopy to a DataFrame in pandas and determine the distance between 2 consecutive machines. However, I want to find the distance between all machines in a group without repeating pairs.

For example, if I group by company name and there are 3 machines associated with a company, I want to find the distance between machines 1 and 2, 1 and 3, and 2 and 3, but not calculate the distance between (2, 1) and (3, 1), since distance is symmetric (identical results).
    import pandas as pd
    from geopy.distance import vincenty

    df = pd.DataFrame({'ser_no': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
                       'co_nm': ['aa', 'aa', 'aa', 'bb', 'bb', 'bb', 'bb', 'cc', 'cc', 'cc'],
                       'lat': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                       'lon': [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]})

    coord_col = ['lat', 'lon']
    # True where a row belongs to the same company as the row before it
    matching_cust = df['co_nm'] == df['co_nm'].shift(1)
    shift_coords = df.shift(1).loc[matching_cust, coord_col]

    # join in shifted coords and compute distance
    df_shift = df.join(shift_coords, how='inner', rsuffix='_2')
    # return distance in miles
    df['dist'] = df_shift.apply(
        lambda x: vincenty((x['lat'], x['lon']), (x['lat_2'], x['lon_2'])).mi,
        axis=1)
This finds the distance between consecutive machines in a group. How can I expand on this to find the distance between all machines in a group?

This code returns:

      co_nm  lat  lon  ser_no      dist
    0    aa    1   21       1       NaN
    1    aa    2   22       2  97.47832
    2    aa    3   23       3  97.44923
    3    bb    4   24       4       NaN
    4    bb    5   25       5  97.34752
    5    bb    6   26       6  97.27497
    6    bb    7   27       7  97.18804
    7    cc    8   28       8       NaN
    8    cc    9   29       9  96.97129
    9    cc   10   30       0  96.84163
Edit:

The desired output would find the unique distance combinations for machines related by company; that is, co_nm aa would have the distance between ser_no (1,2), (1,3) and (2,3), and likewise the distances between machines in co_nm bb and cc as well, but it wouldn't determine the distance between machines in different co_nm groups.

Does that make sense?
Update 2: using a function:

    def calc_dist(df):
        return pd.DataFrame(
                   [ [grp,
                      df.loc[c[0]].ser_no,
                      df.loc[c[1]].ser_no,
                      vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
                     ]
                     for grp,lst in df.groupby('co_nm').groups.items()
                     for c in combinations(lst, 2)
                   ],
                   columns=['co_nm','machinea','machineb','distance'])

    In [27]: calc_dist(df)
    Out[27]:
       co_nm  machinea  machineb               distance
    0     aa         1         2  156.87614939082016 km
    1     aa         1         3   313.7054454472326 km
    2     aa         2         3    156.829329105069 km
    3     cc         8         9  156.06016539095216 km
    4     cc         8         0   311.9109981692541 km
    5     cc         9         0  155.85149813446617 km
    6     bb         4         5  156.66564183673603 km
    7     bb         4         6   313.2143330250297 km
    8     bb         4         7   469.6225353388079 km
    9     bb         5         6  156.54889741438788 km
    10    bb         5         7  312.95759746593706 km
    11    bb         6         7    156.4089967703544 km
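For readers without geopy installed (geopy 2.0 and later dropped `vincenty` in favour of `geodesic`), the same per-group pairing can be sketched with only the standard library. Note the swap: this uses the haversine great-circle formula on a sphere, not Vincenty's ellipsoidal formula, so the distances differ slightly from the ones shown above. The names `haversine_km` and `calc_dist_plain`, and the mean Earth radius of 6371.0088 km, are assumptions made for this illustration.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of the spherical model


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def calc_dist_plain(records):
    """Return (co_nm, machine_a, machine_b, km) for each unordered pair within a company."""
    groups = {}
    for rec in records:
        groups.setdefault(rec['co_nm'], []).append(rec)
    return [
        (co, a['ser_no'], b['ser_no'],
         haversine_km((a['lat'], a['lon']), (b['lat'], b['lon'])))
        for co, recs in groups.items()
        for a, b in combinations(recs, 2)  # each pair once, never mirrored
    ]


# Same sample data as the question, held as plain dictionaries
records = [
    {'ser_no': s, 'co_nm': c, 'lat': la, 'lon': lo}
    for s, c, la, lo in zip(
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
        ['aa', 'aa', 'aa', 'bb', 'bb', 'bb', 'bb', 'cc', 'cc', 'cc'],
        range(1, 11),
        range(21, 31),
    )
]

result = calc_dist_plain(records)
for row in result:
    print(row)
```

Because the grouping dictionary keeps insertion order, the rows come out as aa, bb, cc rather than aa, cc, bb. The set of pairs is the same.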
Update:

    In [9]: dist = pd.DataFrame(
       ...:        [ [grp,
       ...:           df.loc[c[0]].ser_no,
       ...:           df.loc[c[1]].ser_no,
       ...:           vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
       ...:          ]
       ...:          for grp,lst in df.groupby('co_nm').groups.items()
       ...:          for c in combinations(lst, 2)
       ...:        ],
       ...:        columns=['co_nm','machinea','machineb','distance'])

    In [10]: dist
    Out[10]:
       co_nm  machinea  machineb               distance
    0     aa         1         2  156.87614939082016 km
    1     aa         1         3   313.7054454472326 km
    2     aa         2         3    156.829329105069 km
    3     cc         8         9  156.06016539095216 km
    4     cc         8         0   311.9109981692541 km
    5     cc         9         0  155.85149813446617 km
    6     bb         4         5  156.66564183673603 km
    7     bb         4         6   313.2143330250297 km
    8     bb         4         7   469.6225353388079 km
    9     bb         5         6  156.54889741438788 km
    10    bb         5         7  312.95759746593706 km
    11    bb         6         7    156.4089967703544 km
Explanation: the combination part

    In [11]: [c
       ....:  for grp,lst in df.groupby('co_nm').groups.items()
       ....:  for c in combinations(lst, 2)]
    Out[11]:
    [(0, 1),
     (0, 2),
     (1, 2),
     (7, 8),
     (7, 9),
     (8, 9),
     (3, 4),
     (3, 5),
     (3, 6),
     (4, 5),
     (4, 6),
     (5, 6)]
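The grouping-plus-combinations step can also be reproduced without pandas, which may make the mechanics easier to see. This is a minimal sketch using only the standard library. The `rows` list and the `pairs_by_group` helper are hypothetical names introduced for illustration, with row positions standing in for the DataFrame index.

```python
from collections import defaultdict
from itertools import combinations

# (index, co_nm) pairs mirroring the example DataFrame; stand-in data for illustration
rows = list(enumerate(['aa', 'aa', 'aa', 'bb', 'bb', 'bb', 'bb', 'cc', 'cc', 'cc']))


def pairs_by_group(rows):
    """Return unordered index pairs, restricted to rows that share a group key."""
    groups = defaultdict(list)
    for idx, key in rows:
        groups[key].append(idx)
    # combinations(lst, 2) yields each unordered pair once: (a, b) but never (b, a)
    return [c for lst in groups.values() for c in combinations(lst, 2)]


pairs = pairs_by_group(rows)
print(pairs)
```

A group of n machines contributes n * (n - 1) / 2 pairs, so groups of sizes 3, 4 and 3 give 3 + 6 + 3 = 12 pairs in total, and no pair ever spans two companies.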
Old answer:

    In [3]: from itertools import combinations

    In [4]: import pandas as pd

    In [5]: from geopy.distance import vincenty

    In [6]: df = pd.DataFrame({'machine': [1,2,3], 'lat': [11, 12, 13], 'lon': [21,22,23]})

    In [7]: df
    Out[7]:
       lat  lon  machine
    0   11   21        1
    1   12   22        2
    2   13   23        3

    In [8]: dist = pd.DataFrame(
       ...:        [ [df.loc[c[0]].machine,
       ...:           df.loc[c[1]].machine,
       ...:           vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
       ...:          ]
       ...:          for c in combinations(df.index, 2)
       ...:        ],
       ...:        columns=['machinea','machineb','distance'])

    In [9]: dist
    Out[9]:
       machinea  machineb               distance
    0         1         2   155.3664523771998 km
    1         1         3   310.4557192973811 km
    2         2         3  155.09044419651156 km