python - Sklearn preprocessing - PolynomialFeatures - How to keep column names/headers of the output array / dataframe
TL;DR: How do I get headers for the output numpy array from the sklearn.preprocessing.PolynomialFeatures() function?
Let's say I have the following code:
    import pandas as pd
    import numpy as np
    from sklearn import preprocessing as pp

    a = np.ones(3)
    b = np.ones(3) * 2
    c = np.ones(3) * 3

    # Stack the three arrays as rows, then transpose so each becomes a column
    input_df = pd.DataFrame([a, b, c])
    input_df = input_df.T
    input_df.columns = ['a', 'b', 'c']

    input_df

       a  b  c
    0  1  2  3
    1  1  2  3
    2  1  2  3

    poly = pp.PolynomialFeatures(2)
    output_nparray = poly.fit_transform(input_df)
    print(output_nparray)

    [[ 1.  1.  2.  3.  1.  2.  3.  4.  6.  9.]
     [ 1.  1.  2.  3.  1.  2.  3.  4.  6.  9.]
     [ 1.  1.  2.  3.  1.  2.  3.  4.  6.  9.]]
How can I get that 3x10 matrix (output_nparray) to carry over the a, b, c labels and show how they relate to the data above?
A working example, all in one line (I assume "readability" is not the goal here):
    # One label per output column: join 'name^power' for every input with a nonzero exponent
    target_feature_names = ['x'.join(['{}^{}'.format(pair[0], pair[1]) for pair in pairs if pair[1] != 0]) for pairs in [zip(input_df.columns, p) for p in poly.powers_]]
    output_df = pd.DataFrame(output_nparray, columns=target_feature_names)
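For anyone who does care about readability, the same idea can be unrolled into a small helper. This is only a sketch: `names_from_powers` is a hypothetical name, not part of scikit-learn, and the exponent rows are hard-coded here so the snippet runs on its own. They are assumed to match what `poly.powers_` holds for degree 2 with three inputs. Unlike the one-liner, which gives the constant column an empty name, this version labels it '1'.

```python
def names_from_powers(columns, powers):
    """Build one label per output column from an exponent matrix.

    `powers` has one row per output feature and one exponent per input
    column (assumed to be the same layout as `poly.powers_`).
    """
    names = []
    for row in powers:
        # Keep only the inputs that actually appear in this term
        terms = ['{}^{}'.format(col, exp)
                 for col, exp in zip(columns, row) if exp != 0]
        # An all-zero row is the constant (bias) column; label it '1'
        names.append('x'.join(terms) if terms else '1')
    return names


# Hard-coded for illustration (assumption): degree 2, three input columns
powers = [
    [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [2, 0, 0],
    [1, 1, 0], [1, 0, 1], [0, 2, 0], [0, 1, 1], [0, 0, 2],
]
feature_names = names_from_powers(['a', 'b', 'c'], powers)
print(feature_names)
```

With the question's objects you would call `names_from_powers(input_df.columns, poly.powers_)` and pass the result as `columns=` to `pd.DataFrame`. If your scikit-learn is version 1.0 or later, `poly.get_feature_names_out(input_df.columns)` should also return ready-made names, so the manual construction may not be needed at all.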