I want to impute all of the columns of a pandas DataFrame, a technique I picked up from machine learning. The only way I can think of doing this is column by column, as shown below.
Is there an operation that imputes the entire DataFrame without iterating through the columns?
#!/usr/bin/python
from sklearn.preprocessing import Imputer
import numpy as np
import pandas as pd

# Imputer: replace NaN with the mean
fill_NaN = Imputer(missing_values=np.nan, strategy='mean', axis=1)

# Model 1
DF = pd.DataFrame([[0, 1, np.nan], [2, np.nan, 3], [np.nan, 2, 5]])
DF.columns = "c1.c2.c3".split(".")
DF.index = "i1.i2.i3".split(".")

# Impute each Series (copy first so the original DF keeps its NaN values)
imputed_DF = DF.copy()
for col in DF.columns:
    imputed_column = fill_NaN.fit_transform(DF[col]).T
    # Fill in Series on DataFrame
    imputed_DF[col] = imputed_column

# DF
#      c1   c2   c3
# i1    0    1  NaN
# i2    2  NaN    3
# i3  NaN    2    5

# imputed_DF
#     c1   c2  c3
# i1   0  1.0   4
# i2   2  1.5   3
# i3   1  2.0   5
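For comparison, here is a minimal sketch of one way to get the same column-mean result without an explicit loop. It uses only pandas (`DataFrame.fillna` with `DataFrame.mean`) rather than scikit-learn, so it is a swapped-in technique and not the `Imputer` call from the code above. It assumes every column is numeric and that the column mean is the desired fill value.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question
DF = pd.DataFrame(
    [[0, 1, np.nan], [2, np.nan, 3], [np.nan, 2, 5]],
    columns=["c1", "c2", "c3"],
    index=["i1", "i2", "i3"],
)

# DF.mean() gives one mean per column (NaN values are skipped by default).
# fillna aligns that Series on the column labels, so each column's NaN
# entries are replaced by that column's own mean in a single call.
imputed_DF = DF.fillna(DF.mean())

print(imputed_DF)
```

`fillna` returns a new frame, so `DF` keeps its NaN values. If staying inside scikit-learn matters, note that later releases (0.20 onward) provide `sklearn.impute.SimpleImputer`, which replaced `Imputer` and works column-wise on 2-D input, so it may accept the whole frame at once. It returns a NumPy array, though, so the column and index labels would need to be reattached.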