
Having 4 years of experience in Machine Learning and Artificial Intelligence with the background of B.Tech in Computer Science and Robotics.

Impute entire DataFrame (all columns) using Scikit-learn (sklearn) without iterating over columns
1 month, 2 weeks ago Edited by Rohit_bhat on Feb. 21, 2020, 5:47 a.m. Reason: Giving more information with resource page
I want to impute the missing values in every column of a pandas DataFrame, a step I came across while learning Machine Learning. The only way I can think of doing this is column by column, as shown below.

Is there an operation where I can impute the entire DataFrame without iterating through the columns?

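#!/usr/bin/python

import numpy as np
import pandas as pd
# Imputer was removed from sklearn.preprocessing in scikit-learn 0.22;
# SimpleImputer in sklearn.impute replaces it and always works column-wise
from sklearn.impute import SimpleImputer

# Imputer: replace each NaN with the mean of its column
fill_NaN = SimpleImputer(missing_values=np.nan, strategy='mean')

# Model 1
DF = pd.DataFrame([[0, 1, np.nan], [2, np.nan, 3], [np.nan, 2, 5]])
DF.columns = "c1.c2.c3".split(".")
DF.index = "i1.i2.i3".split(".")

# Impute Series (copy first, so the original DF is not modified)
imputed_DF = DF.copy()
for col in DF.columns:
    # fit_transform expects 2D input, so select a one-column DataFrame
    imputed_column = fill_NaN.fit_transform(DF[[col]])
    # Fill in Series on DataFrame
    imputed_DF[col] = imputed_column.ravel()

# DF
#      c1   c2   c3
# i1  0.0  1.0  NaN
# i2  2.0  NaN  3.0
# i3  NaN  2.0  5.0

# imputed_DF
#      c1   c2   c3
# i1  0.0  1.0  4.0
# i2  2.0  1.5  3.0
# i3  1.0  2.0  5.0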
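
For reference, here is a minimal sketch of what imputing the whole DataFrame in one call might look like. It assumes a scikit-learn version where `SimpleImputer` is available (0.20 or later) and that every column is numeric. The transformer returns a plain NumPy array, so the sketch wraps the result back into a DataFrame with the original labels. The names `imputer` and `imputed_pandas` are only illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Same example frame as in the question
DF = pd.DataFrame(
    [[0, 1, np.nan], [2, np.nan, 3], [np.nan, 2, 5]],
    columns=["c1", "c2", "c3"],
    index=["i1", "i2", "i3"],
)

# Fit on the whole frame at once; means are computed per column
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# fit_transform returns a NumPy array, so restore the row and column labels
imputed_DF = pd.DataFrame(
    imputer.fit_transform(DF), columns=DF.columns, index=DF.index
)

# pandas-only alternative: fill each column's NaN with that column's mean
imputed_pandas = DF.fillna(DF.mean())

print(imputed_DF)
```

One caveat with this approach: by default `SimpleImputer` drops any column that contains only missing values, in which case the output would have fewer columns than `DF.columns` and the relabelling step would fail. If scikit-learn is not otherwise needed, the `fillna` line gives the same result on this data without that complication.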