sklearn-dummies

Scikit-learn label binarizer with support for missing values.

Usage example

import pandas as pd
import sklearn_dummies as skdm

df = pd.DataFrame(['A', 'B', None, 'A'], columns=['val'])

df_dummy = skdm.DataFrameDummies().fit_transform(df)

Result:

idx  val_A  val_B
0    1.0    0.0
1    0.0    1.0
2    NaN    NaN
3    1.0    0.0
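
The encoding shown in the result can be illustrated with plain Python. This is only a sketch of the idea, not the library's implementation: the helper name `dummify`, its `prefix` parameter, and the sorted label order are assumptions made for illustration.

```python
import math


def dummify(values, prefix):
    """One-hot encode `values`, keeping missing entries as NaN rows.

    Hypothetical helper for illustration; not part of sklearn_dummies.
    """
    # Collect the distinct non-missing labels. Sorting is an assumption
    # that gives a deterministic column order.
    labels = sorted({v for v in values if v is not None})
    columns = [f"{prefix}_{label}" for label in labels]

    rows = []
    for v in values:
        if v is None:
            # A missing input stays missing in every dummy column.
            rows.append([math.nan] * len(labels))
        else:
            rows.append([1.0 if v == label else 0.0 for label in labels])
    return columns, rows


columns, rows = dummify(["A", "B", None, "A"], prefix="val")
```

The point of the sketch is the `None` branch: a missing value yields a row of NaN rather than a row of zeros, so missing inputs stay distinguishable from "none of the known labels".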

Installing

sklearn-dummies is available on PyPI. Install it with pip:

pip install sklearn_dummies

sklearn_dummies API

Base module.

class sklearn_dummies.base.DataFrameDummies

Bases: sklearn.base.TransformerMixin

cat_cols

List of categorical columns

Type: list

final_cols

List of all columns with dummy values

Type: list

fit(df, y=None)

Parameters: df (pd.DataFrame) – Provides the column names to be used.
Returns: Itself.
Return type: DataFrameDummies

transform(df, y=None)

Parameters: df (pd.DataFrame) – Data to be dummified.
Returns: Transformed data.
Return type: pd.DataFrame

class sklearn_dummies.base.NPArrayDummies

Bases: sklearn.base.TransformerMixin

labels

List of labels

Type: list

fit(X, y=None)

Parameters: X (np.ndarray) – Provides the labels.
Returns: Itself.
Return type: NPArrayDummies

transform(X, y=None)

Parameters: X (np.ndarray) – Data to be dummified.
Returns: Xt – Transformed data.
Return type: np.ndarray
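
The split between fit (which provides the labels) and transform (which dummifies data) can be sketched with a small stand-in class. This is a rough illustration under stated assumptions, not the library's code: the class name `ListDummies` is hypothetical, it works on plain lists rather than `np.ndarray`, and how the real transformer treats labels it did not see during fit is not documented here, so the all-zero row below is an assumption.

```python
import math


class ListDummies:
    """Hypothetical stand-in showing the fit/transform split."""

    def fit(self, X, y=None):
        # Learn the label set once; it fixes the output column layout.
        self.labels = sorted({x for x in X if x is not None})
        return self  # returning self allows fit(...).transform(...)

    def transform(self, X, y=None):
        out = []
        for x in X:
            if x is None:
                # Missing values are propagated as NaN.
                out.append([math.nan] * len(self.labels))
            else:
                # Assumption: a label unseen during fit gives an all-zero row.
                out.append([1.0 if x == label else 0.0 for label in self.labels])
        return out


enc = ListDummies().fit(["A", "B", "A"])
Xt = enc.transform(["B", "C", None])
```

Because the labels are fixed at fit time, data transformed later always has the same number of columns in the same order, whichever labels happen to occur in it.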
