=====
Usage
=====

All examples we'll walk through need ``pandas``:
    
    import pandas as pd
    
To generate correlation plots for your data frame, import the ``corr_plot`` function from the ``eazieda.corr_plot`` module::

    from eazieda.corr_plot import corr_plot

Then pass your data frame (``df``) and the columns you want to generate the plots for::

    corr_plot(df, ['column1','column2','column3'])

This will give you an interactive correlation plot implemented in altair
*Make sure to pass at least two columns to get the correlation!*

To generate histograms and bar plots from your data, import the ``histograms`` function from the ``eazieda.histograms`` module::

    from eazieda.histograms import histograms

Then pass your data frame (``df``) and the columns you want to generate the plots for::

    histograms(df, ['numeric_column', 'categorical_column'],num_cols=2)

This will give you a combined histogram (for numeric columns) and bar plot (for categorical columns) implemented in altair.
You can control the number of columns using ``num_cols`` and the plot dimensions with ``plot_width`` and ``plot_height``

To detect missing data in your dataframe, import the ``missing_detect`` function from the ``eazieda.missing_detect`` module::

    from eazieda.missing_detect import missing_detect

To detect the missing data, do this::

    df = pd.DataFrame([[1, "x"], [np.nan, "y"], [2, np.nan], [3, "y"]],
    columns = ['a', 'b'])
    missing_detect(df)

You will get a data frame back which has the number of missing values and their corresponding percentage

To deal with missing data in your dataframe, import the ``missing_impute`` function from the ``eazieda.missing_impute`` module::

    from eazieda.missing_impute import missing_impute

To impute the missing data, do this::

    df = pd.DataFrame([[1, "x"], [np.nan, "y"], [2, np.nan], [3, "y"]],
    columns = ['a', 'b'])
    missing_impute(df)

You will get an imputed data frame back. 
You can control the type of imputation with ``method_num`` and ``method_non_num``

To detect outliers in your data, import the ``outliers_detect`` function from the ``eazieda.outliers_detect`` module::

    from eazieda.outliers_detect import outliers_detect

To detect the outliers do this::

    s = pd.Series([1,1,1,1,1,1,1,1,1,1,1e14])
    outliers_detect(s)

You can control the method used to detect the outliers with ``method``. You can choose one of ``iforest``, ``iqr`` or ``zscore``

To remove outliers in your data, import the ``remove_outliers`` function from the ``eazieda.outliers_detect`` module::

    from eazieda.outliers_detect import remove_outliers

To remove the outliers, do this::

    s = pd.Series([1,1e14])
    outliers = np.array([False,,True])
    s_without_outliers = remove_outliers(s, outliers)

You can choose to do the removal in place with ``inplace=True``