How to Standardize and Normalize Data in Python

Condition for Standardization and Normalization

  • Description:
    Standardization: Rescales each feature so that it has a mean of 0 and a standard deviation of 1. It is a common choice when a feature roughly follows a Gaussian (normal) distribution.

    Normalization: Rescales each feature to a fixed range, typically [0, 1], using the feature's minimum and maximum values. It is useful when the data is not normally distributed. Because it depends on the minimum and maximum, it is sensitive to outliers.
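The two definitions above can be written as formulas. This is a sketch in standard notation that the original text does not spell out: x is one value of a feature, \mu and \sigma are that feature's mean and standard deviation, and x_{\min} and x_{\max} are its smallest and largest values.

```latex
% Standardization (z-score)
z = \frac{x - \mu}{\sigma}

% Normalization (min-max scaling to [0, 1])
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}

% Worked example for a feature with values 5.5, 6.2, 5.8, 5.4, 6.0:
% x_{\min} = 5.4, \; x_{\max} = 6.2, so for x = 5.8
x' = \frac{5.8 - 5.4}{6.2 - 5.4} = \frac{0.4}{0.8} = 0.5
```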
Step-by-Step Process
  • Import Necessary Libraries:
    Use scikit-learn for preprocessing, pandas for tabular data, and numpy for array manipulations.
  • Standardization:
    For each feature, subtract its mean and divide by its standard deviation.
  • Normalization:
    Scale the features using the minimum and maximum values of each feature to a fixed range, typically [0, 1].
  • Apply to Dataset:
    Both transformations are applied to each feature (column) of a dataset independently, so the scale of one feature does not affect another.
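To make the steps above concrete, here is a minimal sketch that performs both transformations by hand with numpy. The helper names `standardize` and `normalize` are hypothetical, chosen only for this illustration, and the sketch assumes no feature is constant (a constant column would cause a division by zero).

```python
import numpy as np

# Illustrative values; columns are Height, Weight, Age.
X = np.array([
    [5.5, 150, 23],
    [6.2, 180, 34],
    [5.8, 160, 25],
    [5.4, 140, 45],
    [6.0, 170, 36],
], dtype=float)


def standardize(X):
    """Column-wise z-score using the population standard deviation (ddof=0)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / std


def normalize(X):
    """Column-wise min-max scaling to the range [0, 1]."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)


X_std = standardize(X)
X_norm = normalize(X)

print("Column means after standardization:", X_std.mean(axis=0))
print("Column stds after standardization: ", X_std.std(axis=0))
print("Column mins after normalization:   ", X_norm.min(axis=0))
print("Column maxs after normalization:   ", X_norm.max(axis=0))
```

Passing `axis=0` makes every statistic per column, which is what "each feature independently" means in practice.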
Sample Source Code
    # Code for Standardization and Normalization

    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import StandardScaler, MinMaxScaler

    data = {
        'Height': [5.5, 6.2, 5.8, 5.4, 6.0],
        'Weight': [150, 180, 160, 140, 170],
        'Age': [23, 34, 25, 45, 36]
    }

    df = pd.DataFrame(data)

    # Standardization (Z-score normalization)
    scaler_standard = StandardScaler()
    df_standardized = pd.DataFrame(scaler_standard.fit_transform(df), columns=df.columns)

    print("Standardized Data (Mean = 0, Std = 1):")
    print(df_standardized)

    # Normalization (Min-Max scaling)
    scaler_minmax = MinMaxScaler()
    df_normalized = pd.DataFrame(scaler_minmax.fit_transform(df), columns=df.columns)

    print("\nNormalized Data (Range = 0 to 1):")
    print(df_normalized)
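As a follow-up, the sketch below checks that the scaled values have the stated properties. It rebuilds the same sample data so that it runs on its own; the variable names are illustrative. One detail to keep in mind: `StandardScaler` divides by the population standard deviation (ddof=0), whereas pandas' `DataFrame.std()` defaults to ddof=1, so calling `df_standardized.std()` on five rows reports about 1.118 rather than exactly 1.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler

df = pd.DataFrame({
    'Height': [5.5, 6.2, 5.8, 5.4, 6.0],
    'Weight': [150, 180, 160, 140, 170],
    'Age': [23, 34, 25, 45, 36],
})

# Fit each scaler once, then transform; results are plain numpy arrays.
standard = StandardScaler().fit(df)
z = standard.transform(df)

minmax = MinMaxScaler().fit(df)
m = minmax.transform(df)

# Standardized columns: mean close to 0, population std (ddof=0) close to 1.
print("Means:", z.mean(axis=0))
print("Stds: ", z.std(axis=0))

# Normalized columns: minimum 0 and maximum 1.
print("Mins: ", m.min(axis=0))
print("Maxs: ", m.max(axis=0))

# Both scalers can map the scaled values back to the original units.
restored = standard.inverse_transform(z)
print("Round trip matches original:", np.allclose(restored, df.to_numpy()))
```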
Screenshots
  • [Screenshot: Standardized and Normalized Data Output]