How to Build a Simple Linear Regression Model to Predict the Weight of Students Based on Height in Python?

Condition for Building a Simple Linear Regression Model to Predict the Weight of Students Based on Height in Python

  • Description:
    Linear regression is a fundamental statistical method for modeling the relationship between a dependent variable and one or more independent variables. Simple linear regression is the special case with a single independent variable. Here we use a dataset containing the heights and weights of male and female students. The goal is to predict one variable (e.g., weight) from the other (e.g., height) by fitting a straight line to the data.
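For a single predictor, the fitted line weight = intercept + slope * height has a closed-form least-squares solution: the slope is the covariance of height and weight divided by the variance of height, and the intercept makes the line pass through the two means. The sketch below is a minimal illustration of that formula using NumPy. The function name fit_simple_linear_regression is a hypothetical helper, and the five data points are the first five rows of the sample dataset in the source code section.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared deviations of x
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the line through the point (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return intercept, slope

heights = [160, 170, 180, 165, 175]  # cm, first five rows of the sample data
weights = [60, 70, 80, 65, 75]       # kg

intercept, slope = fit_simple_linear_regression(heights, weights)
print(f"weight = {intercept:.2f} + {slope:.2f} * height")
# prints: weight = -100.00 + 1.00 * height
```

In these five rows weight equals height minus 100 exactly, so the fit recovers a slope of 1 and an intercept of -100. Real data will not be this clean.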
Why Should We Choose Linear Regression?
  • Simplicity and Interpretability: Linear regression is a straightforward approach that is easy to understand and explain.
  • Clear Relationship: The assumption of a linear relationship between height and weight is reasonable in many cases.
  • Quick to Implement: It is computationally inexpensive and suitable for quick prototyping.
  • Predictive Power: When the relationship is approximately linear, even a simple linear regression model can provide useful insights and reasonable predictions from a small dataset.
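One quick way to check whether a linear relationship is a reasonable assumption is to compute the Pearson correlation coefficient between the two variables. The sketch below does this with NumPy on the height and weight values from the sample dataset in the source code section. A value close to 1 or -1 suggests a strong linear association. It does not by itself prove that a linear model is appropriate.

```python
import numpy as np

# Height (cm) and weight (kg) values copied from the sample dataset
heights = np.array([160, 170, 180, 165, 175, 162, 168, 172,
                    177, 165, 185, 178, 169, 174, 180], dtype=float)
weights = np.array([60, 70, 80, 65, 75, 62, 68, 72,
                    78, 67, 85, 79, 71, 73, 80], dtype=float)

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is Pearson's r
r = np.corrcoef(heights, weights)[0, 1]
print(f"Pearson r = {r:.3f}")
```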
Step-by-Step Process
  • Data Collection: Obtain a dataset that includes height and weight information for both males and females.
  • Data Preprocessing: Clean the dataset, handle missing values, and ensure that the data is in a usable format.
  • Exploratory Data Analysis (EDA): Analyze the dataset to understand distributions, relationships, and correlations between features.
  • Model Building: Create a simple linear regression model to predict weight from height or vice versa.
  • Model Evaluation: Evaluate the performance of the model using metrics such as Mean Squared Error (MSE), R-squared, etc.
  • Visualization: Plot the data and the regression line to visualize the fit of the model.
  • Interpretation: Interpret the model's coefficients and results to derive meaningful conclusions.
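The sample dataset used later has no missing values, so the preprocessing step is easy to overlook. The sketch below shows one possible way to handle it with pandas: coerce both columns to numeric and drop rows where either value is missing. The raw records and the helper name clean_height_weight are hypothetical and exist only for illustration. Dropping rows is one assumption among several options; imputing missing values is another.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records for illustration only (not part of the sample dataset)
raw = pd.DataFrame({
    'Height': [160, 170, np.nan, '165', 175],  # one missing value, one value stored as text
    'Weight': [60, np.nan, 80, 65, 75],        # one missing value
})

def clean_height_weight(df):
    """Coerce Height and Weight to numeric and drop rows missing either value."""
    cleaned = df.copy()
    for col in ['Height', 'Weight']:
        # Values that cannot be parsed as numbers become NaN
        cleaned[col] = pd.to_numeric(cleaned[col], errors='coerce')
    return cleaned.dropna(subset=['Height', 'Weight']).reset_index(drop=True)

clean = clean_height_weight(raw)
print(clean)
```

Of the five hypothetical rows, three remain after cleaning: the two rows with a missing height or weight are dropped, and the text value '165' is converted to a number.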
Sample Source Code
  • import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error, r2_score

    # Step 1: Generate or Load a Simple Dataset
    data = {'Height': [160, 170, 180, 165, 175, 162, 168, 172, 177, 165, 185, 178, 169, 174, 180],
    'Weight': [60, 70, 80, 65, 75, 62, 68, 72, 78, 67, 85, 79, 71, 73, 80],
    'Gender': ['Female', 'Male', 'Male', 'Female', 'Male', 'Female', 'Male', 'Male', 'Male', 'Female', 'Male', 'Male', 'Female', 'Male', 'Male']}
    df = pd.DataFrame(data)

    # Step 2: Exploratory Data Analysis (EDA)
    plt.figure(figsize=(10, 6))
    sns.scatterplot(data=df, x='Height', y='Weight', hue='Gender')
    plt.title('Height vs Weight Scatter Plot')
    plt.xlabel('Height (cm)')
    plt.ylabel('Weight (kg)')
    plt.legend()
    plt.show()

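    # Step 3: Data Preprocessing (select the feature and the target)
    X = df[['Height']]  # 2-D feature matrix, as scikit-learn expects
    y = df['Weight']    # 1-D target vector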

    # Step 4: Split Data into Training and Test Sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Step 5: Building the Linear Regression Model
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Step 6: Model Prediction
    y_pred = model.predict(X_test)

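    # Step 7: Model Evaluation
    # Note: with 15 rows and test_size=0.2 the test set holds only 3 samples,
    # so treat these metrics as a rough indication rather than a stable estimate.
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    print(f'Mean Squared Error: {mse:.3f}')
    print(f'R-Squared: {r2:.3f}')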

    # Step 8: Visualization of the regression line
    plt.figure(figsize=(10, 6))
    sns.scatterplot(data=df, x='Height', y='Weight', hue='Gender', palette='coolwarm')
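    # Draw the fitted line across the full observed height range, not just the few test points
    X_line = X.sort_values('Height')
    plt.plot(X_line['Height'], model.predict(X_line), color='red', linewidth=2, label='Regression Line')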
    plt.title('Height vs Weight with Linear Regression Line')
    plt.xlabel('Height (cm)')
    plt.ylabel('Weight (kg)')
    plt.legend()
    plt.show()
Screenshot
  • Linear Regression Visualization
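The Interpretation step listed in the process can be sketched as follows. This is a minimal example, assuming the same sample data, split, and scikit-learn model as the source code above. It reads the fitted slope and intercept and predicts the weight for one new height. The height of 173 cm is a hypothetical input chosen for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same sample data as in the source code above
df = pd.DataFrame({
    'Height': [160, 170, 180, 165, 175, 162, 168, 172, 177, 165, 185, 178, 169, 174, 180],
    'Weight': [60, 70, 80, 65, 75, 62, 68, 72, 78, 67, 85, 79, 71, 73, 80],
})
X = df[['Height']]
y = df['Weight']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)

# Slope: expected change in weight (kg) for each additional cm of height
slope = model.coef_[0]
# Intercept: predicted weight at height 0, which has no physical meaning here
intercept = model.intercept_
print(f'Weight = {intercept:.2f} + {slope:.2f} * Height')

# Predict the weight for a hypothetical new student who is 173 cm tall
new_heights = pd.DataFrame({'Height': [173]})
predicted = model.predict(new_heights)[0]
print(f'Predicted weight at 173 cm: {predicted:.1f} kg')
```

The slope is the quantity to interpret. The intercept only anchors the line, and predictions are only trustworthy within the range of heights the model was trained on.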