How to Build and Evaluate a Simple RNN Model for Multiclass Student Risk Level Prediction

Condition for Building and Evaluating a Simple RNN Model for Multiclass Student Risk Level Prediction

  • Description:
    The process involves preprocessing the student monitoring data, encoding categorical variables, and scaling the features for model training. A Simple Recurrent Neural Network (RNN) is then built to predict risk levels, with the feature matrix reshaped into the 3-D form the RNN expects. The model is evaluated with accuracy, F1-score, recall, precision, a confusion matrix, and a classification report.
Step-by-Step Process
  • Import Libraries:
    Import essential libraries like pandas, numpy, sklearn, matplotlib, and tensorflow for data processing, model building, and evaluation.
  • Load and Inspect Data:
    Load the student monitoring dataset using pd.read_csv() for analysis and preprocessing.
  • Data Integrity Check:
    Check for NaN and null values using df.isna().sum() and df.isnull().sum().
  • Feature Extraction:
    Convert the 'Date' column to datetime and extract day, month, and year.
  • Encoding Categorical Variables:
    Encode categorical columns using LabelEncoder for model compatibility.
  • Data Scaling:
    Scale features using StandardScaler for uniform contribution in model training.
  • Data Reshaping:
    Reshape the feature data into the 3-D (samples, timesteps, features) form the RNN expects; see the shape-check sketch after this list.
  • Build and Train Model:
    Build a Simple RNN model, train it on the data, and evaluate performance using accuracy, F1-score, recall, and precision.
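  • Shape-Check Sketch (illustrative):
    SimpleRNN layers expect 3-D input of shape (samples, timesteps, features). Because each student record is treated as a single time step here, the 2-D feature matrix gains a middle axis of length 1 before training. The following minimal, self-contained check uses a small made-up array; the array and its dimensions are illustrative and not taken from the dataset.

    import numpy as np

    # Hypothetical feature matrix: 4 samples x 5 features
    x_demo = np.arange(20, dtype=float).reshape(4, 5)
    print(x_demo.shape)   # (4, 5)

    # Insert a time-step axis so an RNN sees (samples, timesteps, features)
    x_demo = x_demo.reshape(x_demo.shape[0], 1, x_demo.shape[1])
    print(x_demo.shape)   # (4, 1, 5)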
Sample Source Code
  • # Import Necessary Libraries
    import pandas as pd
    import numpy as np
    from sklearn.preprocessing import LabelEncoder, StandardScaler
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.layers import SimpleRNN, Dense, Input
    from tensorflow.keras.models import Model
    from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score, f1_score, recall_score, precision_score)

    import warnings
    warnings.filterwarnings("ignore")
    df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/student_monnitoring_data.csv")

    # Check NaN Values
    print("Check NaN Values\n")
    print(df.isna().sum())

    # Check Null Values
    print("\n")
    print("Check Null Values\n")
    print(df.isnull().sum())

    # Check dtypes
    print(df.dtypes)

    # Convert the date column to a datetime object
    df['Date'] = pd.to_datetime(df['Date'])

    # Split into day, month, and year
    df['day'] = df['Date'].dt.day
    df['month'] = df['Date'].dt.month
    df['year'] = df['Date'].dt.year

    # Encode string (object) columns as integers for model compatibility
    label = LabelEncoder()
    for i in df.columns:
        if df[i].dtypes == 'object':
            df[i] = label.fit_transform(df[i])

    # Compute the correlation matrix
    correlation_matrix = df.corr()

    # Display the correlation matrix
    print(correlation_matrix)

    # Plot the heatmap
    plt.figure(figsize=(10, 8))
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
    plt.title('Correlation Heatmap')
    plt.show()

    x = df.drop(['Date','Risk Level'],axis=1)
    y = df['Risk Level']

    # Scaling the data
    scaler = StandardScaler()
    x = scaler.fit_transform(x)

    # Reshape the Data for RNN model
    x = x.reshape(x.shape[0],1,x.shape[1])

    X_train,X_test,y_train,y_test = train_test_split(x,y,test_size=.2,random_state=42)

    def simple_RNN_model(input_shape):
        # Input layer: expects (timesteps, features) per sample
        inputs = Input(shape=input_shape)

        # RNN layer
        rnn_layer = SimpleRNN(32, activation='relu', return_sequences=False)(inputs)

        # Hidden dense layer followed by a 3-class softmax output layer
        layer1 = Dense(32, activation='relu')(rnn_layer)
        output_layer = Dense(3, activation='softmax')(layer1)

        # Build the model
        rnn_model = Model(inputs=inputs, outputs=output_layer)

        # Compile with the Adam optimizer and sparse categorical cross-entropy loss
        rnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

        return rnn_model

    input_shape = (X_train.shape[1],X_train.shape[2])

    model = simple_RNN_model(input_shape)

    # Summary of Model
    model.summary()

    model.fit(X_train,y_train,batch_size=16,epochs=10,validation_data=(X_test,y_test))

    y_pred = model.predict(X_test)
    y_pred = [np.argmax(i) for i in y_pred]

    print("___Performance_Metrics___\n")
    print('Classification_Report:\n',classification_report(y_test, y_pred))
    print('Confusion_Matrix:\n',confusion_matrix(y_test, y_pred))
    print('\n')
    print('Accuracy_Score: ',accuracy_score(y_test, y_pred))
    print('F1_Score (macro): ', f1_score(y_test, y_pred, average='macro'))
    print('Recall_Score (macro): ', recall_score(y_test, y_pred, average='macro'))
    print('Precision_Score (macro): ', precision_score(y_test, y_pred, average='macro'))
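
  • # Optional: Plot Training History (sketch, not part of the original script)
    As an optional extension, the History object returned by model.fit() can be plotted to compare training and validation accuracy per epoch; matplotlib is already imported above. This sketch assumes the fit call above is replaced by one that captures its return value, and the variable name history is illustrative.

    history = model.fit(X_train, y_train, batch_size=16, epochs=10,
                        validation_data=(X_test, y_test))

    # Plot training vs. validation accuracy per epoch
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.title('Training History')
    plt.legend()
    plt.show()
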
Screenshots
  • RNN Model Output Screenshot