How to Build and Evaluate a Simple RNN Model for Multiclass Student Risk Level Prediction

Condition for Building and Evaluating a Simple RNN Model for Multiclass Student Risk Level Prediction

  • Description:
    The process involves preprocessing the student monitoring data, encoding categorical variables, and scaling the features for model training. A Simple Recurrent Neural Network (RNN) is then built to predict risk levels, with the feature matrix reshaped into the 3-D form the RNN expects. The model is evaluated with accuracy, F1-score, recall, precision, a confusion matrix, and a classification report.
Step-by-Step Process
  • Import Libraries:
    Import essential libraries like pandas, numpy, sklearn, matplotlib, and tensorflow for data processing, model building, and evaluation.
  • Load and Inspect Data:
    Load the student monitoring dataset using pd.read_csv() for analysis and preprocessing.
  • Data Integrity Check:
    Check for NaN and null values using df.isna().sum() and df.isnull().sum().
  • Feature Extraction:
    Convert the 'Date' column to datetime and extract day, month, and year.
  • Encoding Categorical Variables:
    Encode categorical columns using LabelEncoder for model compatibility.
  • Data Scaling:
    Scale features using StandardScaler for uniform contribution in model training.
  • Data Reshaping:
    Reshape the feature data into the 3-D (samples, timesteps, features) form the RNN expects; see the shape-check sketch after this list.
  • Build and Train Model:
    Build a Simple RNN model, train it on the data, and evaluate performance using accuracy, F1-score, recall, and precision.
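  • Shape-Check Sketch (illustrative):
    SimpleRNN layers expect 3-D input of shape (samples, timesteps, features). Because each student record is treated as a single time step here, the 2-D feature matrix gains a middle axis of length 1 before training. The following minimal, self-contained check uses a small made-up array; the array and its dimensions are illustrative and not taken from the dataset.

    import numpy as np

    # Hypothetical feature matrix: 4 samples x 5 features
    x_demo = np.arange(20, dtype=float).reshape(4, 5)
    print(x_demo.shape)   # (4, 5)

    # Insert a time-step axis so an RNN sees (samples, timesteps, features)
    x_demo = x_demo.reshape(x_demo.shape[0], 1, x_demo.shape[1])
    print(x_demo.shape)   # (4, 1, 5)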
Sample Source Code
  • # Import Necessary Libraries
    import pandas as pd
    import numpy as np
    from sklearn.preprocessing import LabelEncoder, StandardScaler
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.layers import SimpleRNN, Dense, Input
    from tensorflow.keras.models import Model
    from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score, f1_score, recall_score, precision_score)

    import warnings
    warnings.filterwarnings("ignore")
    df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/student_monnitoring_data.csv")

    # Check NaN Values
    print("Check NaN Values\n")
    print(df.isna().sum())

    # Check Null Values
    print("\n")
    print("Check Null Values\n")
    print(df.isnull().sum())

    # Check dtypes
    print(df.dtypes)

    # Convert the date column to a datetime object
    df['Date'] = pd.to_datetime(df['Date'])

    # Split into day, month, and year
    df['day'] = df['Date'].dt.day
    df['month'] = df['Date'].dt.month
    df['year'] = df['Date'].dt.year

    # Encode string (object) columns as integers for model compatibility
    label = LabelEncoder()
    for i in df.columns:
        if df[i].dtypes == 'object':
            df[i] = label.fit_transform(df[i])

    # Compute the correlation matrix
    correlation_matrix = df.corr()

    # Display the correlation matrix
    print(correlation_matrix)

    # Plot the heatmap
    plt.figure(figsize=(10, 8))
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
    plt.title('Correlation Heatmap')
    plt.show()

    x = df.drop(['Date','Risk Level'],axis=1)
    y = df['Risk Level']

    # Scaling the data
    scaler = StandardScaler()
    x = scaler.fit_transform(x)

    # Reshape the Data for RNN model
    x = x.reshape(x.shape[0],1,x.shape[1])

    X_train,X_test,y_train,y_test = train_test_split(x,y,test_size=.2,random_state=42)

    def simple_RNN_model(input_shape):
        # Input layer: expects (timesteps, features) per sample
        inputs = Input(shape=input_shape)

        # RNN layer
        rnn_layer = SimpleRNN(32, activation='relu', return_sequences=False)(inputs)

        # Hidden dense layer followed by a 3-class softmax output layer
        layer1 = Dense(32, activation='relu')(rnn_layer)
        output_layer = Dense(3, activation='softmax')(layer1)

        # Build the model
        rnn_model = Model(inputs=inputs, outputs=output_layer)

        # Compile with the Adam optimizer and sparse categorical cross-entropy loss
        rnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

        return rnn_model

    input_shape = (X_train.shape[1],X_train.shape[2])

    model = simple_RNN_model(input_shape)

    # Summary of Model
    model.summary()

    model.fit(X_train,y_train,batch_size=16,epochs=10,validation_data=(X_test,y_test))

    y_pred = model.predict(X_test)
    y_pred = [np.argmax(i) for i in y_pred]

    print("___Performance_Metrics___\n")
    print('Classification_Report:\n',classification_report(y_test, y_pred))
    print('Confusion_Matrix:\n',confusion_matrix(y_test, y_pred))
    print('\n')
    print('Accuracy_Score: ',accuracy_score(y_test, y_pred))
    print('F1_Score (macro): ', f1_score(y_test, y_pred, average='macro'))
    print('Recall_Score (macro): ', recall_score(y_test, y_pred, average='macro'))
    print('Precision_Score (macro): ', precision_score(y_test, y_pred, average='macro'))
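
  • # Optional: Plot Training History (sketch, not part of the original script)
    As an optional extension, the History object returned by model.fit() can be plotted to compare training and validation accuracy per epoch; matplotlib is already imported above. This sketch assumes the fit call above is replaced by one that captures its return value, and the variable name history is illustrative.

    history = model.fit(X_train, y_train, batch_size=16, epochs=10,
                        validation_data=(X_test, y_test))

    # Plot training vs. validation accuracy per epoch
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.title('Training History')
    plt.legend()
    plt.show()
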
Screenshots
  • RNN Model Output Screenshot