How to Build a Multi-Layer Perceptron Using Scikit-Learn in Python?
Condition for Building a Multi-Layer Perceptron Using Scikit-Learn in Python
Description: The code implements a Multi-Layer Perceptron (MLP) classifier to predict breast cancer diagnosis (malignant or benign) from clinical features. It preprocesses the dataset by scaling the input features and encoding the target labels, then trains the MLP model. The model's performance is evaluated using metrics like accuracy, precision, recall, and F1 score.
Step-by-Step Process
Step1: Libraries like pandas, matplotlib, seaborn, and sklearn are imported to handle data manipulation, visualization, and machine learning tasks.
Step2: The breast cancer dataset is loaded from a CSV file using pandas.read_csv() to begin the analysis and processing.
Step3: The dataset is checked for missing values using df.isnull().sum(), which reports the number of null entries in each column.
Step4: A correlation matrix is generated using df.corr() to identify relationships between numeric features, followed by a heatmap to visualize these correlations.
Step5: A bar plot is created to check the distribution of the target class (diagnosis), ensuring the classes are balanced or identifying any imbalance.
Step6: The features (x) and target (y) are separated, where x contains all columns except the diagnosis column, and y contains the target class values.
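Step7: The target labels (y) are converted into integer codes using LabelEncoder(). MLPClassifier can be trained on string labels directly, but integer codes let the binary metrics (F1 score, recall, precision) run with their default positive label of 1.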
Step8: Standard scaling is applied to the feature set (x) using StandardScaler(), which rescales each feature to zero mean and unit variance. MLPs train more reliably when all inputs share a comparable scale.
Step9: The dataset is split into training and testing subsets using train_test_split() to evaluate the model’s performance on unseen data after training.
Step10: The MLP model is defined with a specified architecture, trained on the training data, and predictions are made on the test set. Various evaluation metrics like accuracy, F1 score, precision, and recall are then used to assess the model's performance.
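The preprocessing and training steps above can be sketched end to end as follows. This is a minimal sketch, not the article's listing: because the CSV path is local to the author's machine, it assumes scikit-learn's bundled load_breast_cancer dataset as a stand-in, and it wraps the scaler and the classifier in a pipeline so that the scaler is fit on the training rows only. The stratified split and max_iter=500 are illustrative choices as well.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled dataset stands in for the article's CSV file.
# Its labels are already integers (0 = malignant, 1 = benign),
# so no LabelEncoder step is needed here.
X, y = load_breast_cancer(return_X_y=True)

# Split first, so the test rows play no part in fitting the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The pipeline fits StandardScaler on X_train only and reuses those
# statistics when transforming X_test inside predict().
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 64, 32), max_iter=500, random_state=42),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("F1 (benign = 1):", f1_score(y_test, y_pred))
```

Keeping the scaler inside the pipeline means the same object can later be passed to cross-validation utilities without leaking test-set statistics into training.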
Sample Code
#Import Necessary Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder,StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
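from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score,
                             f1_score, recall_score, precision_score)
import warnings
# Silence all warnings to keep the output readable
# (this also hides MLPClassifier's ConvergenceWarning)
warnings.filterwarnings("ignore")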
#Load the dataset (adjust the path to your local copy of the CSV)
df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/breast-cancer.csv")
#check null values
df.isnull().sum()
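# Compute the correlation matrix between numeric features
# (numeric_only=True skips the text-valued diagnosis column, which would
# otherwise raise an error in pandas 2.x)
correlation_matrix = df.corr(numeric_only=True)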
# Display the correlation matrix
print(correlation_matrix)
# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()
x = df.drop('diagnosis',axis=1)
y = df['diagnosis']
# Count the number of samples per class
class_counts = y.value_counts()
# Plot the class distribution
plt.figure(figsize=(8, 6))
sns.barplot(x=class_counts.index, y=class_counts.values, palette="viridis")
plt.title('Class Balance Check', fontsize=16)
plt.xlabel('Class', fontsize=14)
plt.ylabel('Count', fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
#Convert the object-typed target column to numeric codes
label = LabelEncoder()
y = label.fit_transform(y)
#Scaling the input data
scaler = StandardScaler()
x = scaler.fit_transform(x)
#Split the train_test_data
X_train,X_test,y_train,y_test = train_test_split(x,y,test_size=.2,random_state=42)
# Define the MLP Classifier
mlp = MLPClassifier(hidden_layer_sizes=(64, 64, 32),
                    activation='relu',
                    solver='adam',
                    max_iter=10,
                    batch_size=2,
                    random_state=42,
                    verbose=True)
# Train the model
mlp.fit(X_train, y_train)
# Make predictions
y_pred = mlp.predict(X_test)
print("___Performance_Metrics___\n")
print('Classification_Report:\n',classification_report(y_test, y_pred))
print('Confusion_Matrix:\n',confusion_matrix(y_test, y_pred))
print('Accuracy_Score: ',accuracy_score(y_test, y_pred))
print('F1_Score: ',f1_score(y_test, y_pred))
print('Recall_Score: ',recall_score(y_test, y_pred))
print('Precision_Score: ',precision_score(y_test, y_pred))
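To interpret the confusion matrix and the binary scores printed above, it helps to know which class the encoder mapped to 1, since f1_score, recall_score, and precision_score treat label 1 as the positive class by default. The sketch below assumes the diagnosis column holds the strings 'M' and 'B'. That is an assumption about the CSV that the listing does not confirm. LabelEncoder assigns integer codes in sorted order of the class names.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of the diagnosis column (assumed values 'M' and 'B').
diagnosis = ["M", "B", "B", "M"]

label = LabelEncoder()
encoded = label.fit_transform(diagnosis)

# classes_ is sorted, so position 0 is 'B' and position 1 is 'M'.
mapping = {str(cls): int(code) for code, cls in enumerate(label.classes_)}
print(mapping)           # {'B': 0, 'M': 1}
print(encoded.tolist())  # [1, 0, 0, 1]
```

Under that assumption the reported F1, recall, and precision describe the 'M' class. Running label.classes_ on the real column shows the actual mapping.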