Building and Evaluating an LSTM Model for Breast Cancer Classification

Condition for Building and Evaluating an LSTM Model for Breast Cancer Classification

Description:
Building and evaluating an LSTM model for breast cancer classification starts with preprocessing the dataset: the target is label-encoded and the features are scaled. A model with an LSTM layer is then trained to predict the cancer diagnosis. Its performance is assessed with accuracy, F1 score, recall, precision, and a confusion matrix heatmap.

Step-by-Step Process

Import Necessary Libraries:
Pandas is imported for data handling, Scikit-Learn for preprocessing and evaluation metrics, and TensorFlow (Keras) for model building. Matplotlib and Seaborn are included for plotting.
Load the Dataset:
The breast cancer dataset is loaded using Pandas read_csv() from the given file path. The dataset contains various features for prediction and the target variable "diagnosis."
Data Preprocessing:
The code counts missing values per column using isnull().sum() to confirm that the dataset is complete before processing, and prints the data type of each column.
Feature Selection:
The correlation matrix of the numeric features is computed to identify relationships between them, and a heatmap is plotted to visualize these correlations. The sample code uses this step for exploration only; no features are dropped.
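One way to act on a correlation matrix is to list the feature pairs whose absolute correlation exceeds a cutoff. The following is a minimal sketch on toy data using NumPy. The function name highly_correlated_pairs, the feature names, and the 0.9 threshold are illustrative assumptions and are not part of the sample code.

```python
import numpy as np


def highly_correlated_pairs(X, names, threshold=0.9):
    """Return (name_i, name_j, r) for feature pairs with |r| > threshold."""
    # rowvar=False treats each column as one feature
    corr = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Toy data (hypothetical): column "b" is exactly twice column "a"
X = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
    [5.0, 10.0, 3.0],
])
pairs = highly_correlated_pairs(X, ["a", "b", "c"])
print(pairs)
```

Only the pair ("a", "b") is reported here, because "c" is only weakly correlated with the other two columns. Whether to drop one feature of such a pair is a modelling decision that depends on the dataset.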
Label Encoding and Scaling:
The target variable 'diagnosis' is label-encoded using LabelEncoder(), and features are standardized using StandardScaler().
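To make the two transformations concrete, here is a NumPy-only sketch of what label encoding and standardization compute. It is not Scikit-Learn's implementation, and the toy labels and values are made up. Computing the scaling statistics from the training rows only is an assumption about good practice that the description above does not state.

```python
import numpy as np

# Toy target column (hypothetical values): B = benign, M = malignant
diagnosis = np.array(["M", "B", "B", "M", "B"])

# Label encoding: classes are sorted, then each label becomes its index
classes, y = np.unique(diagnosis, return_inverse=True)
print(classes, y)

# Toy feature matrix: first 4 rows act as training data, last row as test data
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
X_train, X_test = X[:4], X[4:]

# Standardization: subtract the mean and divide by the standard deviation,
# both computed from the training rows only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std
print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

After scaling, each training column has mean 0 and standard deviation 1. The test row is transformed with the same training statistics, so it does not influence them.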
Reshape Data for LSTM:
The data is reshaped into a 3D array of shape (samples, timesteps, features), with a single timestep per sample, which is the input layout an LSTM layer expects.
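The reshape can be illustrated on a small array. This is a minimal sketch with made-up values; only the shapes matter.

```python
import numpy as np

# Toy scaled feature matrix: 4 samples, 3 features (hypothetical values)
x = np.arange(12, dtype=float).reshape(4, 3)

# Add a timestep axis of length 1: (samples, features) -> (samples, 1, features)
x_lstm = x.reshape(x.shape[0], 1, x.shape[1])

print(x.shape)       # (4, 3)
print(x_lstm.shape)  # (4, 1, 3)
```

With a single timestep, the LSTM sees each sample as a sequence of length one, so the values themselves are unchanged and only an axis is added.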
Build and Train LSTM Model:
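A model with a single LSTM layer (32 units) and a sigmoid output unit is built, compiled with the Adam optimizer and binary cross-entropy loss, and trained for 10 epochs with a batch size of 16.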
Model Evaluation:
The model's performance is evaluated using accuracy, F1 score, recall, precision, and a confusion matrix heatmap.
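The relationship between the confusion matrix and the reported scores can be sketched in plain Python. The helper name binary_metrics and the toy labels are hypothetical. The sample code itself relies on Scikit-Learn's metric functions rather than on this helper.

```python
def binary_metrics(y_true, y_pred):
    """Compute the confusion matrix and common scores for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        # Rows are true labels, columns are predicted labels, in order 0, 1
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (hypothetical): 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there is one false positive and one false negative among eight samples, so accuracy, precision, recall, and F1 all come out to 0.75.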

Sample Source Code

# Import Necessary Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import LSTM, Dense, Input
from tensorflow.keras.models import Model
from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score, f1_score, recall_score, precision_score)

df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/breast-cancer.csv")

# Check for missing values (isnull() flags both None and NaN)
print("Missing values per column:")
print(df.isnull().sum())

# Check data types of columns
print("Check data types of columns")
print(df.dtypes)

# Find Correlation (numeric columns only; 'diagnosis' is still a string column here)
correlation_matrix = df.corr(numeric_only=True)

# Display the correlation matrix
print(correlation_matrix)

# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Heatmap')
plt.show()

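# Separate features and target
x = df.drop('diagnosis', axis=1)
y = df['diagnosis']

# Label Encoding
label = LabelEncoder()
y = label.fit_transform(y)

# Split before scaling so that test-set statistics do not leak into the scaler
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# Scaling the data (fit on the training set only, then apply to the test set)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert the Size Suitable for LSTM: (samples, timesteps=1, features)
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])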

def LSTM_model(input_shape):
    # Input layer
    inputs = Input(shape=input_shape)

    # LSTM layer
    lstm_layer = LSTM(32, activation='relu', return_sequences=False)(inputs)

    # Output layer
    output_layer = Dense(1, activation='sigmoid')(lstm_layer)

    # Build the model
    lstm_model = Model(inputs=inputs, outputs=output_layer)

    # Compile the model with Adam optimizer and binary crossentropy loss function
    lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    return lstm_model

input_shape = (X_train.shape[1], X_train.shape[2])
model = LSTM_model(input_shape)

# Summary of Model
model.summary()

model.fit(X_train, y_train, batch_size=16, epochs=10, validation_data=(X_test, y_test))

# Predict probabilities and convert them to class labels with a 0.5 threshold
y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()

# Calculate confusion matrix
cm = confusion_matrix(y_test, y_pred)

class_labels = ['Negative', 'Positive']

# Plot the heatmap with correct labels
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_labels, yticklabels=class_labels)
plt.title('Confusion Matrix Heatmap')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

print("___Performance_Metrics___\n")
print('Classification_Report:\n', classification_report(y_test, y_pred))
print('Confusion_Matrix:\n', confusion_matrix(y_test, y_pred))
print('Accuracy_Score: ', accuracy_score(y_test, y_pred))
print('F1_Score: ', f1_score(y_test, y_pred))
print('Recall_Score: ', recall_score(y_test, y_pred))
print('Precision_Score: ', precision_score(y_test, y_pred))

S-Logix (OPC) Private Limited