How to Build and Evaluate an LSTM Model for Breast Cancer Classification

Condition for Building and Evaluating an LSTM Model for Breast Cancer Classification

  • Description:
    Building and evaluating an LSTM model for breast cancer classification starts with preprocessing the dataset: the target is label-encoded and the features are scaled. An LSTM layer is then trained to predict the diagnosis, and the model is assessed with accuracy, F1 score, recall, precision, and a confusion matrix heatmap.
Step-by-Step Process
  • Import Necessary Libraries:
    Pandas, Scikit-Learn, and TensorFlow are imported for data handling, preprocessing, model building, and evaluation. Matplotlib and Seaborn are imported for plotting.
  • Load the Dataset:
    The breast cancer dataset is loaded using Pandas read_csv() from the given file path. The dataset contains various features for prediction and the target variable "diagnosis."
  • Data Preprocessing:
    Missing values are counted per column using isnull().sum(), and the column data types are printed, to confirm that the dataset is complete before processing.
  • Feature Selection:
    The correlation matrix of features is computed to identify relationships between them, and a heatmap is plotted to visualize these correlations.
  • Label Encoding and Scaling:
    The target variable 'diagnosis' is label-encoded using LabelEncoder(), and features are standardized using StandardScaler().
  • Reshape Data for LSTM:
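    The scaled 2D feature array of shape (samples, features) is reshaped into a 3D array of shape (samples, 1, features), so that each sample is a sequence with a single timestep, as the LSTM layer expects.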
  • Build and Train LSTM Model:
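    The model consists of a single LSTM layer with 32 units followed by a one-unit sigmoid output layer. It is compiled with the Adam optimizer and binary cross-entropy loss, then trained for 10 epochs with a batch size of 16.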
  • Model Evaluation:
    The model's performance is evaluated using accuracy, F1 score, recall, precision, and a confusion matrix heatmap.
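The encoding, scaling, and reshaping steps above can be sketched on a toy array without the real dataset. This is a minimal illustration only: the two feature columns and the labels are made up, and the helper names encode_labels, standardize, and to_lstm_input are hypothetical stand-ins that mirror what LabelEncoder(), StandardScaler(), and the reshape call do.

```python
import numpy as np

def encode_labels(labels):
    """Map the sorted unique labels to integers, as LabelEncoder does."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return np.array([index[l] for l in labels]), classes

def standardize(x):
    """Scale each column to zero mean and unit variance (population std)."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return (x - mean) / std

def to_lstm_input(x):
    """Reshape (samples, features) to (samples, 1, features): one timestep per sample."""
    return x.reshape(x.shape[0], 1, x.shape[1])

# Toy data, made up for illustration (not taken from the breast cancer dataset)
features = np.array([[10.0, 200.0], [12.0, 220.0], [14.0, 240.0], [16.0, 260.0]])
diagnosis = ["M", "B", "B", "M"]

y, classes = encode_labels(diagnosis)
x = to_lstm_input(standardize(features))

print(classes)   # ['B', 'M']
print(y)         # [1 0 0 1]
print(x.shape)   # (4, 1, 2)
```

Because each row becomes a sequence of length one, the LSTM here sees no temporal structure; the reshape only satisfies the layer's 3D input requirement.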
Sample Source Code
  • # Import Necessary Libraries
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.preprocessing import LabelEncoder, StandardScaler
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.layers import LSTM, Dense, Input
    from tensorflow.keras.models import Model
    from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score, f1_score, recall_score, precision_score)

    df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/breast-cancer.csv")

    # Check for missing (null/NaN) values in each column
    print("Check null values")
    print(df.isnull().sum())

    # Check data types of columns
    print("Check data types of columns")
    print(df.dtypes)

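    # Find correlation between numeric columns only
    # (the string-valued 'diagnosis' column would otherwise raise an error in pandas 2.x)
    correlation_matrix = df.corr(numeric_only=True)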

    # Display the correlation matrix
    print(correlation_matrix)

    # Plot the heatmap
    plt.figure(figsize=(10, 8))
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
    plt.title('Correlation Heatmap')
    plt.show()

    x = df.drop('diagnosis', axis=1)
    y = df['diagnosis']

    # Label Encoding
    label = LabelEncoder()
    y = label.fit_transform(y)

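    # Split before scaling so the scaler is fitted on the training data only
    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

    # Scaling the data
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Reshape to (samples, timesteps=1, features) as required by the LSTM layer
    X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
    X_test = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])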

    def LSTM_model(input_shape):
        # Input layer
        inputs = Input(shape=input_shape)

        # LSTM layer
        lstm_layer = LSTM(32, activation='relu', return_sequences=False)(inputs)

        # Output layer
        output_layer = Dense(1, activation='sigmoid')(lstm_layer)

        # Build the model
        lstm_model = Model(inputs=inputs, outputs=output_layer)

        # Compile the model with Adam optimizer and binary crossentropy loss function
        lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

        return lstm_model

    input_shape = (X_train.shape[1], X_train.shape[2])
    model = LSTM_model(input_shape)

    # Summary of Model
    model.summary()

    model.fit(X_train, y_train, batch_size=16, epochs=10, validation_data=(X_test, y_test))

    # Predict probabilities, then threshold at 0.5 to get 0/1 class labels
    y_pred = model.predict(X_test)
    y_pred = (y_pred > 0.5).astype(int).ravel()

    # Calculate confusion matrix
    cm = confusion_matrix(y_test, y_pred)

    class_labels = ['Negative', 'Positive']

    # Plot the heatmap with correct labels
    plt.figure(figsize=(6, 5))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_labels, yticklabels=class_labels)
    plt.title('Confusion Matrix Heatmap')
    plt.xlabel('Predicted Label')
    plt.ylabel('True Label')
    plt.show()

    print("___Performance_Metrics___\n")
    print('Classification_Report:\n', classification_report(y_test, y_pred))
    print('Confusion_Matrix:\n', confusion_matrix(y_test, y_pred))
    print('Accuracy_Score: ', accuracy_score(y_test, y_pred))
    print('F1_Score: ', f1_score(y_test, y_pred))
    print('Recall_Score: ', recall_score(y_test, y_pred))
    print('Precision_Score: ', precision_score(y_test, y_pred))
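To make the reported metrics concrete, the sketch below derives accuracy, precision, recall, and F1 score from the four confusion matrix counts. It is an illustration only: the function name binary_metrics and the toy labels are assumptions, and the positive class is taken to be label 1, which matches the default of the Scikit-Learn scoring functions used above.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion matrix counts and the derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels, made up for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

m = binary_metrics(y_true, y_pred)
print(m["tp"], m["tn"], m["fp"], m["fn"])  # 3 3 1 1
print(m["accuracy"], m["precision"], m["recall"], m["f1"])
```

With labels ordered 0 then 1, Scikit-Learn's confusion_matrix lays these counts out as [[tn, fp], [fn, tp]], with rows for the true label and columns for the predicted label, which is the layout shown in the heatmap.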
Screenshots
  • LSTM Model Output Screenshot