How to Build and Evaluate an Artificial Neural Network (ANN) for House Resale Price Prediction Using Regression Analysis in Python
Condition for Building and Evaluating an Artificial Neural Network (ANN) for House Resale Price Prediction Using Regression Analysis in Python
Description:
This code implements a regression model using an Artificial Neural Network
(ANN) to predict housing resale prices. It includes data preprocessing
steps like encoding categorical variables, scaling features, and
visualizing feature correlations. The model is trained and evaluated using
key regression metrics such as MSE, RMSE, MAE, and R².
Step-by-Step Process
Import Libraries:
Import essential libraries like pandas, seaborn, and TensorFlow for
data manipulation, visualization, and building the regression model.
Load and Inspect Data:
Load the dataset, check column names, data points, and inspect for missing values.
Data Preprocessing:
Check for NaN values, encode categorical variables, and scale the features and the target for model training.
Feature Visualization:
Generate a heatmap to visualize correlations between features, aiding in feature selection.
Build and Train Model:
Define the neural network architecture, compile the model, and train it using the processed data.
Evaluate and Visualize:
Evaluate the model with regression metrics such as MSE, RMSE, MAE, and R² to assess its performance.
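The four metrics named in the evaluation step can be computed directly from their definitions. The following is a minimal sketch in plain Python, using made-up values rather than the housing data; the function name `regression_metrics` is hypothetical and is not part of the sample source code.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n           # mean of squared errors
    rmse = math.sqrt(mse)                          # same units as the target
    mae = sum(abs(e) for e in errors) / n          # mean of absolute errors
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}

# Illustrative values only (not taken from the dataset)
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is expressed in the same units as the target, which usually makes it easier to interpret than MSE. An R² of 1 corresponds to perfect predictions.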
Sample Source Code
# Import Necessary Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder,StandardScaler
import warnings
warnings.filterwarnings("ignore")
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Dropout
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
# Load the dataset (adjust the path to your local copy)
df = pd.read_csv("/home/soft12/Downloads/sample_dataset/Website/Dataset/hdb_presale_prices_2015-2024_cleaned_regression.csv")
# Print Columns
print(df.columns)
# Print Data
print("Data_points")
print(df.head())
# Check for missing values (isnull() is an alias of isna(), so one call is enough)
print("Check missing values\n")
print(df.isna().sum())
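# Compute the correlation matrix on numeric columns only
# (recent pandas versions raise an error if text columns are included)
corr = df.corr(numeric_only=True)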
# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(corr, annot=True, cmap='Blues', fmt='.2f')
plt.title('Correlation Heatmap')
plt.show()
# Split the Dependent and Independent Variables
x = df.drop('resale_price',axis=1)
y = df['resale_price']
# Scale the target (resale_price) to the [0, 1] range
from sklearn.preprocessing import MinMaxScaler
scaler_y = MinMaxScaler()
y = scaler_y.fit_transform(y.values.reshape(-1, 1))
# Encode the year column as integer labels
label = LabelEncoder()
x['year'] = label.fit_transform(x['year'])
# Convert the remaining object (text) columns to integer labels
for i in x.columns:
    dtype = x[i].dtypes
    if dtype == object:
        x[i] = label.fit_transform(x[i])
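# Split the data for training and testing
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
# Fit the feature scaler on the training set only, then apply it to both sets,
# so that no statistics from the test set leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)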
def ANN_model_regression(input_shape):
    # Input layer
    inputs = Input(shape=(input_shape,))
    # Hidden layers
    layer1 = Dense(32, activation='relu')(inputs)
    drop1 = Dropout(0.2)(layer1)
    layer2 = Dense(64, activation='relu')(drop1)
    drop2 = Dropout(0.2)(layer2)
    # Output layer
    output_layer = Dense(1, activation='linear')(drop2)
    # Build the model
    ann_model = Model(inputs=inputs, outputs=output_layer)
    # Compile the model with Adam optimizer and regression loss function
    ann_model.compile(optimizer='adam', loss='mse', metrics=['mean_squared_error'])
    return ann_model
model = ANN_model_regression(X_train.shape[1])
# Summary of Model
model.summary()
model.fit(X_train,y_train,batch_size=32,epochs=10,validation_data=(X_test,y_test))
y_pred = model.predict(X_test)
y_pred = y_pred.ravel()
# Compare actual (y_test) and predicted (y_pred) values; both are in the scaled [0, 1] target units
print("___Performance_Metrics___\n")
print('Mean Squared Error (MSE): ', mean_squared_error(y_test, y_pred))
print('Root Mean Squared Error (RMSE):', mean_squared_error(y_test, y_pred) ** 0.5) # RMSE as the square root of MSE; newer scikit-learn releases no longer accept squared=False
print('Mean Absolute Error (MAE): ', mean_absolute_error(y_test, y_pred))
print('R-squared (R2 Score): ', r2_score(y_test, y_pred))
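Because the target is scaled with MinMaxScaler before training, the error metrics printed by this code are in scaled units between 0 and 1, not in prices. One possible way to report errors in original price units is to invert the scaling before computing the metric. The sketch below uses a small made-up array in place of the real data; the names `prices`, `scaled_true` and `scaled_pred` are hypothetical and stand in for the resale_price column, y_test and y_pred.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error

# Made-up prices standing in for the resale_price column
prices = np.array([[100000.0], [300000.0], [500000.0]])
scaler_y = MinMaxScaler()
scaled_true = scaler_y.fit_transform(prices)  # scaled to the [0, 1] range

# Pretend model output in scaled units, 1-D as after y_pred.ravel()
scaled_pred = np.array([0.1, 0.5, 0.9])

# inverse_transform expects a 2-D array, so reshape the predictions first
true_prices = scaler_y.inverse_transform(scaled_true).ravel()
pred_prices = scaler_y.inverse_transform(scaled_pred.reshape(-1, 1)).ravel()

# The error is now expressed in the same units as the original prices
mae_prices = mean_absolute_error(true_prices, pred_prices)
print("MAE in price units:", mae_prices)
```

In the sample code the same idea would apply to `scaler_y.inverse_transform(y_test)` and `scaler_y.inverse_transform(y_pred.reshape(-1, 1))`. R² is unaffected by this rescaling, while MSE, RMSE and MAE change with the units.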