How to Calculate the Measure of Central Tendency for a Dataset Using Python
Condition for Calculating the Measures of Central Tendency in a Dataset Using Python
Description:
Measures of central tendency are statistical values that represent the center of a dataset.
The three key measures are:
Mean: The average value of the dataset, calculated by summing all the values and dividing by the count of values.
Median: The middle value when the data is ordered. If the dataset has an odd number of values, it's the middle value; if even, it's the average of the two middle values.
Mode: The most frequent value in the dataset.
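To make the three definitions concrete, here is a minimal sketch using Python's built-in statistics module. The sample lists are made up for illustration and are not taken from any dataset.

```python
import statistics

# Hypothetical sample values, chosen only for illustration
odd_values = [2, 3, 3, 5, 7]    # odd count: the median is the single middle value
even_values = [2, 3, 5, 7]      # even count: the median averages the two middle values

mean_value = statistics.mean(odd_values)       # (2 + 3 + 3 + 5 + 7) / 5
median_odd = statistics.median(odd_values)     # third of the five ordered values
median_even = statistics.median(even_values)   # (3 + 5) / 2
mode_value = statistics.mode(odd_values)       # 3 appears twice, more than any other value

print("Mean:", mean_value)
print("Median (odd count):", median_odd)
print("Median (even count):", median_even)
print("Mode:", mode_value)
```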
Step-by-Step Process
Import Required Libraries:
Import the necessary libraries such as Pandas for data manipulation.
Load Dataset:
Load the dataset that contains the data you want to analyze (e.g., 'company_sales_data.csv').
Calculate the Mean:
Use the DataFrame's mean() method to calculate the average of each numeric column in the dataset.
Calculate the Median:
Use the median() method to calculate the middle value of each numeric column in the dataset.
Calculate the Mode:
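Use the mode() method to identify the most frequent value in a specific column (e.g., 'facecream'). Because mode() returns every value tied for most frequent, take the first element when a single value is needed.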
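The steps above can be tried without the CSV file. The sketch below assumes a small in-memory DataFrame as a stand-in for 'company_sales_data.csv'; the 'month_number' and 'facewash' columns and all of the values are hypothetical, and only the 'facecream' column name comes from the steps above.

```python
import pandas as pd

# Hypothetical stand-in for 'company_sales_data.csv'; all values are made up
df = pd.DataFrame({
    "month_number": [1, 2, 3, 4, 5],
    "facecream": [2500, 2630, 2500, 3400, 3600],
    "facewash": [1500, 1200, 1340, 1130, 1740],
})

# numeric_only=True restricts the calculation to numeric columns
column_means = df.mean(numeric_only=True)
column_medians = df.median(numeric_only=True)

# mode() returns a Series because more than one value can tie for most frequent
facecream_modes = df["facecream"].mode()

print(column_means)
print(column_medians)
print(facecream_modes.tolist())
```

With a real file, the DataFrame construction would be replaced by a pd.read_csv() call on the dataset's path.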
Sample Source Code
# Import necessary libraries
import pandas as pd
# Load the dataset into a DataFrame
df = pd.read_csv('company_sales_data.csv')
print("ORIGINAL DATASET: ")
print(df)
print()
# Mean calculation (numeric_only=True skips any non-numeric columns)
print("MEAN VALUE OF THE DATASET")
print(df.mean(numeric_only=True))
print()
# Median calculation (numeric_only=True skips any non-numeric columns)
print("MEDIAN VALUE OF THE DATASET")
print(df.median(numeric_only=True))
print()
# Mode calculation (mode() returns all tied values; [0] takes the first)
print("MODE VALUE OF FACE CREAM IN THE DATASET:")
print(df['facecream'].mode()[0])
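One detail worth illustrating is what mode()[0] does when several values tie for most frequent. The following is a minimal sketch with made-up numbers, not values from the sales dataset: pandas returns the tied values in ascending order, so indexing with [0] keeps only the smallest of them.

```python
import pandas as pd

# Made-up values with a tie: 2500 and 3400 each appear twice
facecream = pd.Series([2500, 3400, 2500, 3400, 3600], name="facecream")

all_modes = facecream.mode()   # every tied value, sorted in ascending order
first_mode = all_modes[0]      # the single value that mode()[0] reports

print("All modes:", all_modes.tolist())
print("First mode:", first_mode)
```

If ties matter for the analysis, printing the full result of mode() rather than only its first element avoids silently dropping the other values.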