How to do Descriptive Statistics using Pandas Groupby Function in Python?
Conditions for Descriptive Statistics using Pandas Groupby Function in Python
Description: Descriptive statistics can be computed with the pandas library, most conveniently through the groupby() method. groupby() splits a DataFrame into groups based on the values in one or more columns. Statistical methods such as mean(), sum(), and count() are then applied to each group separately, giving one result per group.
Step-by-Step Process
Import pandas: Import the pandas library, conventionally under the alias pd.
Group the Data: Use the groupby() method to group data by one or more columns.
Apply Descriptive Statistics: Use functions such as mean(), sum(), count(), and describe() on the grouped data to compute descriptive statistics for each group.
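The three steps above can be sketched in a few lines. This is a minimal illustration only: the DataFrame and its Team and Score columns are hypothetical, made up for this sketch. It uses agg() to request several statistics in one call rather than calling each method separately.

```python
# Step 1: import pandas
import pandas as pd

# Hypothetical data, for illustration only
scores = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'B'],
    'Score': [10, 20, 5, 15, 25],
})

# Step 2: group the rows by the Team column and select Score
by_team = scores.groupby('Team')['Score']

# Step 3: apply several descriptive statistics at once with agg()
summary = by_team.agg(['count', 'mean', 'sum', 'min', 'max'])
print(summary)
```

The result has one row per group and one column per requested statistic, which makes it easy to compare groups side by side.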
Sample Code
import pandas as pd

# Sample sales data: one row per salesperson
data = {
    'Region': ['East', 'West', 'East', 'East', 'West', 'West'],
    'Salesperson': ['Arun', 'Bala', 'Charlie', 'Deva', 'karthi', 'Franklin'],
    'Sales': [200, 300, 150, 500, 400, 600]
}
df = pd.DataFrame(data)

# Group by Region and compute descriptive statistics on Sales
# (count, mean, std, min, quartiles, and max for each region)
grouped = df.groupby('Region')['Sales'].describe()
print(grouped)
print()
# Group by Region and calculate the mean and sum of Sales
mean_sales = df.groupby('Region')['Sales'].mean()
sum_sales = df.groupby('Region')['Sales'].sum()
print("Output for mean and sum:")
print("Mean Sales by Region:")
print(mean_sales)
print("\nSum of Sales by Region:")
print(sum_sales)
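The description mentions grouping by more than one column, which the sample does not show. The sketch below is one possible way to do it. It rebuilds the same Region and Sales values and adds a hypothetical Quarter column that is not part of the original data. It also uses named aggregation, where each keyword passed to agg() becomes an output column. The names total, average, and largest are arbitrary choices.

```python
import pandas as pd

# Same Region and Sales values as the sample, plus a hypothetical
# 'Quarter' column added purely for illustration
df = pd.DataFrame({
    'Region': ['East', 'West', 'East', 'East', 'West', 'West'],
    'Quarter': ['Q1', 'Q1', 'Q2', 'Q2', 'Q1', 'Q2'],
    'Sales': [200, 300, 150, 500, 400, 600],
})

# Named aggregation: each keyword becomes a column in the result
region_summary = df.groupby('Region')['Sales'].agg(
    total='sum',
    average='mean',
    largest='max',
)
print(region_summary)

# Grouping by two columns gives one result per (Region, Quarter) pair
by_region_quarter = df.groupby(['Region', 'Quarter'])['Sales'].sum()
print(by_region_quarter)
```

Grouping by a list of columns returns a result indexed by every combination that occurs in the data. A single value can be looked up with a tuple, for example by_region_quarter[('East', 'Q2')].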