Skewness and Kurtosis – Peaks and Tails, Understanding Data Through Skewness and Kurtosis

Unravel the secrets of data distributions with skewness and kurtosis. A concise guide to understanding data asymmetry and tail behavior.

Written by Jagdeesh | 5 min read

Statistics has a variety of tools to help us understand and interpret data. Two such tools are skewness and kurtosis, which give us insights into the shape of a data distribution.

Let’s dive deeper into these concepts and understand their significance.

In this blog post, we will cover:

  1. Skewness
    1.1. Types of Skewness:
    1.2. Rules of Thumb for Skewness:
    1.3. Implications of Skewness:
  2. Kurtosis
    2.1. Types of Kurtosis:
    2.2. Implications of Kurtosis:
  3. Computing Skewness and Kurtosis
  4. Importance of Understanding Skewness and Kurtosis
  5. Conclusion

1. Skewness

Skewness measures the asymmetry of a data distribution. If you visualize your data using a histogram or frequency curve, skewness indicates which side of the distribution is more stretched out or elongated than the other, and which side has a tail.
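One common way to put a number on this asymmetry is the moment-based (Fisher-Pearson) coefficient: the third central moment divided by the second central moment raised to the power 1.5. The sketch below uses a small made-up sample (not from any dataset in this post), computes the coefficient by hand, and compares it with `scipy.stats.skew`, whose default `bias=True` setting uses the same formula.

```python
import numpy as np
from scipy.stats import skew

# Small made-up sample with one large value stretching the right tail
data = np.array([2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 12.0])

# Central moments: the average of (x - mean)^k
mean = data.mean()
m2 = np.mean((data - mean) ** 2)  # second central moment (population variance)
m3 = np.mean((data - mean) ** 3)  # third central moment

manual_skewness = m3 / m2 ** 1.5
scipy_skewness = skew(data)  # default bias=True uses the same formula

print(f"Manual skewness: {manual_skewness:.4f}")
print(f"SciPy skewness:  {scipy_skewness:.4f}")
```

The single large value (12) dominates the cubed deviations, which is why the result is positive.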

1.1. Types of Skewness:

  • Positive Skewness (Right-skewed): The right tail (larger values) is longer than the left tail (smaller values). The mean is typically greater than the median.

  • Negative Skewness (Left-skewed): The left tail (smaller values) is longer than the right tail (larger values). The mean is typically less than the median.

  • No Skewness: The distribution is symmetric. This does not necessarily mean the distribution is “normal”.
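To see the relationship between mean, median, and the sign of skewness, here is a minimal sketch on simulated data. The exponential distribution, the sample size, and the seed are arbitrary choices for illustration; negating the sample simply mirrors the long tail to the left.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=42)  # arbitrary seed, only for reproducibility

# Exponential data has a long right tail; negating it flips the tail to the left
right_skewed = rng.exponential(scale=2.0, size=10_000)
left_skewed = -right_skewed

for name, sample in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        f"{name}: mean={np.mean(sample):.2f}, "
        f"median={np.median(sample):.2f}, skewness={skew(sample):.2f}"
    )
```

In the right-skewed sample the mean sits above the median, and in the mirrored sample it sits below.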

1.2. Rules of Thumb for Skewness:

  • If skewness is less than -1 or greater than 1, the distribution is highly skewed.
  • If skewness is between -1 and -0.5 or between 0.5 and 1, the distribution is moderately skewed.
  • If skewness is between -0.5 and 0.5, the distribution is approximately symmetric.

These thresholds are a general guideline for interpreting a skewness value, and they vary slightly from source to source. Combined with the sign of the value, they read as follows:

  • Positive Value: The distribution is skewed to the right, so the right tail is longer or fatter than the left tail. A value greater than 1 indicates a highly right-skewed distribution; a value between 0.5 and 1 indicates moderate positive skew.

  • Negative Value: The distribution is skewed to the left, so the left tail is longer or fatter than the right tail. A value less than -1 indicates a highly left-skewed distribution; a value between -1 and -0.5 indicates moderate negative skew.

  • Near Zero: If the skewness is near 0, the data are fairly symmetrical. However, symmetry doesn’t necessarily imply “normality” (as in a normal distribution).
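The thresholds above can be wrapped in a small helper. This is only a sketch: `classify_skewness` is a hypothetical name, and treating values of exactly ±0.5 and ±1 as "moderately skewed" is an assumption, since the rule of thumb does not say which side the boundaries belong to.

```python
def classify_skewness(value: float) -> str:
    """Label a skewness value using the rule-of-thumb thresholds."""
    magnitude = abs(value)
    if magnitude < 0.5:
        return "approximately symmetric"
    # Assumption: the boundary values 0.5 and 1 count as moderate
    level = "highly skewed" if magnitude > 1 else "moderately skewed"
    direction = "right" if value > 0 else "left"
    return f"{level} to the {direction}"


for value in (1.8, 0.7, 0.31, -0.6, -1.4):
    print(f"{value:+.2f}: {classify_skewness(value)}")
```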

1.3. Implications of Skewness:

Skewed data can bias an analysis: the mean is pulled toward the long tail, so it no longer represents a typical value. In such cases, statistical techniques that assume normally distributed data might not be appropriate.
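One way to check whether a normality assumption is plausible is `scipy.stats.normaltest`, which combines skewness and kurtosis into a single test statistic. The sketch below runs it on a simulated, clearly right-skewed sample; the data, the seed, and the 0.05 cutoff are illustrative assumptions rather than recommendations.

```python
import numpy as np
from scipy.stats import normaltest, skew

rng = np.random.default_rng(seed=0)  # arbitrary seed
skewed_sample = rng.exponential(scale=1.0, size=1_000)

statistic, p_value = normaltest(skewed_sample)
print(f"Skewness: {skew(skewed_sample):.2f}")
print(f"normaltest statistic: {statistic:.1f}, p-value: {p_value:.3g}")

if p_value < 0.05:  # 0.05 is a conventional threshold, not a fixed rule
    print("Evidence against normality: consider a transformation or a non-parametric method.")
```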

2. Kurtosis

Kurtosis is commonly described as quantifying the sharpness of the peak and the thickness of the tails of a data distribution. In practice, it is driven mostly by the tails: in simpler words, it tells us how prone the data is to extreme values.
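Like skewness, kurtosis can be computed from central moments: the fourth central moment divided by the square of the second. The sketch below does this by hand on a small made-up sample and compares the result with `scipy.stats.kurtosis` called with `fisher=False`, which returns this same (non-excess) quantity.

```python
import numpy as np
from scipy.stats import kurtosis

# Small made-up sample with one extreme value
data = np.array([2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 12.0])

mean = data.mean()
m2 = np.mean((data - mean) ** 2)  # second central moment
m4 = np.mean((data - mean) ** 4)  # fourth central moment

manual_kurtosis = m4 / m2 ** 2
scipy_kurtosis = kurtosis(data, fisher=False)  # fisher=False: normal distribution = 3

print(f"Manual kurtosis: {manual_kurtosis:.4f}")
print(f"SciPy kurtosis:  {scipy_kurtosis:.4f}")
```

Raising deviations to the fourth power makes values far from the mean dominate the sum, which is why kurtosis is so sensitive to the tails.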

2.1. Types of Kurtosis:

  • Leptokurtic (Kurtosis > 3): Distributions with fatter tails and a sharper peak than the normal distribution. Higher susceptibility to outliers.

  • Platykurtic (Kurtosis < 3): Distributions with thinner tails and a more flattened peak than the normal distribution.

  • Mesokurtic (Kurtosis = 3): Distributions with similar kurtosis as the normal distribution.

(Note: The above values are based on the standard method of computing kurtosis, where the kurtosis of a normal distribution is defined as 3.)
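Many libraries report excess kurtosis instead, which subtracts 3 so that a normal distribution scores 0. In SciPy this is controlled by the `fisher` argument of `scipy.stats.kurtosis`, and the default is `fisher=True`. A quick sketch on simulated normal data (the seed and sample size are arbitrary):

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=1)  # arbitrary seed
sample = rng.normal(loc=0.0, scale=1.0, size=5_000)

pearson = kurtosis(sample, fisher=False)  # normal distribution is about 3
excess = kurtosis(sample)                 # default fisher=True, normal is about 0

print(f"Kurtosis (fisher=False): {pearson:.3f}")
print(f"Excess kurtosis (fisher=True): {excess:.3f}")
print(f"Difference: {pearson - excess:.1f}")
```

When comparing a value against the thresholds above, it is worth checking which convention produced it.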

2.2. Implications of Kurtosis:

A leptokurtic distribution has more frequent large jumps away from the mean than a normal distribution does. This can be a sign of volatility in financial contexts. Platykurtic distributions, on the other hand, produce fewer extreme values than a normal distribution with the same variance, which suggests more stable behavior.
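The following sketch illustrates the three cases on simulated data. The choice of distributions is an assumption made for illustration: Laplace as a leptokurtic example (theoretical kurtosis 6), normal as the mesokurtic reference (3), and uniform as a platykurtic example (1.8). It also counts how often a value falls more than three standard deviations from the mean.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=7)  # arbitrary seed
n = 100_000

samples = {
    "laplace": rng.laplace(size=n),             # heavier tails than normal
    "normal": rng.normal(size=n),               # reference distribution
    "uniform": rng.uniform(-1.0, 1.0, size=n),  # bounded, no tails
}

results = {}
for name, sample in samples.items():
    k = kurtosis(sample, fisher=False)
    # Share of observations more than 3 standard deviations from the mean
    extreme_share = np.mean(np.abs(sample - sample.mean()) > 3 * sample.std())
    results[name] = (k, extreme_share)
    print(f"{name:8s} kurtosis={k:.2f}  share beyond 3 sd={extreme_share:.4f}")
```

The higher the kurtosis, the larger the share of observations that land far from the mean.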

3. Computing Skewness and Kurtosis

Most statistical software can calculate skewness and kurtosis easily. In Python, for example, you can use the scipy.stats module:

python
# Import necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import skew, kurtosis
import pandas as pd

# Load the Iris dataset
url = 'https://raw.githubusercontent.com/selva86/datasets/master/Iris.csv'
iris = pd.read_csv(url)

iris.head()
   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa
python
# Extract 'sepal_length' data
sepal_length = iris['SepalLengthCm']

# Compute skewness and kurtosis
print(f"Skewness of sepal_length: {skew(sepal_length):.2f}")
print(f"Kurtosis of sepal_length: {kurtosis(sepal_length, fisher=False):.2f}")

python
Skewness of sepal_length: 0.31
Kurtosis of sepal_length: 2.43
python
# Visualization using histplot (distplot is deprecated in recent seaborn releases)
plt.figure(figsize=(10,6))
ax = sns.histplot(sepal_length, bins=30, stat='density', kde=True, color='skyblue', line_kws={'linewidth': 2})
ax.lines[0].set_color('red')  # draw the KDE curve in red
plt.axvline(x=sepal_length.mean(), color='green', linestyle='--', label='Mean')
plt.title('Distribution of Sepal Length')
plt.legend()
plt.show()

Let’s look at another example, this time using sepal width.

python
# Extract the sepal width data
sepal_width = iris['SepalWidthCm']

# Compute skewness and kurtosis
print(f"Skewness of sepal_width: {skew(sepal_width):.2f}")
print(f"Kurtosis of sepal_width: {kurtosis(sepal_width, fisher=False):.2f}")
python
Skewness of sepal_width: 0.33
Kurtosis of sepal_width: 3.24
python
# Visualization using histplot (distplot is deprecated in recent seaborn releases)
plt.figure(figsize=(10,6))
ax = sns.histplot(sepal_width, bins=30, stat='density', kde=True, color='skyblue', line_kws={'linewidth': 2})
ax.lines[0].set_color('red')  # draw the KDE curve in red
plt.axvline(x=sepal_width.mean(), color='green', linestyle='--', label='Mean')
plt.title('Distribution of Sepal Width')
plt.legend()
plt.show()
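
A practical caveat when reproducing numbers like these: pandas and SciPy use different defaults. `Series.skew()` and `Series.kurt()` in pandas apply a small-sample bias correction, and `kurt()` reports excess kurtosis, whereas `scipy.stats.skew` and `scipy.stats.kurtosis` default to `bias=True`. The sketch below uses simulated data (an assumption, so that it runs without downloading the Iris file) to show how to make the two libraries agree.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(seed=3)  # arbitrary seed
values = pd.Series(rng.gamma(shape=2.0, scale=1.0, size=150))
array = values.to_numpy()

print(f"pandas skew():              {values.skew():.4f}")
print(f"scipy skew (default):       {skew(array):.4f}")
print(f"scipy skew (bias=False):    {skew(array, bias=False):.4f}")

print(f"pandas kurt():              {values.kurt():.4f}")
print(f"scipy kurtosis (default):   {kurtosis(array):.4f}")
print(f"scipy kurtosis(bias=False): {kurtosis(array, bias=False):.4f}")
```

With `bias=False`, the SciPy results line up with the pandas ones; on small samples the default values differ noticeably.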

4. Importance of Understanding Skewness and Kurtosis

Skewness and kurtosis are crucial for various reasons:

  • Normality Tests: Many statistical tests assume the data is normally distributed. Skewness and kurtosis can indicate whether this assumption holds.

  • Risk Management: In finance, understanding the tails (extreme events) can be essential for risk assessment.

  • Data Preprocessing: Recognizing skewness might lead one to apply certain transformations, like logarithms, to make data more symmetric and meet modeling assumptions.
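As an illustration of the preprocessing point, the sketch below applies a log transform to simulated right-skewed data. The variable name `incomes` and the lognormal parameters are hypothetical, chosen only to produce strictly positive values with a long right tail; a log transform requires positive inputs.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=11)  # arbitrary seed

# Hypothetical, strictly positive data with a long right tail
incomes = rng.lognormal(mean=10.0, sigma=0.8, size=5_000)
log_incomes = np.log(incomes)

skew_before = skew(incomes)
skew_after = skew(log_incomes)

print(f"Skewness before log transform: {skew_before:.2f}")
print(f"Skewness after log transform:  {skew_after:.2f}")
```

Because this sample is lognormal by construction, the log transform makes it close to symmetric. Real data will rarely respond this cleanly, so it is worth re-checking skewness after any transformation.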

5. Conclusion

While skewness and kurtosis are just two of the many measures in statistics, they provide a deeper understanding of data distributions. By quantifying asymmetry and the propensity for extreme values, they serve as invaluable tools for researchers, analysts, and statisticians in various fields.
