Menu

SQL Window Functions Exercises – Practice to perfection

SQL window functions is considered a _'hard'_ concept in SQL. This set of exercises is designed to challenge your SQL muscle and help internalize data wrangling using window functions in SQL.

Written by Selva Prabhakaran | 5 min read

SQL window functions is considered a ‘hard’ concept in SQL. This set of exercises is designed to challenge your SQL muscle and help internalize data wrangling using window functions in SQL. The questions are of three levels of difficulty: Easy, Intermediate and Hard.

If you haven’t read the tutorial yet, read SQL Window Functions – Made Simple and Intuitive.

Try solving hands-on in MySQL using the link provided in each question.

Author’s note: The solutions provided is (mostly) one of the several possible solutions. The goal is not to conform with the provided solution. Try reach the desired output shown.

Q1. Find the running cumulative total demand.

Difficulty Level: Intermediate

Task:

From the demand2 table, find the cumulative total sum for qty.

Input:

sql
SELECT * FROM demand2;
dayqty
110
26
321
49
612
718
83
96
1023

Desired Output:

dayqtycumQty
11010
2616
32137
4946
61258
71876
8379
9685
1023108

Solve Hands-On: HERE, Table Schema and data: Gist

Show Solution

SELECT day, 
       qty,
       SUM(qty) OVER(ORDER BY day) as cumQty 
FROM demand2;

Q2. Find the running cumulative total demand by product.

Difficulty Level: Intermediate

Task:

From the demand table, find the cumulative total sum for qty for each product category.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Solution:

productdayqtyCUMSUM
A11010
A2616
A32137
A4946
A51965
B11212
B21830
B3333
B4639
B52362

Solve Hands-On: HERE, Table Schema and data: Gist

Show Solution

SELECT product,
	day,
        qty,
        SUM(qty) OVER(PARTITION BY product ORDER BY day) as CUMSUM
FROM demand;

Q3. When are the top 2 worst performing days for each product?

Difficulty Level: Intermediate

Task:

Extract the two worst performing days of each product in terms of number of qty sold. Paraphrasing it: Get the days corresponding to the two minimum most values of qty for each product.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Solution:

productdayqtyRN
A261
A492
B331
B462

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

with tbl as (SELECT PRODUCT,
                    qty,
                    day,
                    row_number() over (partition by product order by qty) as RN
FROM demand)
select product, day, qty
from tbl a
where RN in (1,2)

Q4. Find the percentage increase in qty compared to the previous day.

Difficulty Level: Intermediate

Task:

Sort the table by qty for each product and compute the percentage increase (or decrease) compared to the previous day.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Solution:

productdayqtyqty_lagperc_increase
A2610-40
A3216250
A4921-57.14
A5199111.11
B2181250
B3318-83.33
B463100
B5236283.33

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

with tbl as (select product, 
                    day, 
                    qty, 
                    lag(qty, 1) over (partition by product order by day) as qty_lag
from demand)
select *,
round(((qty - qty_lag)/qty_lag) * 100, 2) as perc_increase
from tbl
where qty_lag is not null;

Q5. Show the minimum and maximum ‘qty’ sold for each product as separate columns.

Difficulty Level: Easy

Task:

Create two new columns in the table that shows the minimum and the maximum quantity sold for each product.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Solution:

productdayqtymin_qtymax_qty
A110621
A26621
A321621
A49621
A519621
B112323
B218323
B33323
B46323
B523323

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

select *,
        min(qty) over(partition by product) as min_qty, 
        max(qty) over(partition by product) as max_qty 
from
demand;

Q6. Calculate the difference between the second largest and the second smallest sales qty

Difficulty Level: Hard

Task:

Calculate the diffence between the second largest and the second smallest sales qty for each product.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Output:

productdayqty
A49
A519
B46
B218

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

select product, 
 day, 
 qty
from
  (select *,
  row_number() over (partition by product order by qty) as rownum, 
  count(*) over (partition by product) as total_recs
  from
  demand) a
where (rownum = 2) or rownum = (total_recs - 1);

Q7. On each day, which product had the highest sales?

Difficulty Level: Intermediate

Task:

Create a table to show the day and the names of the product the the highest qty sale.

Input:

productdayqty
A110
A26
A321
A49
A519
B112
B218
B33
B46
B523

Desired Output:

dayproductqtymaxqty
1B1212
2B1818
3A2121
4A99
5B2323

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

with tbl as (
  select day, 
          product,
          qty,
          max(qty) over (partition by day) as maxqty
  from demand)
select day,
	product 
from tbl
where qty = maxqty;


Questions numbers Q8 to Q15 uses the same table as below.

Q8. Create row numbers in increasing order of sales.

Difficulty Level: Intermediate

Task:

Create row numbers in increasing order of sales, starting with 1 in each location

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

productlocationsalesrefundsrn
BS1731
BS1812
AS1923
AS21021
DS21232
CS233103
BS25184
BS2152175
DS3501
AS3502
FS32343
FS363194
CS373125
ES3101236

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

select *,
row_number() over(partition by location order by sales) as rn
from demand;

Q9. Find the top products (Rank 1 and 2) within each location

Difficulty: Intermediate

Task:

Extract the names of top selling products (rank 1 and 2) in each location. That is, the products with highest and second highest sales in each region.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

locationproducttotal_salesrn
S1B151
S1A92
S2B2031
S2C332
S3E1011
S3F862

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

drop table if exists temp;

create table temp as
with tbl as (select location, 
                    product,
                    sum(sales) as total_sales
from demand
group by location, product
order by location, total_sales desc)
select *, 
row_number() over (partition by location order by total_sales desc) as rn
from tbl;

select * from temp
where rn <= 2;

Q10. What is the total sales from Top 3 products in each location?

Difficulty: Intermediate

Task:

Calculate the total sales from the top 3 best selling products in each location.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

locationproducttotal_salesrn
S1B151
S1A92
S2B2031
S2C332
S2D123
S3E1011
S3F862
S3C733

Solve Hands-On: HERE, Table Schema and data: Gist

Show Solution

select location,
		product,
        total_sales,
        rn
from 
(select *, 
	row_number() over(partition by location order by total_sales desc) as rn
from (select location, 
              product, 
              sum(sales) as total_sales
      from demand
      group by location, product
      order by location, total_sales desc)
 ) 
 where rn <= 3;

Q11. Calculate the proportion of sales from each location

Difficulty: Intermediate

Task:

Calculate the proportion of sales from each location and show this against each record.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

productlocationsalesrefundstotal_locationtotal_salesratio
AS192245520.0435
BS181245520.0435
BS173245520.0435
BS25182585520.4674
CS233102585520.4674
DS21232585520.4674
AS21022585520.4674
BS2152172585520.4674
ES3101232705520.4891
CS373122705520.4891
FS363192705520.4891
FS32342705520.4891
DS3502705520.4891
AS3502705520.4891
Show Solution

SELECT *,
total_location,
round(cast(total_location as float) / cast(total_sales as float), 4) as ratio
from
(
  SELECT *,
  sum(sales) over(partition by location rows between unbounded preceding and unbounded following) as total_location, 
  sum(sales) over(rows between unbounded preceding and unbounded following) as total_sales 
  FROM demand
);

Q12. Which products contribute to top 80% of total sales?

Difficulty: Hard

Task:

Calculate and extract the best selling products whose total sales does not exceed 80% of overall sales. Show the cumulative percentage contribution to sales as a percentage as well.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

productrntotal_product_salescum_total_salestotal_salescum_perc_sales
D117175520.03
A224415520.074
F3861275520.23
E41012285520.41
C51063345520.60

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

SELECT *,
round((cum_total_sales/total_sales), 4) as cum_perc_sales
FROM
(select product,
	row_number() over (order by total_product_sales) as rn,
        total_product_sales,
        sum(total_product_sales) over (order by total_product_sales) as cum_total_sales,
 	sum(total_product_sales) over (rows between unbounded preceding and unbounded following) as total_sales
from
  (select *, 
  sum(sales) as total_product_sales
  from demand
  group by product
  order by total_product_sales desc
  )
)
where perc_sales <= 0.8;

Q13. What is the median value of sales overall?

Difficulty: Intermediate

Task:

Calculate the median sales value overall irrespective of product or location.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

productlocationsalesrefunds
DS2123

Solve Hands-On: HERE, Table Schema: Gist

Note: SQLite does not support floor and ceil functions, so make sure to try out in MySQL or a db that supports these functions.
ion;
```

Show Solution

select location, 
       avg(sales) as median
from
  (select *,
  row_number() over (order by sales) as rn,
  count(*) over () as total_rows
  from demand) as tbl
where rn in (floor(total_rows/2), ceil(total_rows/2))
group by location;

Q14. What is the median value of sales for each location?

Difficulty: Hard

Task:

Calculate the median sales value across all product within each location.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

locationmedian_sales
S17.5000
S222.5000
S323.0000

Note: SQLite does not support floor and ceil functions, so make sure to try out in MySQL or a db that supports these functions.

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

select location, 
       avg(sales) as median_sales
from
 (select *
  from
        (select *,
        row_number() over (partition by location order by sales) as rn,
        count(*) over (partition by location) as total_rows
        from demand) as tbl
where rn in (floor(total_rows/2), ceil(total_rows/2))) as sub
group by location;

Q15. Which product has the largest refund rate overall?

Difficulty: Intermediate

Task:

Find out which product had the highest refund rate and how much.

Input:

productlocationsalesrefunds
AS192
BS181
BS173
BS2518
CS23310
DS2123
AS2102
BS215217
ES310123
CS37312
FS36319
FS3234
DS350
AS350

Desired Output:

producttotal_salestotal_refundsrefund_rate
F86230.267442

Solve Hands-On: HERE, Table Schema: Gist

Show Solution

with tbl as (select *,
      cast(total_refunds/total_sales as float) as refund_rate
  from
  (
    select product,
            sum(sales) as total_sales,
            sum(refunds) as total_refunds
    from demand
    group by product
  ) as sub
  order by refund_rate desc)
  select * from tbl
  where refund_rate >= (select max(refund_rate) from tbl);

Free Course
Master Core Python — Your First Step into AI/ML

Build a strong Python foundation with hands-on exercises designed for aspiring Data Scientists and AI/ML Engineers.

Start Free Course
Trusted by 50,000+ learners
Related Course
Master SQL — Hands-On
Join 5,000+ students at edu.machinelearningplus.com
Explore Course
Free Callback - Limited Slots
Not Sure Which Course to Start With?
Talk to our AI Counsellors and Practitioners. We'll help you clear all your questions for your background and goals, bridging the gap between your current skills and a career in AI.
10-digit mobile number
📞
Thank You!
We'll Call You Soon!
Our learning advisor will reach out within 24 hours.
(Check your inbox too — we've sent a confirmation)
⚡ Before you go

Python.
SQL. NumPy.
All free.

Get the exact 10-course programming foundation that Data Science professionals use.

🐍
Core Python — from first line to expert level
📈
NumPy & Pandas — the #1 libraries every DS job needs
🗃️
SQL Levels I–III — basics to Window Functions
📄
Real industry data — Jupyter notebooks included
R A M S K
57,000+ students
★★★★★ Rated 4.9/5
⚡ Before you go
Python. SQL.
All Free.
R A M S K
57,000+ students  ★★★★★ 4.9/5
Get Free Access Now
10 courses. Real projects. Zero cost. No credit card.
New learners enrolling right now
🔒 100% free ☕ No spam, ever ✓ Instant access
🚀
You're in!
Check your inbox for your access link.
(Check Promotions or Spam if you don't see it)
Or start your first course right now:
Start Free Course →
Scroll to Top
Scroll to Top
Course Preview

Machine Learning A-Z™: Hands-On Python & R In Data Science

Free Sample Videos:

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science