
PySpark Rename Columns – How to Rename Columns in PySpark DataFrame

Explore different ways to rename columns in a PySpark DataFrame, each illustrated with example code.

Written by Jagdeesh | 3 min read

In this blog post, we will focus on one of the common data wrangling tasks in PySpark – renaming columns. We will explore different ways to rename columns in a PySpark DataFrame and illustrate the process with example code.

Different ways to rename columns in a PySpark DataFrame

  1. Renaming Columns Using ‘withColumnRenamed’

  2. Renaming Columns Using ‘select’ and ‘alias’

  3. Renaming Columns Using ‘toDF’

  4. Renaming Multiple Columns

Let's start by importing the necessary libraries, initializing a PySpark session, and creating a sample DataFrame to work with.

python
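# findspark adds a local Spark installation to sys.path so pyspark can be imported.
# It can usually be omitted when pyspark was installed with pip.
import findspark
findspark.init()

from pyspark.sql import Row, SparkSession

# Create a session, or reuse one that is already running
spark = SparkSession.builder.appName("PySpark Rename Columns").getOrCreate()

# Each Row becomes one record; the keyword names become the column names
data = [Row(name="Alice", age=25, city="New York"),
        Row(name="Bob", age=30, city="San Francisco"),
        Row(name="Cathy", age=35, city="Los Angeles")]

sample_df = spark.createDataFrame(data)
sample_df.show()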
Output:
+-----+---+-------------+
| name|age|         city|
+-----+---+-------------+
|Alice| 25|     New York|
|  Bob| 30|San Francisco|
|Cathy| 35|  Los Angeles|
+-----+---+-------------+

1. Renaming Columns Using ‘withColumnRenamed’

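The ‘withColumnRenamed’ method is a simple way to rename a single column. It takes the existing column name and the new name and returns a new DataFrame; the original DataFrame is left unchanged.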

python
renamed_df = sample_df.withColumnRenamed("age", "user_age")

renamed_df.show()
Output:
+-----+--------+-------------+
| name|user_age|         city|
+-----+--------+-------------+
|Alice|      25|     New York|
|  Bob|      30|San Francisco|
|Cathy|      35|  Los Angeles|
+-----+--------+-------------+

2. Renaming Columns Using ‘select’ and ‘alias’

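You can also rename columns inside a ‘select’ by calling ‘alias’ on each column you want to rename. Note that ‘select’ returns only the columns you list, so every column you want to keep must be included.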

python
from pyspark.sql.functions import col

renamed_df = sample_df.select(col("name"), col("age").alias("user_age"), col("city"))

renamed_df.show()
Output:
+-----+--------+-------------+
| name|user_age|         city|
+-----+--------+-------------+
|Alice|      25|     New York|
|  Bob|      30|San Francisco|
|Cathy|      35|  Los Angeles|
+-----+--------+-------------+

3. Renaming Columns Using ‘toDF’

Another approach is the ‘toDF’ method, which renames every column at once. Pass the new column names as arguments, in the same order as the existing columns, with one name for each column:

python
renamed_df = sample_df.toDF("user_name", "user_age", "user_city")

renamed_df.show()
Output:
+---------+--------+-------------+
|user_name|user_age|    user_city|
+---------+--------+-------------+
|    Alice|      25|     New York|
|      Bob|      30|San Francisco|
|    Cathy|      35|  Los Angeles|
+---------+--------+-------------+

4. Renaming Multiple Columns

If you need to rename multiple columns at once, you can chain ‘withColumnRenamed’ calls.

python
renamed_df = sample_df.withColumnRenamed("name", "user_name") \
                      .withColumnRenamed("age", "user_age") \
                      .withColumnRenamed("city", "user_city")
renamed_df.show()
Output:
+---------+--------+-------------+
|user_name|user_age|    user_city|
+---------+--------+-------------+
|    Alice|      25|     New York|
|      Bob|      30|San Francisco|
|    Cathy|      35|  Los Angeles|
+---------+--------+-------------+

Alternatively, you can keep the old and new names in a dictionary and loop over it with ‘withColumnRenamed’.

python
columns_to_rename = {"name": "user_name", "age": "user_age", "city": "user_city"}

renamed_df = sample_df
for old_name, new_name in columns_to_rename.items():
    renamed_df = renamed_df.withColumnRenamed(old_name, new_name)

renamed_df.show()
Output:
+---------+--------+-------------+
|user_name|user_age|    user_city|
+---------+--------+-------------+
|    Alice|      25|     New York|
|      Bob|      30|San Francisco|
|    Cathy|      35|  Los Angeles|
+---------+--------+-------------+
python
spark.stop()

In this post, we explored different ways to rename columns in a PySpark DataFrame. We covered the ‘withColumnRenamed’, ‘select’ with ‘alias’, and ‘toDF’ methods, as well as techniques to rename multiple columns at once.

With this knowledge, you should be well-equipped to handle various column renaming scenarios in your PySpark projects.
