Build an AI Chatbot with Memory in Python (Step-by-Step)
Learn to build a Python AI chatbot with conversation memory using the OpenAI API. Covers token management, streaming, memory strategies, and saving chat history.
Build a Python chatbot that truly recalls your chat — from first API call to a full assistant with smart memory.
You ask your chatbot a question. It gives a great answer. You ask a follow-up — and it has no clue what you just said. Sound familiar?
That’s because LLMs are stateless. Each API call starts fresh with no memory at all. Your chatbot doesn’t “forget” — it never knew in the first place.
In this guide, you’ll build a chatbot from scratch with Python and the OpenAI API. First, you’ll make a basic bot. Then you’ll see why it fails at real talks. After that, you’ll add true memory.
By the end, you’ll have a working bot that holds multi-turn chats. It will track token costs, stream replies, and save chat logs to disk.
What Is a Conversational AI Chatbot?
An AI chatbot is a program that uses a large language model (LLM) to give human-like replies in a back-and-forth chat. Old-school bots match keywords to scripted lines. AI chatbots read context and craft fresh replies on the fly.
The hard part? Keeping the thread going. One question and one answer is simple. But a clear chat across dozens of messages — that takes memory.
Here’s how every AI chatbot works at its core:
- User sends a message — your Python code captures the input
- Your code packages the message — along with conversation history and a system prompt
- The LLM processes everything — and generates a response
- Your code stores the exchange — adding both messages to memory
- Repeat — each new message includes the growing history
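The loop above can be sketched in a few lines of plain Python. The `fake_llm` function here is a stand-in for a real API call, just to show the flow:

```python
def fake_llm(messages):
    # Stand-in for a real LLM call: echoes the latest user message
    return f"You said: {messages[-1]['content']}"

history = [{"role": "system", "content": "You are a helpful assistant."}]

def handle_turn(user_text):
    history.append({"role": "user", "content": user_text})    # capture + package
    reply = fake_llm(history)                                 # model generates
    history.append({"role": "assistant", "content": reply})   # store the exchange
    return reply

print(handle_turn("Hello"))         # You said: Hello
print(handle_turn("How are you?"))  # You said: How are you?
print(len(history))                 # 5 messages: 1 system + 2 per exchange
```

Every real chatbot in this guide follows this same shape; only the `fake_llm` line gets swapped for a real API call.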
Key Insight: LLMs have no built-in memory. They are stateless — input goes in, output comes out, nothing stays. “Memory” in a chatbot is just your code sending old messages along with each new one.
Setting Up Your Environment
You need Python 3.9+ and an OpenAI API key. Let’s get the tools in place.
bash
pip install openai tiktoken python-dotenv
- `openai` — the OpenAI Python SDK
- `tiktoken` — counts tokens so you can track costs
- `python-dotenv` — loads API keys from a `.env` file (keeps secrets out of code)
Make a .env file in your project folder with your API key:
bash
# .env
OPENAI_API_KEY=sk-your-api-key-here
Warning: Never put API keys right in your Python files. If you push code to GitHub with a visible key, bots will find it fast and run up charges. Always use a `.env` file or shell variables.
Now load your key and make the OpenAI client. You’ll use this setup code for the rest of the guide:
python
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Output:
python
# No output — the client is ready to use
No errors? Good — your setup works. The client object talks to the API for you.
Build a Basic Chatbot (Without Memory)
Let’s start with the simplest chatbot — one that answers one question. This shows the core API pattern.
The Chat API takes a list of messages. Each message has a role and content. The role is "system", "user", or "assistant". The system message sets the bot’s tone. The user message is your question.
python
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful Python tutor."},
{"role": "user", "content": "What is a list comprehension?"}
]
)
print(response.choices[0].message.content)
Output:
python
A list comprehension is a concise way to create lists in Python. Instead of
writing a for loop to build a list, you write the entire operation in a single
line. The syntax is: [expression for item in iterable if condition]. For
example, [x**2 for x in range(5)] produces [0, 1, 4, 9, 16].
That works great for a one-shot question. But watch what happens when we try a conversation.
The Memory Problem
Let’s ask two related questions — a question and then a follow-up:
python
# First question
response1 = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful Python tutor."},
{"role": "user", "content": "What is a list comprehension?"}
]
)
print("Q1:", response1.choices[0].message.content[:100])
# Follow-up question
response2 = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful Python tutor."},
{"role": "user", "content": "Can you show me a more complex example of that?"}
]
)
print("Q2:", response2.choices[0].message.content[:100])
Output:
python
Q1: A list comprehension is a concise way to create lists in Python. Instead of writing a for loop...
Q2: Sure! Could you clarify what specific topic you'd like a more complex example of? I'd be happy...
The second reply has no idea what “that” means. Each API call stands alone. The model only sees what you send in that one request.
This is the core problem. The fix is simple: send the chat history with every request.
Add Conversation Memory
The simplest chatbot memory is a Python list. You store every message in this list — both user and bot replies. Then you send the whole list with each API call. The model sees the full chat.
Here’s a chatbot with basic memory. The chat() function adds user input to the list, sends it all to the API, and saves the reply:
python
def create_chatbot(system_prompt="You are a helpful assistant."):
messages = [{"role": "system", "content": system_prompt}]
def chat(user_input):
messages.append({"role": "user", "content": user_input})
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages
)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})
return assistant_reply
return chat, messages
Output:
python
# No output — function defined
This uses a closure — the messages list lives on between calls to chat(). Let’s test it with the same two questions:
python
chat, history = create_chatbot("You are a helpful Python tutor.")
print("Q1:", chat("What is a list comprehension?"))
print()
print("Q2:", chat("Can you show me a more complex example of that?"))
Output:
python
Q1: A list comprehension is a concise way to create lists in Python. The syntax
is [expression for item in iterable if condition]. For example, [x**2 for x in
range(5)] gives you [0, 1, 4, 9, 16].
Q2: Here is a more complex list comprehension that flattens a matrix (list of
lists) into a single list:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
# Result: [1, 2, 3, 4, 5, 6, 7, 8, 9]
This uses a nested comprehension — the outer loop iterates over rows, and the
inner loop iterates over numbers within each row.
Now the bot recalls the first question. The follow-up gets a clear answer because the full history rides along with each API call.
Key Insight: Chat memory is just a Python list. You add each message, send the full list each time, and the model acts like it “recalls” the chat. No magic here — your code holds the state that the API can’t.
How Memory Grows
Let’s look inside the messages list to see what the API gets:
python
for i, msg in enumerate(history):
role = msg["role"].upper()
preview = msg["content"][:60]
print(f"[{i}] {role}: {preview}...")
Output:
python
[0] SYSTEM: You are a helpful Python tutor....
[1] USER: What is a list comprehension?...
[2] ASSISTANT: A list comprehension is a concise way to create lists in...
[3] USER: Can you show me a more complex example of that?...
[4] ASSISTANT: Here is a more complex list comprehension that flattens a ...
Each exchange adds two messages — one from you, one from the bot. After 50 exchanges, you’ll have 101 messages. That’s a lot of tokens. And tokens cost real money.
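The arithmetic is easy to check: one system message, plus two messages per exchange.

```python
def message_count(exchanges):
    # 1 system message, plus a user and an assistant message per exchange
    return 1 + 2 * exchanges

for n in (10, 50, 100):
    print(f"{n} exchanges -> {message_count(n)} messages")
# 10 exchanges -> 21 messages
# 50 exchanges -> 101 messages
# 100 exchanges -> 201 messages
```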
Build an Interactive Chat Loop
Let’s wrap this into a proper chatbot you can run in the terminal:
python
def run_chatbot():
print("AI Chatbot (type 'quit' to exit)")
print("-" * 40)
chat, _ = create_chatbot(
"You are a friendly and helpful assistant. "
"Keep responses concise — under 3 sentences when possible."
)
while True:
user_input = input("\nYou: ").strip()
if user_input.lower() in ("quit", "exit", "q"):
print("Goodbye!")
break
if not user_input:
continue
response = chat(user_input)
print(f"\nAssistant: {response}")
run_chatbot()
Output:
python
AI Chatbot (type 'quit' to exit)
----------------------------------------
You: What is the capital of France?
Assistant: The capital of France is Paris.
You: How many people live there?
Assistant: About 2.1 million people live in Paris proper. The greater Paris
metropolitan area has around 12 million residents.
You: quit
Goodbye!
You now have a working chatbot with memory. But there’s a hidden trap — it’ll bite you after about 20 to 30 exchanges.
Handle the Token Limit Problem
Each message you send costs tokens. As your chat grows, you send more tokens per call. Soon you’ll hit two walls: the model’s context limit and your budget.
Let’s build a token counter. You’ll see how many tokens each call uses. The tiktoken library counts tokens the same way OpenAI does — exact counts, no guessing.
python
import tiktoken
def count_tokens(messages, model="gpt-4o-mini"):
"""Count the exact number of tokens in a message list."""
encoding = tiktoken.encoding_for_model(model)
token_count = 0
for message in messages:
token_count += 4 # every message has overhead tokens
for key, value in message.items():
token_count += len(encoding.encode(value))
token_count += 2 # reply priming tokens
return token_count
Output:
python
# No output — function defined
Now let’s watch how token use grows as chats get longer. We’ll fake a 10-turn chat and count tokens at each step:
python
sample_messages = [
{"role": "system", "content": "You are a helpful assistant."},
]
# Simulate 10 exchanges
for i in range(10):
sample_messages.append(
{"role": "user", "content": f"Tell me fact number {i+1} about Python."}
)
sample_messages.append(
{"role": "assistant", "content": f"Here is fact {i+1}: Python was "
f"created by Guido van Rossum and released in 1991. "
f"It emphasizes code readability and simplicity."}
)
tokens = count_tokens(sample_messages)
print(f"After exchange {i+1:2d}: {len(sample_messages):3d} messages, {tokens:5d} tokens")
Output:
python
After exchange 1: 3 messages, 68 tokens
After exchange 2: 5 messages, 115 tokens
After exchange 3: 7 messages, 162 tokens
After exchange 4: 9 messages, 209 tokens
After exchange 5: 11 messages, 256 tokens
After exchange 6: 13 messages, 303 tokens
After exchange 7: 15 messages, 350 tokens
After exchange 8: 17 messages, 397 tokens
After exchange 9: 19 messages, 444 tokens
After exchange 10: 21 messages, 491 tokens
Token use grows linearly. In a real chat with longer replies, you can hit thousands of tokens within 15-20 turns. GPT-4o-mini supports a context window of up to 128K tokens, but you pay for every token you send.
Tip: Always track token use in live chatbots. Add a counter that warns you when a chat nears your budget. This stops surprise bills from long chats.
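One way to follow that tip is a small tracker object that accumulates usage from each call and warns past a threshold. This is an illustrative sketch, not part of the tutorial's chatbot code, and the threshold value is arbitrary:

```python
class TokenBudgetTracker:
    """Warn when cumulative token usage crosses a threshold."""

    def __init__(self, warn_at=1000):
        self.used = 0
        self.warn_at = warn_at

    def add(self, tokens):
        # Call this after each API response, e.g. with response.usage.total_tokens
        self.used += tokens
        if self.used >= self.warn_at:
            print(f"WARNING: {self.used} tokens used (threshold: {self.warn_at})")

tracker = TokenBudgetTracker(warn_at=1000)
tracker.add(400)   # under the threshold, stays quiet
tracker.add(700)   # total hits 1100, prints a warning
```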
Add a Token Budget
Let’s update our chatbot to enforce a token budget. When the conversation gets too long, we’ll trim the oldest messages while keeping the system prompt:
python
def create_chatbot_with_budget(system_prompt, token_budget=4000):
messages = [{"role": "system", "content": system_prompt}]
def trim_history():
"""Remove oldest messages (except system) to stay within budget."""
while count_tokens(messages) > token_budget and len(messages) > 2:
messages.pop(1) # remove oldest non-system message
def chat(user_input):
messages.append({"role": "user", "content": user_input})
trim_history()
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages
)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})
return assistant_reply
return chat, messages
Output:
python
# No output — function defined
The trim_history() function drops the oldest non-system messages one at a time, stopping once the token count falls below the budget. Simple but solid — the bot keeps recent context and “forgets” old turns.
Warning: Popping old messages is a one-way door — that context is gone for good. If the user asks about something from 20 turns ago, the bot won’t know. For chats where old context matters, try the summary method up next.
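You can see the one-way behavior with a plain list and no API call at all. Once `pop(1)` runs, the early turns are gone:

```python
messages = [
    {"role": "system", "content": "You are a tutor."},
    {"role": "user", "content": "Q1"},
    {"role": "assistant", "content": "A1"},
    {"role": "user", "content": "Q2"},
    {"role": "assistant", "content": "A2"},
]

# Trim the two oldest non-system messages, as trim_history() would
messages.pop(1)  # drops Q1
messages.pop(1)  # drops A1

print([m["content"] for m in messages])
# ['You are a tutor.', 'Q2', 'A2'] - Q1 and A1 are unrecoverable
```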
Memory Strategies: Window, Summary, and Hybrid
The “send it all” method breaks down in long chats. Here are three smarter ways to handle memory. Each trades off context quality against token cost.
Strategy 1: Sliding Window Memory
Keep only the last N turns. Drop older ones. This is the simplest plan and works great for most bots.
python
def create_windowed_chatbot(system_prompt, window_size=10):
"""Keep only the last `window_size` exchanges in memory."""
full_history = [{"role": "system", "content": system_prompt}]
def chat(user_input):
full_history.append({"role": "user", "content": user_input})
# Build windowed context: system + last N exchanges
windowed = [full_history[0]] # always keep system prompt
recent = full_history[1:][-window_size * 2:] # last N exchanges
windowed.extend(recent)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=windowed
)
assistant_reply = response.choices[0].message.content
full_history.append({"role": "assistant", "content": assistant_reply})
return assistant_reply
return chat, full_history
Output:
python
# No output — function defined
With window_size=10, the bot sends the system prompt plus the last 10 turns (20 messages). Token use stays flat no matter how long the chat runs.
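You can verify the flat cost without calling the API: the windowed slice never contains more than `1 + window_size * 2` messages, no matter how long the full history grows.

```python
window_size = 10
full_history = [{"role": "system", "content": "You are a helpful assistant."}]

for i in range(50):  # simulate 50 exchanges
    full_history.append({"role": "user", "content": f"Q{i}"})
    full_history.append({"role": "assistant", "content": f"A{i}"})
    windowed = [full_history[0]] + full_history[1:][-window_size * 2:]

print(len(full_history))  # 101: keeps growing
print(len(windowed))      # 21: capped at 1 system + 10 exchanges
```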
Strategy 2: Summary Memory
Don’t drop old messages — summarize them instead. The bot asks the LLM to compress the old chat into a short recap. That recap then stands in for all the older messages.
python
def summarize_messages(messages_to_summarize):
"""Use the LLM to summarize a list of messages."""
conversation_text = "\n".join(
f"{m['role']}: {m['content']}" for m in messages_to_summarize
)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{
"role": "user",
"content": f"Summarize this conversation in 2-3 sentences. "
f"Focus on key facts and decisions:\n\n"
f"{conversation_text}"
}]
)
return response.choices[0].message.content
def create_summary_chatbot(system_prompt, max_messages=20):
"""Chatbot that summarizes old messages when history gets long."""
messages = [{"role": "system", "content": system_prompt}]
summary = ""
def chat(user_input):
nonlocal summary
messages.append({"role": "user", "content": user_input})
# Summarize when history gets too long
if len(messages) > max_messages:
old_messages = messages[1:max_messages - 4]
summary = summarize_messages(old_messages)
# Keep system prompt + summary + recent messages
recent = messages[max_messages - 4:]
messages.clear()
messages.append({"role": "system", "content": system_prompt})
messages.append({
"role": "system",
"content": f"Previous conversation summary: {summary}"
})
messages.extend(recent)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages
)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})
return assistant_reply
return chat, messages
Output:
python
# No output — function defined
Summary memory saves the gist without the token cost. The catch? You lose fine details. The bot knows “the user asked about Python lists” but may forget the exact code shown.
Strategy 3: Hybrid Memory (Best of Both)
Mix window and summary memory. Keep recent messages in full, and summarize the rest. You get both sharp detail (recent turns) and broad context (old turns).
python
def create_hybrid_chatbot(system_prompt, window_size=8, summary_threshold=20):
"""Combine summary of old messages with a window of recent ones."""
all_messages = [{"role": "system", "content": system_prompt}]
conversation_summary = ""
def chat(user_input):
nonlocal conversation_summary
all_messages.append({"role": "user", "content": user_input})
if len(all_messages) > summary_threshold:
old = all_messages[1:-window_size * 2]
if old:
conversation_summary = summarize_messages(old)
keep = all_messages[-window_size * 2:]
all_messages.clear()
all_messages.append({"role": "system", "content": system_prompt})
all_messages.extend(keep)
# Build context with summary + recent window
context = [all_messages[0]] # system prompt
if conversation_summary:
context.append({
"role": "system",
"content": f"Earlier conversation summary: {conversation_summary}"
})
context.extend(all_messages[1:])
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=context
)
assistant_reply = response.choices[0].message.content
all_messages.append({"role": "assistant", "content": assistant_reply})
return assistant_reply
return chat, all_messages
Output:
python
# No output — function defined
Comparing Memory Strategies
Here’s when to use each strategy:
| Strategy | Token Usage | Context Quality | Best For |
|---|---|---|---|
| Full History | Grows linearly | Perfect recall | Short chats (under 20 exchanges) |
| Sliding Window | Constant | Recent only | Customer support, quick Q&A |
| Summary | Grows slowly | Approximate | Long research sessions |
| Hybrid | Moderate | Good balance | Production chatbots, complex tasks |
Key Insight: There’s no one “best” memory plan. The right pick depends on chat length and how much old context matters. For most bots, a sliding window of 10-15 turns hits the sweet spot.
typescript
{
type: 'exercise',
id: 'chatbot-memory-ex1',
title: 'Exercise 1: Build a Chatbot with Windowed Memory',
difficulty: 'beginner',
exerciseType: 'write',
instructions: 'Complete the `windowed_chat` function below. It should keep only the last 3 exchanges (6 messages) plus the system prompt. Use the provided `messages` list and `window_size` variable. After adding the user message, build a windowed list and print the number of messages being sent.',
starterCode: 'messages = [\n {"role": "system", "content": "You are a helpful assistant."}\n]\nwindow_size = 3\n\ndef windowed_chat(user_msg):\n messages.append({"role": "user", "content": user_msg})\n # Build windowed context: system + last window_size exchanges\n windowed = [messages[0]]\n recent = messages[1:] # FIX THIS LINE to keep only last window_size*2 messages\n windowed.extend(recent)\n # Simulate assistant reply\n messages.append({"role": "assistant", "content": f"Reply to: {user_msg}"})\n return len(windowed)\n\n# Simulate 5 exchanges\nfor i in range(5):\n count = windowed_chat(f"Question {i+1}")\n print(f"Exchange {i+1}: sending {count} messages")',
testCases: [
{ id: 'tc1', input: '', expectedOutput: 'Exchange 5: sending 7', description: 'After 5 exchanges with window=3, should send 7 messages (1 system + 6 recent)' }
],
hints: [
'Slice the non-system messages with [-window_size * 2:] to keep only the last N exchanges',
'Change the recent line to: recent = messages[1:][-window_size * 2:]'
],
solution: 'messages = [\n {"role": "system", "content": "You are a helpful assistant."}\n]\nwindow_size = 3\n\ndef windowed_chat(user_msg):\n messages.append({"role": "user", "content": user_msg})\n windowed = [messages[0]]\n recent = messages[1:][-window_size * 2:]\n windowed.extend(recent)\n messages.append({"role": "assistant", "content": f"Reply to: {user_msg}"})\n return len(windowed)\n\nfor i in range(5):\n count = windowed_chat(f"Question {i+1}")\n print(f"Exchange {i+1}: sending {count} messages")',
solutionExplanation: 'The key fix is slicing with [-window_size * 2:]. Since each exchange has 2 messages (user + assistant), window_size * 2 gives us the right number of individual messages to keep. The negative slice takes from the end of the list.',
xpReward: 15,
}
Add Streaming Responses
Without streaming, the user stares at a blank screen while the API builds the full reply. Streaming shows words as they come in — just like ChatGPT. It makes your bot feel way faster.
Turn on streaming with stream=True. You’ll get chunks instead of one big reply. Each chunk holds a small bit of text:
python
def chat_stream(messages):
"""Send messages and stream the response token by token."""
stream = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
stream=True
)
full_response = ""
for chunk in stream:
delta = chunk.choices[0].delta
if delta.content:
print(delta.content, end="", flush=True)
full_response += delta.content
print() # newline after streaming completes
return full_response
Output:
python
# No output — function defined
Let’s test it. Watch how the response appears word by word instead of all at once:
python
test_messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain what an API is in 2 sentences."}
]
result = chat_stream(test_messages)
Output:
python
An API (Application Programming Interface) is a set of rules that lets
different software programs talk to each other. Think of it like a waiter
in a restaurant — you tell the waiter what you want, the waiter tells the
kitchen, and then brings back your food.
The text shows up word by word as the model writes it. Users get instant feedback — they see the reply forming live.
Tip: Always use streaming in live bots. Even if total reply time stays the same, streaming feels 3-5x faster. Users start reading right away. The wait drops from “whole reply” to “first word” — usually under 500ms.
Streaming Chatbot with Memory
Let’s combine streaming with our conversation memory system:
python
def create_streaming_chatbot(system_prompt, window_size=10):
messages = [{"role": "system", "content": system_prompt}]
def chat(user_input):
messages.append({"role": "user", "content": user_input})
# Apply window
context = [messages[0]]
context.extend(messages[1:][-window_size * 2:])
stream = client.chat.completions.create(
model="gpt-4o-mini",
messages=context,
stream=True
)
full_response = ""
for chunk in stream:
delta = chunk.choices[0].delta
if delta.content:
print(delta.content, end="", flush=True)
full_response += delta.content
print()
messages.append({"role": "assistant", "content": full_response})
return full_response
return chat, messages
Output:
python
# No output — function defined
This pairs window memory with streaming. The bot recalls context and replies in real time.
Customize Your Chatbot’s Personality
The system prompt sets your bot’s tone, style, and skills. A sharp prompt turns a bland helper into a focused tool. Let’s look at some good patterns.
Here are three sample roles that show how the system prompt shapes replies:
python
# Personality 1: Concise technical expert
tech_prompt = (
"You are a senior Python developer. Give precise, code-focused answers. "
"Skip pleasantries. Use code examples over explanations when possible. "
"If the user's approach has issues, point them out directly."
)
# Personality 2: Patient beginner tutor
tutor_prompt = (
"You are a patient programming tutor teaching someone who has never "
"coded before. Use simple analogies. Explain every term. Celebrate "
"small wins. Never make the student feel stupid for asking."
)
# Personality 3: Data science advisor
ds_prompt = (
"You are a senior data scientist at a tech company. Help users with "
"ML model selection, feature engineering, and experiment design. "
"Always ask clarifying questions about the dataset before recommending "
"an approach. Mention tradeoffs for every suggestion."
)
Output:
python
# No output — prompts defined
Note: The system prompt rides with every API call, so it eats your token budget. Keep it short — 2 to 4 lines is best. A 500-word prompt wastes tokens on every turn.
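A quick sanity check on prompt size is the rough rule of thumb that one token is about four characters of English text (use `tiktoken`, shown earlier, when you need exact counts):

```python
def rough_tokens(text):
    # Rough heuristic: ~4 characters per token for English text
    return len(text) // 4

short_prompt = (
    "You are a senior Python developer. Give precise, code-focused answers."
)
long_prompt = "word " * 500  # roughly a 500-word prompt

print(rough_tokens(short_prompt))  # a couple dozen tokens per call
print(rough_tokens(long_prompt))   # hundreds of tokens spent on every turn
```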
System Prompt Best Practices
Follow these tips for good system prompts:
- Name the role — “You are a senior Python dev” not “You are helpful”
- Set the format — “Answer in bullet points” or “Use code examples”
- Add limits — “Keep replies under 100 words” or “Ask one follow-up”
- Say what NOT to do — “Never give medical advice” or “Don’t use jargon”
python
# A well-structured system prompt
system_prompt = (
"You are a Python code reviewer. "
"When the user shares code, respond with:\n"
"1. A one-sentence summary of what the code does\n"
"2. Any bugs or issues (if none, say 'No bugs found')\n"
"3. One specific improvement suggestion with a code example\n"
"Keep your total response under 150 words."
)
chat, _ = create_chatbot(system_prompt)
print(chat("Review this: x = [i for i in range(100) if i % 2 == 0]"))
Output:
python
**Summary:** Creates a list of even numbers from 0 to 98 using a list
comprehension with a filter condition.
**Bugs:** No bugs found.
**Improvement:** Use `range(0, 100, 2)` instead of filtering — it generates
only even numbers directly, which is faster and more readable:
x = list(range(0, 100, 2))
A clear system prompt gives you steady, expected output. This matters for bots built to do one thing — like code review, help desks, or teaching.
Save and Load Conversations
A bot that wipes its memory when you close the terminal isn’t much use. Let’s fix that — we’ll save chats to JSON files and load them back.
The json module works great here since our messages are plain dicts:
python
import json
from datetime import datetime
def save_conversation(messages, filepath="chat_history.json"):
"""Save conversation history to a JSON file."""
data = {
"saved_at": datetime.now().isoformat(),
"message_count": len(messages),
"messages": messages
}
with open(filepath, "w") as f:
json.dump(data, f, indent=2)
print(f"Saved {len(messages)} messages to {filepath}")
def load_conversation(filepath="chat_history.json"):
"""Load conversation history from a JSON file."""
with open(filepath, "r") as f:
data = json.load(f)
print(f"Loaded {data['message_count']} messages (saved {data['saved_at']})")
return data["messages"]
Output:
python
# No output — functions defined
Let’s test the save and load cycle:
python
# Create a chatbot and have a conversation
chat, history = create_chatbot("You are a helpful Python tutor.")
chat("What are decorators in Python?")
chat("Show me a simple example.")
# Save the conversation
save_conversation(history, "my_chat.json")
# Load it back
loaded_messages = load_conversation("my_chat.json")
print(f"\nFirst message role: {loaded_messages[0]['role']}")
print(f"Total messages: {len(loaded_messages)}")
Output:
python
Saved 5 messages to my_chat.json
Loaded 5 messages (saved 2026-03-22T14:30:00.000000)
First message role: system
Total messages: 5
Now you can pick up chats across sessions. Load the saved messages, hand them to a new bot, and carry on.
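Resuming looks something like this sketch. The file name and the sample history here are illustrative:

```python
import json

# Pretend this history came from an earlier session
saved = [
    {"role": "system", "content": "You are a helpful Python tutor."},
    {"role": "user", "content": "What are decorators?"},
    {"role": "assistant", "content": "A decorator wraps a function to extend it."},
]
with open("resume_demo.json", "w") as f:
    json.dump({"messages": saved}, f)

# Later session: load the history and continue where you left off
with open("resume_demo.json") as f:
    messages = json.load(f)["messages"]

messages.append({"role": "user", "content": "Show me one with arguments."})
print(len(messages))  # 4: old context plus the new question, ready to send
```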
Manage Multiple Conversations
For a live bot, you’ll want to track many chats by session ID. Here’s a simple session manager:
python
import os
class SessionManager:
def __init__(self, storage_dir="chat_sessions"):
self.storage_dir = storage_dir
os.makedirs(storage_dir, exist_ok=True)
def save(self, session_id, messages):
filepath = os.path.join(self.storage_dir, f"{session_id}.json")
data = {
"session_id": session_id,
"saved_at": datetime.now().isoformat(),
"messages": messages
}
with open(filepath, "w") as f:
json.dump(data, f, indent=2)
def load(self, session_id):
filepath = os.path.join(self.storage_dir, f"{session_id}.json")
if not os.path.exists(filepath):
return None
with open(filepath, "r") as f:
return json.load(f)["messages"]
def list_sessions(self):
files = [f.replace(".json", "") for f in os.listdir(self.storage_dir)
if f.endswith(".json")]
return sorted(files)
sessions = SessionManager()
print(f"Storage directory: {sessions.storage_dir}")
print(f"Active sessions: {sessions.list_sessions()}")
Output:
python
Storage directory: chat_sessions
Active sessions: []
Each chat gets its own JSON file. You can list sessions, load any past chat, and keep going.
typescript
{
type: 'exercise',
id: 'chatbot-persistence-ex2',
title: 'Exercise 2: Add Conversation Metadata',
difficulty: 'beginner',
exerciseType: 'write',
instructions: 'Modify the `save_with_metadata` function to save conversation data with extra metadata: the total word count across all messages and the number of user messages. Print the metadata after saving.',
starterCode: 'import json\n\nmessages = [\n {"role": "system", "content": "You are a helpful assistant."},\n {"role": "user", "content": "What is machine learning?"},\n {"role": "assistant", "content": "Machine learning is a branch of AI where computers learn patterns from data."},\n {"role": "user", "content": "Give me an example."},\n {"role": "assistant", "content": "Email spam filters learn to identify spam by analyzing thousands of labeled emails."}\n]\n\ndef save_with_metadata(messages):\n word_count = 0 # FIX: count total words across all message contents\n user_count = 0 # FIX: count messages where role is "user"\n \n for msg in messages:\n pass # Replace this line with your logic\n \n print(f"Total words: {word_count}")\n print(f"User messages: {user_count}")\n\nsave_with_metadata(messages)',
testCases: [
{ id: 'tc1', input: '', expectedOutput: 'User messages: 2', description: 'Should count 2 user messages' }
],
hints: [
'For word count, use len(msg["content"].split()) and add it to the total for each message',
'For user count, check if msg["role"] == "user" inside the loop and increment the counter'
],
solution: 'import json\n\nmessages = [\n {"role": "system", "content": "You are a helpful assistant."},\n {"role": "user", "content": "What is machine learning?"},\n {"role": "assistant", "content": "Machine learning is a branch of AI where computers learn patterns from data."},\n {"role": "user", "content": "Give me an example."},\n {"role": "assistant", "content": "Email spam filters learn to identify spam by analyzing thousands of labeled emails."}\n]\n\ndef save_with_metadata(messages):\n word_count = 0\n user_count = 0\n \n for msg in messages:\n word_count += len(msg["content"].split())\n if msg["role"] == "user":\n user_count += 1\n \n print(f"Total words: {word_count}")\n print(f"User messages: {user_count}")\n\nsave_with_metadata(messages)',
solutionExplanation: 'We loop through each message, split the content into words using .split() and add the count. For user messages, we check if the role equals "user" and increment the counter.',
xpReward: 15,
}
Build a Complete Chatbot Class
Let’s pull it all into one clean Chatbot class. This class packs in memory, streaming, token tracking, saving, and custom prompts:
python
import os
import json
import tiktoken
from datetime import datetime
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
class Chatbot:
"""A production-ready chatbot with memory, streaming, and persistence."""
def __init__(
self,
system_prompt="You are a helpful assistant.",
model="gpt-4o-mini",
window_size=15,
token_budget=8000,
storage_dir="chat_sessions"
):
self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
self.model = model
self.window_size = window_size
self.token_budget = token_budget
self.storage_dir = storage_dir
self.messages = [{"role": "system", "content": system_prompt}]
self.total_tokens_used = 0
os.makedirs(storage_dir, exist_ok=True)
def _count_tokens(self, messages):
"""Count tokens in a message list."""
encoding = tiktoken.encoding_for_model(self.model)
count = 0
for msg in messages:
count += 4
for value in msg.values():
count += len(encoding.encode(str(value)))
return count + 2
def _build_context(self):
"""Build windowed context within token budget."""
context = [self.messages[0]] # system prompt
recent = self.messages[1:][-self.window_size * 2:]
context.extend(recent)
# Trim further if over budget
while self._count_tokens(context) > self.token_budget and len(context) > 2:
context.pop(1)
return context
def chat(self, user_input, stream=True):
"""Send a message and get a response."""
self.messages.append({"role": "user", "content": user_input})
context = self._build_context()
if stream:
response_text = self._stream_response(context)
else:
response = self.client.chat.completions.create(
model=self.model,
messages=context
)
response_text = response.choices[0].message.content
self.total_tokens_used += response.usage.total_tokens
self.messages.append({"role": "assistant", "content": response_text})
return response_text
    def _stream_response(self, context):
        """Stream the response token by token."""
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=context,
            stream=True,
            stream_options={"include_usage": True}  # final chunk reports token usage
        )
        full_response = ""
        for chunk in stream:
            if chunk.choices:
                delta = chunk.choices[0].delta
                if delta.content:
                    print(delta.content, end="", flush=True)
                    full_response += delta.content
            elif chunk.usage:  # last chunk has empty choices and carries the usage totals
                self.total_tokens_used += chunk.usage.total_tokens
        print()
        return full_response
def save(self, session_id="default"):
"""Save conversation to disk."""
filepath = os.path.join(self.storage_dir, f"{session_id}.json")
data = {
"session_id": session_id,
"saved_at": datetime.now().isoformat(),
"model": self.model,
"total_tokens_used": self.total_tokens_used,
"messages": self.messages
}
with open(filepath, "w") as f:
json.dump(data, f, indent=2)
print(f"Saved to {filepath}")
def load(self, session_id="default"):
"""Load a previous conversation."""
filepath = os.path.join(self.storage_dir, f"{session_id}.json")
with open(filepath, "r") as f:
data = json.load(f)
self.messages = data["messages"]
self.total_tokens_used = data.get("total_tokens_used", 0)
print(f"Loaded {len(self.messages)} messages from {session_id}")
def stats(self):
"""Print conversation statistics."""
user_msgs = sum(1 for m in self.messages if m["role"] == "user")
tokens_now = self._count_tokens(self.messages)
print(f"Messages: {len(self.messages)} ({user_msgs} from user)")
print(f"Current tokens: {tokens_now}")
print(f"Total tokens used: {self.total_tokens_used}")
Output:
python
# No output — class defined
Here’s how to use the complete chatbot:
python
# Create and use the chatbot
bot = Chatbot(
system_prompt="You are a senior data scientist. Give concise, practical advice.",
window_size=10,
token_budget=4000
)
# Non-streaming mode for scripted usage
response = bot.chat("What is the best way to handle missing data?", stream=False)
print(response)
print()
# Check stats
bot.stats()
Output:
python
The best approach depends on your data and model. For small amounts of missing
data (under 5%), dropping rows is fine. For more, use imputation — median for
numeric columns, mode for categorical. Scikit-learn's SimpleImputer handles
both. For tree-based models, some implementations handle NaN natively (XGBoost,
LightGBM), so you may not need imputation at all.
Messages: 3 (1 from user)
Current tokens: 142
Total tokens used: 187
typescript
{
type: 'exercise',
id: 'chatbot-class-ex3',
title: 'Exercise 3: Add a Reset Method to the Chatbot',
difficulty: 'beginner',
exerciseType: 'write',
instructions: 'Add a `reset` method to the SimpleChatbot class below. The method should clear all messages except the system prompt, reset the exchange counter to 0, and print "Chat reset. System prompt preserved." Then test it by running the provided code.',
starterCode: 'class SimpleChatbot:\n def __init__(self, system_prompt):\n self.system_prompt = system_prompt\n self.messages = [{"role": "system", "content": system_prompt}]\n self.exchanges = 0\n\n def add_exchange(self, user_msg, bot_reply):\n self.messages.append({"role": "user", "content": user_msg})\n self.messages.append({"role": "assistant", "content": bot_reply})\n self.exchanges += 1\n\n def reset(self):\n pass # YOUR CODE HERE\n\n# Test\nbot = SimpleChatbot("You are a helpful assistant.")\nbot.add_exchange("Hello", "Hi there!")\nbot.add_exchange("How are you?", "I am doing well!")\nprint(f"Before reset: {bot.exchanges} exchanges, {len(bot.messages)} messages")\nbot.reset()\nprint(f"After reset: {bot.exchanges} exchanges, {len(bot.messages)} messages")',
testCases: [
{ id: 'tc1', input: '', expectedOutput: 'Chat reset. System prompt preserved.', description: 'Should print reset confirmation' },
{ id: 'tc2', input: '', expectedOutput: 'After reset: 0 exchanges, 1 messages', description: 'Should have 0 exchanges and 1 message after reset' }
],
hints: [
'Reset messages to a list containing only the system prompt: self.messages = [{"role": "system", "content": self.system_prompt}]',
'Set self.exchanges = 0 and print the confirmation message'
],
solution: 'class SimpleChatbot:\n def __init__(self, system_prompt):\n self.system_prompt = system_prompt\n self.messages = [{"role": "system", "content": system_prompt}]\n self.exchanges = 0\n\n def add_exchange(self, user_msg, bot_reply):\n self.messages.append({"role": "user", "content": user_msg})\n self.messages.append({"role": "assistant", "content": bot_reply})\n self.exchanges += 1\n\n def reset(self):\n self.messages = [{"role": "system", "content": self.system_prompt}]\n self.exchanges = 0\n print("Chat reset. System prompt preserved.")\n\nbot = SimpleChatbot("You are a helpful assistant.")\nbot.add_exchange("Hello", "Hi there!")\nbot.add_exchange("How are you?", "I am doing well!")\nprint(f"Before reset: {bot.exchanges} exchanges, {len(bot.messages)} messages")\nbot.reset()\nprint(f"After reset: {bot.exchanges} exchanges, {len(bot.messages)} messages")',
solutionExplanation: 'The reset method rebuilds the messages list with only the system prompt, sets exchanges to 0, and prints a confirmation. We stored the system_prompt separately in __init__ so we can recreate the initial state without losing the prompt text.',
xpReward: 15,
}
Common Mistakes and How to Fix Them
Mistake 1: Not Including Previous Messages in API Calls
This is the number one newbie mistake. Each API call must carry the full chat context.
Wrong:
python
# Each call is independent — no memory
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "What about the second point?"}]
)
Why it’s wrong: The model has no clue what “the second point” means. With no old messages, every call is a blank slate.
Correct:
python
# Include full history — model has context
messages.append({"role": "user", "content": "What about the second point?"})
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages # includes all previous exchanges
)
Output:
python
# The response now correctly references previous context
Mistake 2: Forgetting to Append the Assistant’s Reply
If you skip saving the bot’s reply, the model won’t know what it said last time.
Wrong:
python
def chat_broken(user_input, messages):
messages.append({"role": "user", "content": user_input})
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages
)
return response.choices[0].message.content
# BUG: assistant reply is never added to messages
Why it’s wrong: The next call shows two user messages in a row with no bot reply between them. The model gets lost in the broken flow.
Correct:
python
def chat_fixed(user_input, messages):
messages.append({"role": "user", "content": user_input})
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages
)
reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": reply}) # store the reply
return reply
Mistake 3: Ignoring Token Limits Until the API Errors
With no token check, long chats crash with a context length error.
Wrong:
python
# No limit checking — will eventually fail
def chat_no_limit(user_input, messages):
messages.append({"role": "user", "content": user_input})
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages # messages grow forever
)
return response.choices[0].message.content
Why it’s wrong: After too many messages, you blow past the model’s context window. The API throws an error and the bot dies mid-chat.
Correct:
python
# Trim old messages before each call
def chat_with_limit(user_input, messages, max_tokens=4000):
messages.append({"role": "user", "content": user_input})
while count_tokens(messages) > max_tokens and len(messages) > 2:
messages.pop(1) # remove oldest non-system message
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages
)
reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": reply})
return reply
Mistake 4: Hardcoding API Keys in Source Code
Wrong:
python
client = OpenAI(api_key="sk-abc123mykey456") # exposed in source code
Why it’s wrong: If this file hits GitHub — even for a minute — bots scrape the key fast. You’ll face rogue charges and a hacked account.
Correct:
python
from dotenv import load_dotenv
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Mistake 5: Not Handling API Errors
The API can fail for many reasons — rate limits, network drops, bad requests. With no error handling, your bot crashes.
Wrong:
python
# No error handling — any API failure crashes the chatbot
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages
)
Correct:
python
import openai
try:
response = client.chat.completions.create(
model="gpt-4o-mini", messages=messages
)
reply = response.choices[0].message.content
except openai.RateLimitError:
reply = "I'm getting too many requests right now. Please wait a moment."
except openai.APIConnectionError:
reply = "I can't reach the API. Please check your internet connection."
except openai.APIError as e:
reply = f"API error occurred: {e}"
Output:
python
# Chatbot gracefully handles errors instead of crashing
Frequently Asked Questions
How much does it cost to run an AI chatbot?
It depends on your model and chat length. GPT-4o-mini costs about $0.15 per million input tokens and $0.60 per million output tokens. A 20-message chat uses roughly 2,000 to 4,000 tokens. That’s under $0.01. For side projects, costs are near zero.
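The arithmetic above is easy to sketch in code. This is a rough estimator using the per-token rates quoted for GPT-4o-mini — check OpenAI's pricing page for current numbers, and note the input/output split (3,000 in, 1,000 out) is an illustrative assumption:

```python
# Rough cost estimate at the GPT-4o-mini rates quoted above
# ($0.15 per 1M input tokens, $0.60 per 1M output tokens)
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens, output_tokens):
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 20-message chat: assume ~3,000 input tokens and ~1,000 output tokens
cost = estimate_cost(3000, 1000)
print(f"Estimated cost: ${cost:.4f}")
```

Even generous estimates land well under a cent per conversation, which is why hobby projects rarely need to worry about cost.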
Can I use a different LLM instead of OpenAI?
Yes. The message format (system/user/assistant roles) is nearly identical across providers — Claude, Gemini, and Llama all follow a similar pattern. Swap the client setup, adjust for minor API differences, and your memory logic carries over unchanged.
How do I deploy my chatbot as a web app?
Wrap your bot in a web framework like FastAPI or Flask. Make an endpoint that takes user messages and sends back replies. Use session IDs (cookies or tokens) to keep one chat history per user. For the frontend, a simple HTML page with fetch calls works. Or use Streamlit for a quick demo.
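The core idea behind any of these deployments is the same: keep one history per session ID. Here is a framework-free sketch of that pattern — `FakeBot` is a hypothetical stand-in for the Chatbot class (it echoes instead of calling the API), so the session logic can be shown without network access:

```python
# Per-session memory: the pattern behind any web deployment of the chatbot.
# FakeBot is a stand-in for the Chatbot class — it echoes rather than call the API.
class FakeBot:
    def __init__(self):
        self.messages = []

    def chat(self, text):
        self.messages.append({"role": "user", "content": text})
        reply = f"Echo: {text}"
        self.messages.append({"role": "assistant", "content": reply})
        return reply

sessions = {}  # session_id -> bot instance, one conversation history per user

def handle_request(session_id, message):
    # Reuse the caller's bot if it exists; create one on first contact
    bot = sessions.setdefault(session_id, FakeBot())
    return bot.chat(message)

print(handle_request("alice", "Hello"))
print(handle_request("bob", "Hi"))
print(len(sessions["alice"].messages))  # alice's history is separate from bob's
```

In a real FastAPI or Flask app, `handle_request` becomes your route handler, the `session_id` comes from a cookie or token, and `FakeBot` is replaced by the Chatbot class from this tutorial. An in-memory dict loses state on restart — use the `save`/`load` methods or a database for persistence.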
What is the difference between conversation memory and RAG?
Chat memory stores what was said in the current session. RAG (Retrieval-Augmented Generation) pulls in outside facts from docs or databases. They solve different problems. Memory answers “what did we just talk about?” RAG answers “what does our policy say?” Many live bots use both.
How do I prevent my chatbot from generating harmful content?
Set rules in the system prompt: “Never give medical, legal, or money advice.” Add input checks to catch abuse. The OpenAI API has built-in content filters too. For live systems, add OpenAI’s Moderation API as a safety layer before showing replies.
Complete Code
References
- OpenAI API Documentation — Chat Completions. Link
- OpenAI API Documentation — Streaming. Link
- OpenAI Cookbook — How to count tokens with tiktoken. Link
- OpenAI Documentation — Best practices for prompt engineering. Link
- LangChain Documentation — Conversational Memory. Link
- tiktoken — OpenAI’s token counting library. Link
- python-dotenv Documentation. Link
- Pinecone — Conversational Memory for LLMs with LangChain. Link