Introduction to LLMs and the OpenAI API in Python

Written by Selva Prabhakaran | 21 min read


You type a question into ChatGPT, and back comes a paragraph that sounds like it was written by someone who actually knows the topic. What’s going on under the hood? And how do you tap into that same power from your own Python scripts?

That’s what we’ll work through here. By the end, you’ll understand what large language models actually do — and you’ll have real Python code that talks to the OpenAI API. We’ll cover text generation, multi-turn conversations, and streaming responses.

What Is a Large Language Model?

At its core, a large language model (LLM) is a next-word prediction engine. Feed it “The capital of France is” and it predicts “Paris” — because that’s the most likely word to come next.

The “large” part is about scale. GPT-4 has billions of parameters — numeric weights trained on a massive pile of text. Books, blog posts, code repos, forum threads — the model has absorbed patterns from all of it.

Here’s the part that surprises people: the model doesn’t know anything the way you do. What it has is a deep sense of which words tend to follow which other words, and in what context. When it gives you a correct answer, that’s because the right pattern showed up again and again in the training data.

KEY INSIGHT: An LLM doesn’t look up facts in a database. It predicts the most likely next token based on learned patterns. Once you grasp this, you can tell where it’ll do well (fluent, natural text) and where it’ll stumble (precise facts and figures).
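To make "next-word prediction from learned patterns" concrete, here's a toy sketch: count which word follows which in a tiny made-up corpus, then predict the most frequent follower. A real LLM learns billions of weights instead of a count table, but the prediction step is the same idea.

```python
from collections import Counter, defaultdict

# Toy "training data": the model only ever sees sequences of words.
corpus = "the capital of france is paris . the capital of italy is rome .".split()

# Count which word follows each word.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))      # "capital" — it followed "the" every time
print(predict_next("capital"))  # "of"
```

Notice the predictor has no idea what a capital *is*; it only knows what tends to come after "capital" in the text it saw. That's the pattern-matching at the heart of an LLM, just at a vastly smaller scale.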

How LLMs Work: Tokens, Transformers, and Attention

You don’t need to build a transformer from scratch to use the API. But a quick look at the basics will help you understand pricing, token limits, and why the model can “forget” things in a long conversation.

Tokens: The Model’s Alphabet

LLMs don’t read words — they read tokens. A token is a small chunk of text, typically 3–4 characters in English. “Understanding” splits into two tokens: “under” and “standing”. “Cat” stays as one.

Why should you care? Because tokens are what you pay for. Every API call has a token limit, and longer prompts burn through more of them.

Here’s how to count tokens before you send a request. OpenAI’s tiktoken library encodes text the same way the model does. Call encoding_for_model() to get the right tokenizer, then .encode() to turn your text into a list of token IDs.

python
import tiktoken

encoder = tiktoken.encoding_for_model("gpt-4o")
text = "What is machine learning?"
tokens = encoder.encode(text)
print(f"Text: {text}")
print(f"Tokens: {tokens}")
print(f"Token count: {len(tokens)}")

Five tokens for four words and a question mark: close to a one-to-one ratio. Don't count on that, though. Longer or unusual words often get split into multiple tokens.
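When you just need a ballpark figure without running the tokenizer, the common rule of thumb of roughly four characters per English token gives a quick offline estimate. This is a heuristic, not the real count — use tiktoken whenever precision matters:

```python
def rough_token_estimate(text: str) -> int:
    """Ballpark token count using the ~4-characters-per-token heuristic."""
    return max(1, round(len(text) / 4))

# 25 characters / 4 ≈ 6 — close to tiktoken's actual count of 5.
print(rough_token_estimate("What is machine learning?"))
```

Good enough for sanity checks like "will this prompt blow past the context window?", but not for billing math.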

The Transformer Architecture (30-Second Version)

The transformer is the neural network design behind every modern LLM. Here’s the quick version:

  • Embedding — Each token gets turned into a vector (a list of numbers) that captures its meaning. “King” and “queen” end up with vectors that sit close together; “king” and “bicycle” don’t.

  • Attention — This is the key idea. The model examines every token and works out which other tokens are relevant for the next prediction. Take “The cat sat on the mat because it was tired” — attention is what lets the model figure out that “it” refers to “cat,” not “mat.”

  • Feed-forward layers — Once attention has mapped out the relationships, small neural networks dig into each position to extract deeper patterns.

  • Repeat — The attention and feed-forward steps run dozens of times, stacked as layers. Each pass adds more understanding.

  • Prediction — A final layer scores every possible next token. The model either picks the top candidate or samples from the best ones, depending on the settings you choose.

UNDER THE HOOD: Attention looks at all input tokens at once, and its cost scales roughly with the square of input length. A 4,000-token prompt takes far more compute than a 1,000-token prompt. Shorter prompts save you real money.
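The final prediction step can be sketched in a few lines: turn a vector of scores (logits) into probabilities with softmax, then either take the top token (greedy) or sample. The scores below are made up for illustration, not real model output:

```python
import math
import random

# Made-up scores for four candidate next tokens after "The capital of France is".
logits = {"Paris": 5.0, "Lyon": 2.0, "London": 1.0, "banana": -3.0}

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

probs = softmax(logits)

# Greedy decoding: always pick the highest-probability token.
greedy = max(probs, key=probs.get)
print(greedy)  # "Paris"

# Sampling: pick randomly, weighted by probability — occasionally "Lyon".
sampled = random.choices(list(probs), weights=list(probs.values()))[0]
print(sampled)
```

The sampling branch is why two identical API calls can produce different wording — something you'll see firsthand in a moment.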

Now that you have the mental model, let’s get your environment ready and make a real API call.

Setting Up the OpenAI Python SDK

You need three things before writing any code: Python on your machine, the OpenAI package, and an API key.

Prerequisites

  • Python version: 3.9+

  • Required library: openai (1.0+)

  • Install: pip install openai tiktoken

  • Time to complete: 15–20 minutes

Getting Your API Key

Go to platform.openai.com and create a new API key. Copy it the moment it appears — you won’t be able to see it again.

Save it as an environment variable. Never hard-code keys into your scripts.

shell
# On macOS/Linux
export OPENAI_API_KEY="sk-your-key-here"

# On Windows (Command Prompt)
set OPENAI_API_KEY=sk-your-key-here

# On Windows (PowerShell)
$env:OPENAI_API_KEY="sk-your-key-here"

You can also drop the key into a .env file and load it with python-dotenv:

python
# .env file (add this to .gitignore!)
# OPENAI_API_KEY=sk-your-key-here

from dotenv import load_dotenv
load_dotenv()  # loads .env into environment variables

WARNING: Never commit API keys to version control. Put .env in your .gitignore. If a key leaks, anyone can rack up charges on your account.

The SDK picks up the OPENAI_API_KEY variable on its own, so you don’t need to pass it in code:

python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from environment

That’s all it takes. The client object is your gateway to every OpenAI model.

Your First API Call and the Response Object

Here’s the moment you’ve been building up to. Call client.chat.completions.create(), give it a model name and a list of messages, and you get back a completion.

python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What is Python used for?"}
    ]
)

print(response.choices[0].message.content)

Run this twice and you’ll get slightly different wording each time — that’s the nature of LLMs. Let’s break down the pieces:

  • model="gpt-4o-mini" — A fast, cheap model. Great for learning.

  • messages — A list of dicts, each with a role and content.

  • response.choices[0] — The API can return more than one reply. We grab the first.

  • .message.content — The actual text the model produced.

There’s more hiding in the response object. Let’s pull it apart to see what you’re getting — and what you’re paying for.

python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in French."}]
)

print(f"Model used: {response.model}")
print(f"Response text: {response.choices[0].message.content}")
print(f"Finish reason: {response.choices[0].finish_reason}")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

  • response.model — The exact model version that handled your request.
  • choices[0].message.content — The text it wrote.
  • choices[0].finish_reason — Why it stopped: "stop" (it was done) or "length" (it hit the token cap).
  • usage.prompt_tokens — Tokens in your input; you pay for these.
  • usage.completion_tokens — Tokens the model wrote; you pay for these too.
  • usage.total_tokens — The sum of both; this is what shows up on your bill.

KEY INSIGHT: You pay for tokens going in and coming out, and output tokens are typically priced higher per million than input tokens. So keep your prompts tight, and use max_tokens to cap the output when you don’t need a long answer.
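You can turn those usage numbers into a dollar figure. The function below assumes gpt-4o-mini's published rates of $0.15 per million input tokens and $0.60 per million output tokens — check the pricing page before relying on these, since rates change:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_per_million: float = 0.15,
                  output_per_million: float = 0.60) -> float:
    """Dollar cost of one call at the given per-million-token rates."""
    return (prompt_tokens * input_per_million
            + completion_tokens * output_per_million) / 1_000_000

# A 500-token prompt with a 500-token reply...
print(f"${estimate_cost(500, 500):.6f}")  # $0.000375
# ...costs more than a 900-token prompt with a 100-token reply,
# because output tokens carry the higher rate.
print(f"${estimate_cost(900, 100):.6f}")  # $0.000195
```

Wire this up to response.usage.prompt_tokens and response.usage.completion_tokens and you can log the cost of every call.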

Quick check: What would finish_reason be if you set max_tokens=5 on a question that needs a whole paragraph? (Answer: "length" — the model ran out of room before it could finish.)

Messages, Roles, and Controlling the Model

The messages list is where the real power lives. Each message carries a role that tells the model who’s talking. You’ll use three:

system — Think of this as stage directions. It sets the model’s tone, rules, and persona. The user never sees it, but the model follows it through the whole conversation.

user — That’s you (or your end user). The question, instruction, or prompt you want a response to.

assistant — The model’s own prior replies. You include these when you’re building a multi-turn conversation, so the model knows what it already said.

Watch how a system message changes the reply. Here we tell the model to act as a Python tutor who keeps answers short.

python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a Python tutor. Give short, clear answers in 2-3 sentences max."
        },
        {
            "role": "user",
            "content": "What's the difference between a list and a tuple?"
        }
    ]
)

print(response.choices[0].message.content)

Without the system message you’d get a long, generic answer. With it, the model stays focused and brief.

TIP: The system message is your strongest lever for output quality. Something like “You are a data analyst. Respond with Python code using pandas. No prose unless asked.” beats a vague prompt every time.

Key Parameters: temperature, max_tokens, and top_p

The create() method has a few parameters that shape how the model writes. Three matter most.

temperature dials randomness up or down. It goes from 0 to 2. Low values make the output focused and repeatable. High values make it wilder and more creative.

python
response_low = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Name a fruit."}],
    temperature=0
)
print(f"temp=0: {response_low.choices[0].message.content}")

response_high = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Name a fruit."}],
    temperature=1.5
)
print(f"temp=1.5: {response_high.choices[0].message.content}")

At temperature=0, you’ll keep getting the same fruit. At 1.5, you might see “Rambutan” or “Persimmon.”

A good rule of thumb: temperature=0 for factual work (code, data extraction) and 0.7–1.0 for creative work (writing, brainstorming).

max_tokens puts a hard cap on the model’s output length. If the model needs more room than you gave it, it stops mid-sentence — and finish_reason comes back as "length" instead of "stop".

top_p (nucleus sampling) is another knob for randomness. It tells the model to only consider the top P% of likely tokens. top_p=0.1 means it picks from just the top 10%. Use temperature or top_p, not both at once — OpenAI recommends changing one and leaving the other alone.
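Here's what the two knobs do to a toy probability distribution: temperature divides the scores before softmax (low values sharpen the distribution, high values flatten it), and top_p keeps only the smallest set of tokens whose probabilities add up to P. The numbers are made up for illustration:

```python
import math

# Made-up scores for candidate next tokens after "Name a fruit."
logits = {"apple": 3.0, "banana": 2.0, "kiwi": 1.0, "rambutan": 0.0}

def softmax_with_temperature(scores, temperature=1.0):
    """Divide scores by temperature, then normalize to probabilities."""
    exps = {t: math.exp(s / temperature) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def nucleus(probs, top_p=0.9):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=1.5)
print(max(sharp.values()) > max(flat.values()))  # True: low temperature concentrates mass

# With top_p=0.8, only the top candidates survive — "rambutan" never gets picked.
print(list(nucleus(softmax_with_temperature(logits), top_p=0.8)))  # ['apple', 'banana']
```

This is why temperature=0 keeps giving you the same fruit: nearly all the probability mass piles onto one token, and greedy selection does the rest.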

Building Multi-Turn Conversations

A single API call has no memory. The model doesn’t know what you asked five seconds ago. To have a real conversation, you send the entire chat history in the messages list every time you call the API.

The pattern is simple: after each reply, append the model’s response to your list, then add the new user message, then call the API again with the full list.

python
conversation = [
    {
        "role": "system",
        "content": "You are a helpful cooking assistant."
    },
    {
        "role": "user",
        "content": "How do I make scrambled eggs?"
    }
]

# First turn
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=conversation
)
assistant_reply = response.choices[0].message.content
print(f"Assistant: {assistant_reply}\n")

# Add the assistant's reply to history
conversation.append({"role": "assistant", "content": assistant_reply})

# Second turn — the model now has context
conversation.append({"role": "user", "content": "What cheese goes best with that?"})

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=conversation
)
print(f"Assistant: {response.choices[0].message.content}")

The second reply makes sense because the model can see the whole conversation. It knows “that” means scrambled eggs. Strip out the history, and the model has no idea what “that” refers to — you’d just get a generic cheese answer. This is one of the trickiest parts of the API: you own the conversation state.

Predict the output: What if you skip adding the assistant message before the second call? The model loses the thread. It doesn’t know what dish you’re talking about, so you get a vague answer instead of one tailored to eggs.

WARNING: Every message in the history costs tokens. A 50-turn conversation re-sends all 50 messages on every call. For long conversations, you’ll eventually need to summarize older messages or trim the history to stay under the token limit.
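One common trimming strategy: always keep the system message, and drop the oldest user/assistant turns until the estimated total fits. A minimal sketch, using a caller-supplied token counter — the default here is a crude characters-divided-by-four stand-in, so swap in a real tiktoken count for production:

```python
def trim_history(conversation, max_tokens,
                 count_tokens=lambda msg: len(msg["content"]) // 4 + 4):
    """Drop oldest non-system messages until the estimated total fits the budget."""
    system = [m for m in conversation if m["role"] == "system"]
    turns = [m for m in conversation if m["role"] != "system"]
    while turns and sum(count_tokens(m) for m in system + turns) > max_tokens:
        turns.pop(0)  # drop the oldest turn first
    return system + turns

conversation = [
    {"role": "system", "content": "You are a helpful cooking assistant."},
    {"role": "user", "content": "How do I make scrambled eggs?"},
    {"role": "assistant", "content": "Whisk eggs, cook low and slow, stir often."},
    {"role": "user", "content": "What cheese goes best with that?"},
]

trimmed = trim_history(conversation, max_tokens=30)
print([m["role"] for m in trimmed])  # system message and newest turn survive
```

The trade-off: trimming loses context, so the model may forget what "that" refers to — summarizing the dropped turns into a single message preserves more meaning at the cost of an extra API call.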

Streaming and Error Handling

Two more things to learn before you build anything real: streaming (so your app feels responsive) and error handling (so it doesn’t crash the first time something goes wrong).

Streaming Responses

Normally, the API waits until the model writes the entire response and then sends it all at once. For long replies, that pause feels slow. Streaming fixes this — tokens arrive as the model produces them, the same way ChatGPT shows text appearing on screen.

Turn it on with stream=True, then loop through the chunks.

python
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Write a haiku about Python programming."}
    ],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end="", flush=True)

print()  # newline at the end

Notice we read delta.content instead of message.content. Each chunk carries only the new tokens since the last one. Some chunks carry no text at all (the first sets the role, the last reports the finish reason), so we skip those.

The cost is the same whether you stream or not — same tokens, same bill. But the user experience is night and day. Text that flows onto the screen word by word feels instant. A three-second blank stare followed by a wall of text feels broken. For anything user-facing, always stream.
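You'll usually want the full text at the end too, for example to append it to the conversation history. The accumulation pattern is plain string concatenation over the deltas — sketched here with mock strings standing in for the real chunk.choices[0].delta.content values:

```python
# Mock deltas standing in for a real stream; a real stream also
# yields None for the role-setting and final metadata chunks.
mock_deltas = [None, "Code ", "flows ", "like ", "water", None]

full_text = ""
for content in mock_deltas:
    if content is not None:
        print(content, end="", flush=True)  # show it as it arrives
        full_text += content                # and keep the complete reply
print()

# full_text now holds the whole reply, ready to append to the
# conversation list as an assistant message.
print(full_text)  # Code flows like water
```

This matters for multi-turn chat: the streamed reply must still go back into the history, and delta chunks are the only place you can collect it from.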

Handling API Errors

The OpenAI SDK throws specific error types you can catch. Here are the four you’ll run into:

python
from openai import (
    OpenAI,
    AuthenticationError,
    RateLimitError,
    APIConnectionError,
    BadRequestError
)

client = OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}]
    )
    print(response.choices[0].message.content)

except AuthenticationError:
    print("Invalid API key. Check your OPENAI_API_KEY.")

except RateLimitError:
    print("Rate limit hit. Wait a moment and retry.")

except APIConnectionError:
    print("Can't reach OpenAI servers. Check your internet.")

except BadRequestError as e:
    print(f"Bad request: {e}")

  • AuthenticationError — Wrong or expired API key. Fix: create a new key at platform.openai.com.
  • RateLimitError — Too many requests per minute. Fix: add retry logic with backoff.
  • APIConnectionError — Network issue or OpenAI outage. Fix: check your connection, wait a few seconds, try again.
  • BadRequestError — Bad model name or too many tokens. Fix: double-check your parameters against the docs.

For production code, wire up automatic retries. The tenacity library makes this painless:

python
# pip install tenacity
from tenacity import retry, wait_exponential, stop_after_attempt

@retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(3))
def call_openai(messages):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )

This tries up to 3 times, waiting longer each round (1s, 2s, 4s… capped at 60s). Most rate limit errors resolve on their own within seconds.
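If you'd rather not add a dependency, the same idea is a dozen lines of plain Python: catch the error, sleep an exponentially growing interval, and re-raise after the last attempt. Shown here with a generic exception class and a fake flaky function rather than the SDK's RateLimitError:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, max_delay=60.0,
                 retry_on=(Exception,)):
    """Call fn(), retrying with exponential backoff on the given errors."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(min(base_delay * 2 ** attempt, max_delay))

# Demo with a function that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

result = with_retries(flaky, base_delay=0.01)
print(result)  # "ok" — succeeded on the third attempt
```

In real code you'd pass retry_on=(RateLimitError, APIConnectionError) so that genuine bugs like BadRequestError fail fast instead of retrying pointlessly.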

Choosing the Right Model

OpenAI has several models to pick from. The right choice depends on what you’re building, how much you want to spend, and how fast you need results.

  • gpt-4o-mini — Prototyping, simple tasks, high volume. Fast, lowest cost, 128K-token context window.
  • gpt-4o — Complex reasoning, coding, analysis. Medium speed, mid-range cost, 128K-token context window.
  • gpt-4.1 — Production apps, nuanced tasks. Medium speed, higher cost, 1M-token context window.
  • gpt-4.1-mini — Good balance of speed and cost. Fast, low cost, 1M-token context window.

While you’re learning, stick with gpt-4o-mini. It’s cheap, fast, and more than enough for most everyday tasks. Move to a larger model when you need deeper reasoning or a bigger context window.

OpenAI also offers embeddings (turning text into numeric vectors for similarity search) and image generation (DALL-E). These use different endpoints, but the SDK and auth setup are the same. We’ll cover them in future articles.

NOTE: In 2025, OpenAI introduced the Responses API as a simpler way to interact with models. Here’s how it stacks up against Chat Completions:

  • Input format — Chat Completions: messages array with roles required. Responses API: an input string or messages.
  • Output access — Chat Completions: response.choices[0].message.content. Responses API: response.output_text.
  • Built-in tools — Chat Completions: none. Responses API: web search, file search, code interpreter.
  • Multi-turn state — Chat Completions: manual (pass the full history). Responses API: handled for you with store=True.
  • Best for — Chat Completions: learning, existing codebases. Responses API: new projects, agentic workflows.

Start with Chat Completions — it’s what most tutorials, frameworks like LangChain, and production systems use today. But keep an eye on the Responses API; it’s where the platform is headed.

WARNING: When NOT to use the OpenAI API. The API isn’t always the right tool. Think twice when: (1) you need under 100ms latency — API round trips alone add 500ms to 2 seconds; (2) your data must stay on your own servers — look at self-hosted models like Llama or Mistral; (3) you need byte-for-byte identical output every time — even at temperature=0, results can drift between API versions; (4) you’re making millions of calls a day — at that scale, hosting your own model might be cheaper.

Common Mistakes and How to Fix Them

Mistake 1: Forgetting to Set the API Key

python
# ❌ Wrong — no API key configured
from openai import OpenAI
client = OpenAI(api_key="")  # empty string
Running this raises:

AuthenticationError: No API key provided.
python
# ✅ Correct — set the environment variable first
# export OPENAI_API_KEY="sk-your-actual-key"
from openai import OpenAI
client = OpenAI()  # reads from environment

Mistake 2: Using the Old API Syntax

The OpenAI Python SDK got a full rewrite at version 1.0. If you copy code from a tutorial written before late 2023, it won’t work.

python
# ❌ Old syntax (pre v1.0) — this will error
import openai
openai.api_key = "sk-..."
response = openai.ChatCompletion.create(...)
python
# ✅ Current syntax (v1.0+)
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(...)

If you see AttributeError: module 'openai' has no attribute 'ChatCompletion', that’s the tell — old code, new SDK.

Mistake 3: Ignoring Token Limits

Try to send a 200,000-token prompt to a model with a 128K context window and you’ll get a BadRequestError. Always count your tokens first.

python
# ✅ Check token count before sending
import tiktoken

def count_tokens(messages, model="gpt-4o-mini"):
    """Approximate token count for a messages list."""
    encoder = tiktoken.encoding_for_model(model)
    total = 0
    for msg in messages:
        # +4 accounts for message formatting overhead
        total += len(encoder.encode(msg["content"])) + 4
    return total

Mistake 4: Not Handling Streaming Right

python
# ❌ Wrong — treating stream like a regular response
stream = client.chat.completions.create(..., stream=True)
print(stream.choices[0].message.content)  # AttributeError!
python
# ✅ Correct — iterate over chunks
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Real-World Example: Building a Code Review Assistant

Let’s pull everything together into something you can actually use. This small tool takes a Python function, sends it to GPT, and gets back a code review.

The review_code() function uses a system message to put the model in “senior Python developer” mode. It passes the code as the user message, sets temperature=0.3 so the feedback stays focused, and limits output to 500 tokens.

python
from openai import OpenAI

client = OpenAI()

def review_code(code_snippet):
    """Send a code snippet to GPT for review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a senior Python developer. "
                    "Review the code for bugs, style issues, "
                    "and performance. Be specific and concise."
                )
            },
            {
                "role": "user",
                "content": f"Review this Python code:\n\n{code_snippet}"
            }
        ],
        temperature=0.3,
        max_tokens=500
    )
    return response.choices[0].message.content

Let’s test it on a function with a few problems on purpose — it uses range(len()) instead of direct iteration, has no guard for an empty list, and ignores Python’s built-in sum():

python
sample_code = """
def get_avg(numbers):
    total = 0
    for i in range(len(numbers)):
        total = total + numbers[i]
    avg = total / len(numbers)
    return avg
"""

review = review_code(sample_code)
print(review)

The model typically catches the real issues: the un-Pythonic loop, the missing empty-list guard, the built-in sum() going unused. Twenty lines of code, and you’ve got a working review tool. That’s what happens when you combine a clear system message, the right temperature, and a focused prompt.

Summary

Here’s what you’ve learned:

  • LLMs predict the next token using patterns from training data. They don’t “understand” things the way we do.

  • Tokens are the unit of work (and cost). Every model has a context window limit.

  • The OpenAI Python SDK gives you a clean interface through client.chat.completions.create().

  • Messages use three roles: system (sets the rules), user (your input), assistant (past replies).

  • Temperature controls how random or focused the output is.

  • Multi-turn conversations require you to send the full chat history each time.

  • Streaming delivers tokens as they’re produced, making your app feel faster.

  • Error handling with retries is a must for production code.

Practice Exercise

Build a command-line chatbot that keeps track of the conversation. The user types a message, the bot replies, and the history carries over from turn to turn.

python
from openai import OpenAI

client = OpenAI()

conversation = [
    {"role": "system", "content": "You are a helpful assistant. Keep answers concise."}
]

print("Chatbot ready! Type 'quit' to exit.\n")

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break

    conversation.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation
    )

    reply = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": reply})
    print(f"Bot: {reply}\n")

This exercise puts multi-turn conversation management to the test — the core pattern from this article. The conversation list grows with each turn, giving the model the full context it needs.

Complete Code

python
# Complete code from: Introduction to LLMs and the OpenAI API in Python
# Requires: pip install openai tiktoken
# Python 3.9+

from openai import OpenAI
import tiktoken

# --- Section 1: Token Counting ---
encoder = tiktoken.encoding_for_model("gpt-4o")
text = "What is machine learning?"
tokens = encoder.encode(text)
print(f"Text: {text}")
print(f"Tokens: {tokens}")
print(f"Token count: {len(tokens)}")

# --- Section 2: First API Call ---
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What is Python used for?"}
    ]
)
print(f"\nFirst call: {response.choices[0].message.content}")

# --- Section 3: Response Object ---
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in French."}]
)
print(f"\nModel: {response.model}")
print(f"Text: {response.choices[0].message.content}")
print(f"Finish reason: {response.choices[0].finish_reason}")
print(f"Total tokens: {response.usage.total_tokens}")

# --- Section 4: System Message ---
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a Python tutor. Give short, clear answers in 2-3 sentences max."
        },
        {
            "role": "user",
            "content": "What's the difference between a list and a tuple?"
        }
    ]
)
print(f"\nWith system message: {response.choices[0].message.content}")

# --- Section 5: Temperature ---
response_low = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Name a fruit."}],
    temperature=0
)
print(f"\ntemp=0: {response_low.choices[0].message.content}")

# --- Section 6: Multi-Turn Conversation ---
conversation = [
    {"role": "system", "content": "You are a helpful cooking assistant."},
    {"role": "user", "content": "How do I make scrambled eggs?"}
]

response = client.chat.completions.create(
    model="gpt-4o-mini", messages=conversation
)
assistant_reply = response.choices[0].message.content
print(f"\nAssistant: {assistant_reply}")

conversation.append({"role": "assistant", "content": assistant_reply})
conversation.append({"role": "user", "content": "What cheese goes best with that?"})

response = client.chat.completions.create(
    model="gpt-4o-mini", messages=conversation
)
print(f"Assistant: {response.choices[0].message.content}")

# --- Section 7: Streaming ---
print("\nStreaming response:")
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Write a haiku about Python programming."}
    ],
    stream=True
)
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end="", flush=True)
print()

# --- Section 8: Code Review Assistant ---
def review_code(code_snippet):
    """Send a code snippet to GPT for review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a senior Python developer. "
                    "Review the code for bugs, style issues, "
                    "and performance. Be specific and concise."
                )
            },
            {
                "role": "user",
                "content": f"Review this Python code:\n\n{code_snippet}"
            }
        ],
        temperature=0.3,
        max_tokens=500
    )
    return response.choices[0].message.content

sample_code = """
def get_avg(numbers):
    total = 0
    for i in range(len(numbers)):
        total = total + numbers[i]
    avg = total / len(numbers)
    return avg
"""

print(f"\nCode Review:\n{review_code(sample_code)}")

print("\nScript completed successfully.")

Frequently Asked Questions

How much does the OpenAI API cost?

It depends on the model and how many tokens you use. As of 2025, gpt-4o-mini runs about $0.15 per million input tokens and $0.60 per million output tokens. A typical learning session — dozens of calls — costs just a few cents. Check openai.com/pricing for the latest numbers.

What’s the difference between ChatGPT and the OpenAI API?

ChatGPT is a product with a web interface. The API gives you direct access to the same models from code. With the API, you control the system message, tweak parameters, manage the conversation flow, and wire it all into your own apps. You can’t do any of that through the ChatGPT UI.

Do I need a GPU to use the OpenAI API?

No. All the heavy work runs on OpenAI’s servers. Your machine just sends text and gets text back. Anything that can run Python and make HTTP requests will do.

Can I use other LLM providers with the same code?

Yes, many of them. Anthropic, Google, Mistral, and local models (via Ollama) all offer APIs that mirror OpenAI’s format. Often you just swap the base URL and key. Libraries like LiteLLM let you talk to all of them through one interface.

References

  • OpenAI Platform — API Reference: Chat Completions.
  • OpenAI Platform — Developer Quickstart.
  • OpenAI — Responses API vs Chat Completions Guide.
  • Vaswani, A. et al. — “Attention Is All You Need.” NeurIPS 2017.
  • OpenAI Python SDK — GitHub Repository.
  • OpenAI — Tokenizer and tiktoken Library.
  • OpenAI — Model Pricing.
  • OpenAI — API Data Usage Policies.