Machine Learning

Build a Custom Scikit-Learn Regression Model: Step-by-Step Guide

Creating custom regressors in scikit-learn means building your own machine learning models that follow scikit-learn’s API conventions, allowing them to work seamlessly with pipelines, grid search, and all other scikit-learn tools. Ever hit a wall where existing scikit-learn regressors just don’t fit your specific problem? Maybe you need a model that minimizes a custom loss […]
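
As a rough illustration of what "following scikit-learn's API conventions" can look like, here is a minimal sketch. `MedianRegressor` is a hypothetical name invented for this example, not something from the article. It simply predicts the median of the training targets, which is the constant that minimizes mean absolute error, and it assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MedianRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical example: always predicts the median of the training targets."""

    def fit(self, X, y):
        # Validate inputs the same way built-in estimators do.
        X, y = check_X_y(X, y)
        self.n_features_in_ = X.shape[1]
        # Attributes learned during fit end with an underscore by convention.
        self.median_ = float(np.median(y))
        return self  # returning self allows chaining such as .fit(X, y).predict(X)

    def predict(self, X):
        check_is_fitted(self, "median_")
        X = check_array(X)
        return np.full(X.shape[0], self.median_)


# Toy data, made up for illustration only.
X_demo = np.array([[0.0], [1.0], [2.0]])
y_demo = np.array([1.0, 2.0, 100.0])

# Because the class follows the estimator conventions, it can sit in a pipeline.
pipe = make_pipeline(StandardScaler(), MedianRegressor())
pipe.fit(X_demo, y_demo)
print(pipe.predict(X_demo))
```

Inheriting from `BaseEstimator` provides `get_params` and `set_params`, which tools like grid search rely on, and `RegressorMixin` supplies a default `score` method.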

Mutual information vs Cross Entropy

Cross-entropy is a measure of error, while mutual information measures the shared information between two variables. Both concepts are used in information theory, but they serve different purposes and are applied in different contexts. Let’s understand both in complete detail. Cross-Entropy Cross-entropy measures the difference between two probability distributions. Specifically, it quantifies the amount of additional
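
To make the contrast concrete, here is a small sketch using the standard textbook definitions: cross-entropy H(p, q) = -Σ p(x) log q(x) compares two distributions over the same outcomes, while mutual information I(X; Y) is computed from a joint distribution of two variables. The function names and toy numbers are made up for illustration, and natural logarithms (nats) are an assumption.

```python
import math


def entropy(p):
    """Shannon entropy of a discrete distribution, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) between a true distribution p and a model q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_information(joint):
    """Mutual information I(X; Y) from a joint probability table (rows: X, columns: Y)."""
    px = [sum(row) for row in joint]        # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]  # marginal distribution of Y
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


p = [0.7, 0.2, 0.1]  # toy "true" distribution
q = [0.5, 0.3, 0.2]  # toy model distribution
print(cross_entropy(p, q), entropy(p))

independent = [[0.25, 0.25], [0.25, 0.25]]  # X tells us nothing about Y
dependent = [[0.5, 0.0], [0.0, 0.5]]        # X determines Y exactly
print(mutual_information(independent), mutual_information(dependent))
```

Cross-entropy is never smaller than the entropy of `p` and equals it only when `q` matches `p`, whereas mutual information is zero exactly when the two variables are independent.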

Bayesian Optimization for Hyperparameter Tuning – Clearly explained.

Bayesian Optimization is a method for optimizing ‘expensive-to-evaluate’ functions and is particularly useful for hyperparameter tuning of machine learning models. Let’s understand how it works and the math behind it in full detail. Overview of Bayesian Optimization Bayesian optimization for hyperparameter tuning involves the following steps. We will break it down to simple details after

KL Divergence – What is it and mathematical details explained

At its core, KL (Kullback-Leibler) Divergence is a statistical measure that quantifies the dissimilarity between two probability distributions. Think of it like a mathematical ruler that tells us the “distance” or difference between two probability distributions. Remember, in data science, we’re often working with probabilities – the chances of events happening. So, if we have
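
A quick sketch of the standard discrete formula, D_KL(P || Q) = Σ P(x) log(P(x) / Q(x)), helps show why “distance” belongs in quotes: the value changes when P and Q swap places. The function name and the two toy distributions are made up for illustration, and natural logarithms are an assumption.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions over the same outcomes, in nats.

    Assumes q[i] > 0 wherever p[i] > 0; otherwise the divergence is infinite.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


fair = [0.5, 0.5]    # a fair coin
biased = [0.9, 0.1]  # a heavily biased coin

print(kl_divergence(fair, fair))    # identical distributions give zero
print(kl_divergence(fair, biased))  # differs from the next line: KL is not symmetric
print(kl_divergence(biased, fair))
```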

Cook’s Distance for Detecting Influential Observations

Cook’s distance measures the influence exerted by each observation on the trained model. It is computed by building a regression model and is therefore affected only by the X variables included in the model. What is Cook’s Distance? Cook’s distance measures the influence exerted by each data point (row /
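
For a sense of how this is computed, here is a minimal NumPy sketch for ordinary least squares, using the common textbook form D_i = e_i² / (p · s²) · h_ii / (1 − h_ii)², where e_i is the residual, h_ii the leverage, p the number of fitted coefficients and s² the residual mean squared error. The function name and the toy data are assumptions made for illustration.

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance of every row for an OLS fit with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    n, p = design.shape

    # Hat matrix: maps observed y to fitted values; its diagonal is the leverage.
    hat = design @ np.linalg.inv(design.T @ design) @ design.T
    leverage = np.diag(hat)

    residuals = y - hat @ y
    mse = residuals @ residuals / (n - p)  # residual variance estimate s^2

    return residuals**2 / (p * mse) * leverage / (1.0 - leverage) ** 2


# Toy data: a perfect line, except the last row is pushed far off it.
x = np.arange(10.0)
y = 2.0 * x + 1.0
y[-1] += 20.0

distances = cooks_distance(x.reshape(-1, 1), y)
print(np.round(distances, 3))
print("most influential row:", int(np.argmax(distances)))
```

Rows with both a large residual and high leverage get the largest values, which is why the shifted end point stands out here.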

MICE imputation – How to predict missing values using machine learning in Python

MICE Imputation, short for ‘Multiple Imputation by Chained Equations’, is an advanced missing data imputation technique that trains machine learning models over multiple iterations to predict the missing values, using known values from other features in the data as predictors. What is MICE Imputation? You can impute missing values by predicting them using other

Spline Interpolation – How to find the polynomial curve to interpolate missing values

Spline interpolation is a special type of interpolation where a piecewise lower-order polynomial, called a spline, is fitted to the data points. That is, instead of fitting one higher-order polynomial (as in polynomial interpolation), multiple lower-order polynomials are fitted on smaller segments. This can be implemented in Python. You can do non-linear spline interpolation

Interpolation in Python – How to interpolate missing data, formula and approaches

Interpolation can be used to impute missing data. Let’s see the formula and how to implement it in Python. But you need to be careful with this technique and understand whether it is a valid choice for your data. Often, interpolation is applicable when the data is in a sequence or

Missing Data Imputation Approaches | How to handle missing values in Python

Machine Learning works on the idea of garbage in – garbage out. If you put in useless junk data to the machine learning algorithm, the results will also be, well, ‘junk’. The quality and consistency of results depend on the data provided. Missing values in data degrade the quality. Why clean the data before training

Exploratory Data Analysis (EDA) – How to do EDA for Machine Learning Problems using Python

Exploratory Data Analysis, simply referred to as EDA, is the step where you understand the data in detail. You understand each variable individually by calculating frequency counts, visualizing the distributions, etc. You also examine the relationships between the various combinations of the predictor and response variables by creating scatterplots, correlations, etc. EDA is typically part of every

AdaBoost – An Introduction to AdaBoost

AdaBoost is one of the earliest implementations of the boosting algorithm. It forms the base of other boosting algorithms, like gradient boosting and XGBoost. This tutorial will take you through the math behind implementing this algorithm and also a practical example of using the scikit-learn AdaBoost API. Contents: What is boosting? What is AdaBoost? Algorithm

How to formulate machine learning problem

Let’s understand how to define and formulate the machine learning problem (for predictive modeling) from a business problem. This structured approach should help you apply the process to most other types of predictive modeling problems at work. Introduction Often in ML teams, you will hear from the business/company departments about the problems and issues they
