
Selva Prabhakaran

Selva is an experienced Data Scientist and leader, specializing in executing AI projects for large companies. Selva started machinelearningplus to make Data Science / ML / AI accessible to everyone. The website enjoys 4 Million+ readership. His courses, lessons, and videos are loved by hundreds of thousands of students and practitioners.

Spline Interpolation

Spline Interpolation – How to find the polynomial curve to interpolate missing values

Spline interpolation is a special type of interpolation where a piecewise, lower-order polynomial called a spline is fitted to the data points. That is, instead of fitting one higher-order polynomial (as in polynomial interpolation), multiple lower-order polynomials are fitted on smaller segments. This can be implemented in Python. You can do non-linear spline interpolation […]

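As a rough illustration of the piecewise idea in the excerpt above, here is a minimal sketch that assumes SciPy's `CubicSpline` as the implementation (the excerpt does not say which library the article uses). The sample points and the "missing" position `x = 3.0` are made up for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical known observations; the value at x = 3.0 is treated as missing.
x_known = np.array([0.0, 1.0, 2.0, 4.0, 5.0])
y_known = x_known ** 2

# Fit one cubic polynomial per segment between neighbouring known points.
spline = CubicSpline(x_known, y_known)

# Evaluate the fitted spline at the missing position.
y_missing = float(spline(3.0))
print(y_missing)
```

The fitted curve passes through every known point, so only the gaps are estimated.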

Interpolation in Python

Interpolation in Python – How to interpolate missing data, formula and approaches

Interpolation can be used to impute missing data. Let’s see the formula and how to implement it in Python. But you need to be careful with this technique and try to really understand whether or not this is a valid choice for your data. Often, interpolation is applicable when the data is in a sequence or

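The excerpt above is cut off before it reaches the formula, so the following is only a sketch of the simplest case, linear interpolation, where a missing point is placed on the straight line between its nearest known neighbours: y = y0 + (x - x0) * (y1 - y0) / (x1 - x0). The helper name `linear_interpolate` and the sample series are hypothetical, and the position in the list is assumed to play the role of x.

```python
def linear_interpolate(values):
    """Fill None gaps that lie between two known values; leave edge gaps alone."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    # Walk over each pair of neighbouring known positions.
    for left, right in zip(known, known[1:]):
        y0, y1 = filled[left], filled[right]
        for i in range(left + 1, right):
            # Straight-line formula with the list index as x.
            filled[i] = y0 + (i - left) * (y1 - y0) / (right - left)
    return filled


series = [10.0, None, None, 16.0, None, 20.0]
result = linear_interpolate(series)
print(result)  # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
```

Gaps at the very start or end of the sequence stay as `None` here, because there is no second neighbour to draw a line to.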

Missing Data Imputation Approaches

Missing Data Imputation Approaches | How to handle missing values in Python

Machine Learning works on the idea of garbage in – garbage out. If you put in useless junk data to the machine learning algorithm, the results will also be, well, ‘junk’. The quality and consistency of results depend on the data provided. Missing values in data degrade the quality. Why clean the data before training


EDA

Exploratory Data Analysis (EDA) – How to do EDA for Machine Learning Problems using Python

Exploratory Data Analysis, simply referred to as EDA, is the step where you understand the data in detail. You understand each variable individually by calculating frequency counts, visualizing the distributions, etc. You also examine the relationships between the various combinations of the predictor and response variables by creating scatterplots, correlations, etc. EDA is typically part of every


How to reduce the memory size of Pandas Data frame

After importing with pandas read_csv(), dataframes tend to occupy more memory than needed. This is the default behavior in Pandas, in order to ensure all data is read properly. It’s possible to optimize that, because the lighter the dataframe, the faster the operations you run on it later will be. So, let’s first check how much

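One common way to do this, sketched here under assumptions, is to measure the frame with `memory_usage(deep=True)` and then downcast numeric columns with `pd.to_numeric(..., downcast=...)`. The excerpt does not show which approach the article takes, and the small frame below is built by hand (with made-up column names) rather than loaded with read_csv().

```python
import pandas as pd

# Hypothetical data standing in for a freshly imported CSV.
df = pd.DataFrame({
    "count": range(1000),
    "price": [i * 0.5 for i in range(1000)],
})

# Total bytes used before any optimization.
before = df.memory_usage(deep=True).sum()

# Downcast each numeric column to the smallest dtype that still holds its values.
slim = df.copy()
slim["count"] = pd.to_numeric(slim["count"], downcast="integer")
slim["price"] = pd.to_numeric(slim["price"], downcast="float")

after = slim.memory_usage(deep=True).sum()
print(before, after)
print(slim.dtypes)
```

Downcasting floats reduces precision, so it is worth checking that the smaller dtype is acceptable for the column before applying it.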

An Introduction to AdaBoost

AdaBoost – An Introduction to AdaBoost

AdaBoost is one of the earliest implementations of the boosting algorithm. It forms the base of other boosting algorithms, like gradient boosting and XGBoost. This tutorial will take you through the math behind implementing this algorithm and also a practical example of using the scikit-learn AdaBoost API. Contents: What is boosting? What is AdaBoost? Algorithm


How to formulate machine learning problem

Let’s understand how to define and formulate the machine learning problem (for predictive modeling) from a business problem. This structured approach should help you apply the process to most other types of predictive modeling problems at work. Introduction Often in ML teams, you will hear from the business/company departments about the problems and issues they


np.random.uniform

How to use numpy.random.uniform() in Python

The np.random.uniform() function creates an array of random samples drawn from a uniform probability distribution between the given low and high values. random.uniform(low=0.0, high=1.0, size=None) Purpose: The numpy random uniform function is used for creating a numpy array with random float values from the low to high interval. Parameters: Low: float or array-like of floats, optional: Lowest

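A short sketch of the signature quoted above, using illustrative values for `low`, `high`, and `size` that are not taken from the article. The seed is set only so the example is repeatable.

```python
import numpy as np

np.random.seed(0)  # seeding is optional; used here for repeatability

# Draw a 3 x 4 array of floats from the half-open interval [2.0, 5.0).
samples = np.random.uniform(low=2.0, high=5.0, size=(3, 4))

print(samples.shape)
print(samples.min() >= 2.0, samples.max() < 5.0)
```

With `size=None`, the default in the signature, the call returns a single float instead of an array.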

Train Test Split – How to split data into train and test for validating machine learning models?

The train-test split technique is a way of evaluating the performance of machine learning models. Whenever you build machine learning models, you will be training the model on a specific dataset (X and y). Once trained, you want to ensure the trained model is capable of performing well on the unseen test data as well.

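To make the idea concrete, here is a hand-rolled sketch of a shuffled split using only the standard library. It is not the scikit-learn function the article may go on to use; the helper name `simple_train_test_split`, its parameters, and the toy data are all hypothetical.

```python
import random


def simple_train_test_split(X, y, test_fraction=0.25, seed=42):
    """Shuffle row indices, then hold out a fraction of rows as the test set."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index X and y with the same positions so features and labels stay paired.
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy dataset: one feature per row, label is the parity of that feature.
X = [[i] for i in range(8)]
y = [i % 2 for i in range(8)]

X_train, X_test, y_train, y_test = simple_train_test_split(X, y)
print(len(X_train), len(X_test))  # 6 2
```

The model is then fitted on `X_train` and `y_train` only, and its score on `X_test` and `y_test` estimates how it behaves on unseen data.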

What is a Data Scientist? – Roles, Responsibilities, Skillsets, Career Path and Salary

A Data Scientist uses data and AI to solve business problems. The role involves working with data, extracting meaningful insights, using ML to solve business problems, building applications that make predictions and recommendations, and deploying and monitoring the solutions. The perks of being a Data Scientist Data scientist is a relatively new profession. By Data

