XGBoost in Python (Quickly) Explained (3min Read)

What is it? XGBoost is an algorithm used for supervised learning problems. The name stands for Extreme Gradient Boosting. How does it work? Imagine a sequence of models in which each model is trained on the errors of its predecessor. Where is it applied? Classification and Regression Trees (CART) are the base learners, and you apply a gradient …
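To make the "sequence of models, each trained on the error of its predecessor" idea concrete, here is a minimal sketch in plain Python. It is not the XGBoost library and not its actual algorithm (XGBoost adds regularization and second-order gradient information, among other things). The function names, the toy data, and the learning rate are all assumptions made up for illustration; the base learner is a depth-1 regression tree (a "stump").

```python
# Toy boosting loop: every new stump is fit to the residuals (errors)
# left by the models before it. Illustration only, NOT the XGBoost API.

def fit_stump(xs, residuals):
    """Pick the threshold on xs that best splits the residuals (least squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return a model: (base prediction, learning rate, list of stumps)."""
    base = sum(ys) / len(ys)  # the first "model" is just the mean of the targets
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Errors of everything built so far
        residuals = [y - p for y, p in zip(ys, predictions)]
        # The next model learns those errors
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical toy data: one feature, one numeric target
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]

for rounds in (0, 1, 20):
    print(rounds, "rounds -> training MSE:", mse(boost(xs, ys, n_rounds=rounds), xs, ys))
```

The training error shrinks as rounds are added, because each stump only has to correct what the previous ones got wrong. In practice you would use the `xgboost` package rather than a hand-written loop like this one.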

My First Natural Language Processing Job Interview As a Self-Taught Data Scientist

A French startup wrote to me on LinkedIn about a challenging job interview (spoiler: it didn't go well). They identify potential online fraud through Natural Language Processing and Image Recognition on online marketplaces. E.g., a house picture on a marketplace that also appears on a stock photo website, together with a standard message, could be classified …

How to become (a Self-Taught) Data Scientist

-Doctor, my son wants to become a Data Scientist. Should I be worried? -It is a critical situation, Miss. I am sorry, but I have to warn you: sadly, we don't have a cure for this kind of illness. -You have to be prepared, you must be prepared: your son will go to IKEA, or to …

PCA part.2 for unlucky boyfriend/husband

PCA, Second Chapter. I do not know if my explanation of PCA was clear; I do not think so, so I will try again. PCA is a very common technique used in Machine Learning; the name stands for Principal Component Analysis. Imagine that, for some unlucky reason, you HAVE TO buy a present for your girlfriend: a bag (I …