Python Experience Updates and Forecast for 2021

When you are a self-taught Data Scientist, it is always hard to assess your progress. You must set some metrics and start monitoring them. In my case, I decided to follow the 10,000-hour rule and track my hours with a time tracker: I used a Pomodoro Technique app. In the past week, I wanted to know … Read more

Disney Movies and Box Office Success – Datacamp Project Analysis [Why I think it is wrong] [Code and Notebook Included]

Some weeks ago, I created a list of projects that I wanted to build. The goal of the list is to assess how my Data Science and Python knowledge has changed over the years. The key topic of one of the projects on this list is the linear regression model. For this topic, I wanted something new from … Read more

Retention Rate: why it is important and why it is a convergent sequence

Last month, I was explaining to a client the importance of (Customer) Retention. In my opinion, after Profits and Gross Margins, Retention is the most important KPI in business. What is Retention? Customer Retention measures how much customers liked your product/service and whether they would buy or use it again. If I like your product I will … Read more

Pomodoro Technique Python Script, track your goals and shape your future

“You can have agency not just over your own life, but over a small and important part of the world. It begins by rejecting the unjust tyranny of Chance. You are not a lottery ticket.” (Peter Thiel, Zero to One: Notes on Startups, or How to Build the Future) Awareness of the Pomodoro Technique increased significantly during the … Read more

Plotting Google Trends with Bokeh and Python for multiple comparisons, Part 1/2

One of my recent tasks at work was to compare different keywords on Google Trends. At the time of writing, it is possible to compare only 5 keywords at once. Comparing more than 5 search terms is not possible directly; you have to manipulate the data a little. Take … Read more
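The excerpt above does not show how the data is manipulated, so what follows is only a sketch of one common workaround, not necessarily the method used in the post. It assumes the keywords are queried in batches of at most 5, that every batch includes the same anchor keyword, and that each batch is then rescaled so the anchor has the same average level everywhere. The function name `merge_batches` and the sample numbers are hypothetical.

```python
from statistics import mean


def merge_batches(batches, anchor):
    """Put several Google Trends batches on one common scale.

    `batches` is a list of dicts mapping keyword -> list of interest values.
    Every batch must contain the `anchor` keyword. The first batch is used
    as the reference scale (an assumption made for this sketch).
    """
    reference_level = mean(batches[0][anchor])
    merged = {}
    for batch in batches:
        anchor_level = mean(batch[anchor])
        if anchor_level == 0:
            raise ValueError("anchor has no search volume in this batch")
        # Factor that brings this batch's anchor to the reference level.
        factor = reference_level / anchor_level
        for keyword, series in batch.items():
            if keyword == anchor and keyword in merged:
                continue  # keep the anchor series from the reference batch
            merged[keyword] = [value * factor for value in series]
    return merged


# Hypothetical values, two time points per keyword.
batch_1 = {"python": [50, 100], "java": [40, 80]}
batch_2 = {"python": [25, 50], "rust": [50, 100]}

merged = merge_batches([batch_1, batch_2], anchor="python")
for keyword, series in merged.items():
    print(keyword, series)
```

After rescaling, values can exceed 100, because the 0 to 100 scale of each batch only applies within that batch. The result is more reliable when the anchor keyword has a search volume comparable to the other terms, so that rounding in the original data does not dominate.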

XGBoost in Python (Quickly) Explained (3min Read)

What is it? XGBoost is an algorithm used for supervised learning problems. The name stands for Extreme Gradient Boosting. How does it work? Imagine a sequence of models in which each model is trained on the errors of its predecessor. Where is it applied? Classification And Regression Trees (CART) are the base learner, and you apply a gradient … Read more
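The idea of "each model is trained on the errors of its predecessor" can be sketched in plain Python. This is not XGBoost itself, which adds regularization and many optimizations; it is a minimal gradient boosting loop for squared error, using one-split trees (stumps) as base learners. All function names and the toy data are made up for illustration.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a stump) to the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # candidate split points
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = sum((r - left_value) ** 2 for r in left)
        sse += sum((r - right_value) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current errors."""
    base = mean(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Errors of the ensemble built so far: the next model learns these.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(predictions, xs)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean((y - predict(model, x)) ** 2 for x, y in zip(xs, ys))


# Toy data with distinct x values (required by fit_stump in this sketch).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]

baseline = fit_boosting(xs, ys, n_rounds=0)  # predicts the mean only
boosted = fit_boosting(xs, ys, n_rounds=50)
print("training MSE, mean only:", mse(baseline, xs, ys))
print("training MSE, 50 rounds:", mse(boosted, xs, ys))
```

The learning rate shrinks each stump's contribution, so the ensemble approaches the targets in small steps instead of letting a single model absorb all of the error.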

My First Natural Language Processing Job Interview as a Self-Taught Data Scientist

A French startup wrote to me on LinkedIn about a challenging job interview (spoiler: it didn’t go well). They identify potential online fraud on online marketplaces through Natural Language Processing and Image Recognition. E.g., a house picture on a marketplace that is also on a stock photo website with a standard message could be classified … Read more

How to become (a Self-Taught) Data Scientist

-Doctor, my son wants to become a Data Scientist. Should I be worried? -It is a critical situation, Miss. I am sorry, but I must warn you: sadly, we don’t have a cure for this kind of illness. -You have to be prepared, you must be prepared: your son will go to IKEA, or to … Read more

PCA Part 2 for the unlucky boyfriend/husband

PCA, Second Chapter. I do not know if my explanation of PCA was clear; I do not think so, so I will try again. PCA stands for Principal Component Analysis and is a very common technique in Machine Learning. Imagine that, for some unlucky reason, you HAVE TO buy a present for your girlfriend: a bag (I … Read more