python Archives - Andrea Ciufo

XGBoost in Python (Quickly) Explained (3min Read)

What is it? XGBoost is an algorithm used for supervised learning problems. The name stands for Extreme Gradient Boosting. How does it work? Imagine a sequence of models in which each model is trained on the errors of its predecessor. Where is it applied? Classification and Regression Trees (CART) are the base learners, and you apply a gradient …
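The "sequence of models, each trained on the error of its predecessor" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the XGBoost library and not the code from the post: it uses plain gradient boosting for squared error, one-split trees (stumps) as base learners, and a made-up toy dataset. The names `fit_stump`, `fit_boosting`, and `predict` are hypothetical helpers.

```python
# Illustrative sketch only: plain gradient boosting with decision stumps.
# It is NOT the xgboost library; all names and data here are hypothetical.

def fit_stump(xs, residuals):
    """Return the one-split tree (threshold, left value, right value)
    that best fits the residuals in the least-squares sense."""
    best = None
    # Exclude the largest x so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.3):
    """Each round fits a stump to the current residuals (the errors of the
    ensemble built so far) and adds a shrunken copy of it to the ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
        mse = sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)
        mse_history.append(mse)
    model = {"base": base, "stumps": stumps, "learning_rate": learning_rate}
    return model, mse_history


def predict(model, x):
    total = model["base"]
    for stump in model["stumps"]:
        total += model["learning_rate"] * predict_stump(stump, x)
    return total


# Toy data, invented for this example.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0]
model, mse_history = fit_boosting(xs, ys)
```

The training error in `mse_history` shrinks round after round because every new stump only has to correct what the previous ones got wrong. XGBoost itself adds regularization and second-order gradient information on top of this basic scheme.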

My First Natural Language Processing Job Interview As a Selftaught DataScientist

A French startup (https://navee.co/) contacted me on LinkedIn for a challenging job interview (spoiler: it didn’t go well). They identify potential fraud on online marketplaces through Natural Language Processing and Image Recognition. E.g., a house picture on a marketplace that also appears on a stock photo website, paired with a standard message, could be classified …

The best selling drugs in Italy are the ones that could be advertised

The Ministry of Health website has an open data section where you can find, in *.csv format, information on the top 50 best-selling drugs in Italy. I decided to investigate this dataset, grouping some information (Python code attached below, …

What are the best selling drugs in Italy?

The Italian Ministry of Health published a dataset on the drugs most distributed* through drugstores (here you can find the dataset). Methodologically, I aggregated all the drugs that share the same first word. E.g., all kinds of “Tachipirina” packs (the best-selling paracetamol drug in Italy) are grouped into a single variable, regardless of whether …
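The aggregation by first word could look roughly like the sketch below. It is an assumption-laden illustration, not the post's code: the rows, pack descriptions, and quantities are invented (they are not taken from the Ministry of Health dataset), and `group_by_first_word` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows of (product description, packs distributed).
# These are NOT values from the Ministry of Health dataset.
rows = [
    ("TACHIPIRINA 500 mg 20 tablets", 120),
    ("TACHIPIRINA 1000 mg 16 tablets", 80),
    ("EXAMPLEDRUG 10 mg 30 tablets", 50),
    ("EXAMPLEDRUG syrup 150 ml", 25),
    ("OTHERDRUG 5 mg 10 tablets", 40),
]


def group_by_first_word(rows):
    """Sum the pack counts of all products whose description starts
    with the same word, treating that word as the drug name."""
    totals = defaultdict(int)
    for description, packs in rows:
        drug_name = description.split()[0].upper()
        totals[drug_name] += packs
    return dict(totals)


totals = group_by_first_word(rows)
# Rank drug names from most to least distributed.
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Grouping on the first word is a blunt rule: it merges every pack size of one brand, but it would also merge two unrelated products that happen to share a first word, so the resulting groups are worth checking by eye.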

A/B Testing explained to the Nerd who wants to pick up on Tinder

You just bought the ultimate fragrance, your abs are not perfectly sculpted, the beard is perfect, but the only match you get on Tinder is with the fake profile made by your mate. You start thinking that you have a problem. Your “pick up strategy” is obviously not working. You decide to rely on your …

How to become (a Self-Taught) Data Scientist

-Doctor, my son wants to become a Data Scientist. Should I be worried? -It is a critical situation, Miss. I am sorry, but I must warn you: sadly, we don’t have an answer to this kind of illness. -You have to be prepared, you must be prepared: your son will go to IKEA, or to …

PCA (Chapter One)

The first version of this article was published on my Italian blog, uomodellamansarda.com. Between July and August, I could lead a cost-cutting optimization project for a UK company. This project could be based on an application of PCA. In this article and the following ones, I will try to explain a fundamental concept. It is also very …

Not only Theory

I received some negative feedback on my last post on the Italian blog uomodellamansarda.com from Filippo and Francesco, two dear friends, and I am planning a dinner to discuss their suggestions in more depth. A BBQ, a bottle of wine (actually I would try this non-commercial vermouth: https://amzn.to/2v0oles), a friendly discussion, and I hope on this …

Applying Markov Inequality and Central Limit Theorem on Pomodoro Records to Estimate the Probability to Improve Daily Performance

One day I will get better at publishing posts from Jupyter to WordPress; it is all still a work in progress. The script, which you can find on my GitHub, estimates from my past records the probability of studying more Python (or whatever variable you are tracking), in terms of Pomodoro time slots …
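As a rough idea of how the two tools in the title can be applied to such records, here is a minimal sketch. It is not the script from GitHub, and how that script combines the two results is not shown in this excerpt. The daily counts are invented, and `markov_upper_bound` and `clt_prob_mean_exceeds` are hypothetical helper names. Markov's inequality gives P(X >= a) <= E[X] / a for a non-negative X and a > 0; the Central Limit Theorem lets the average over many days be approximated by a normal distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical daily Pomodoro counts; NOT the author's real records.
records = [4, 6, 5, 7, 3, 6, 5, 8, 4, 6]


def markov_upper_bound(records, target):
    """Markov's inequality: P(X >= target) <= E[X] / target,
    valid for non-negative X and target > 0. E[X] is estimated
    by the sample mean, and the bound is capped at 1."""
    return min(1.0, mean(records) / target)


def clt_prob_mean_exceeds(records, target, n_days):
    """CLT approximation: the average over n_days is roughly
    Normal(mu, sigma / sqrt(n_days)), with mu and sigma estimated
    from the records. Returns P(average > target)."""
    mu = mean(records)
    sigma = stdev(records)
    return 1.0 - NormalDist(mu, sigma / sqrt(n_days)).cdf(target)


# Upper bound on the chance of a single day with at least 8 Pomodoros.
bound_single_day = markov_upper_bound(records, target=8)

# Approximate chance that the 30-day average exceeds 6 Pomodoros per day.
prob_better_average = clt_prob_mean_exceeds(records, target=6, n_days=30)
```

The Markov bound is deliberately loose, since it uses only the mean. The CLT estimate is sharper, but it assumes the days are independent and behave like the past records.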

Data Scientist Career Track with Python

On the 30th of December I finished the “Data Scientist Career Track with Python” on DataCamp.com. It was a great journey that lasted 226 h (tracked with the Pomodoro Technique). The Career Track is composed of 20 courses; I also enrolled in two others, the first on SQL and the second on PostgreSQL. This career track cost …