Data Science Projects

By Manish Gupta, Principal Applied Researcher at Microsoft India R&D Private Limited, Hyderabad, India.

Sentiment Analysis using Amazon and Twitter Reviews

For 10 popular mobile phones, scrape all reviews about those phones from Amazon. Also, scrape recent tweets about those phones.

Perform sentiment analysis on these reviews and tweets. Rank the mobile phones by popularity, and also by positive sentiment, on both Twitter and Amazon.
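
The ranking step can be sketched as follows. This is a minimal illustration, not a prescribed method: the tiny word lists, the `rank_phones` helper, the sample reviews, and the choice to treat review count as "popularity" are all assumptions made for the example. A real project would replace the word lists with a trained sentiment model.

```python
from collections import defaultdict

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"great", "love", "excellent", "good", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def sentiment(text):
    """Return +1, -1, or 0 from counts of positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def rank_phones(reviews):
    """reviews: iterable of (phone, source, text).

    Returns {source: [(phone, positive_share, n_reviews), ...]} with each
    list sorted by positive share, highest first.
    """
    counts = defaultdict(lambda: [0, 0])  # (source, phone) -> [positive, total]
    for phone, source, text in reviews:
        entry = counts[(source, phone)]
        entry[0] += int(sentiment(text) > 0)
        entry[1] += 1
    rankings = defaultdict(list)
    for (source, phone), (positive, total) in counts.items():
        rankings[source].append((phone, positive / total, total))
    for source in rankings:
        rankings[source].sort(key=lambda r: r[1], reverse=True)
    return dict(rankings)


# Made-up sample data; real input would come from the scraped reviews and tweets.
sample = [
    ("Phone A", "amazon", "Great camera, love it!"),
    ("Phone A", "amazon", "Battery is bad."),
    ("Phone B", "amazon", "Excellent screen and good battery."),
    ("Phone A", "twitter", "I hate how slow it is"),
    ("Phone B", "twitter", "amazing phone"),
]
for source, ranking in rank_phones(sample).items():
    print(source, ranking)
```

Sorting the same tuples by the third field (review count) instead of the second gives the popularity ranking under the assumption stated above.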

Scraping Structured Data from Yelp and Zomato

Download 1000 business pages from Yelp. Each page is an HTML document containing details about the business. It does not include the business's email id, but it does include the business's website address, which can be used to find the website's contact-us page and extract the email id from there.

Your task is to obtain structured data for the business: business name, business phone number, business home page URL, business address, opening hours, Takes Reservations, Delivery, Take-out, Accepts Credit Cards, Accepts Apple Pay, Accepts Android Pay, Accepts Bitcoin, Good For, Parking, Bike Parking, Good for Kids, Good for Groups, Attire, Ambience, Noise Level, Alcohol, Outdoor Seating, Wi-Fi, Has TV, Caters, Gender Neutral Restrooms, contact-us URL for the business, email id for the business.

Try to extract similar structured information for 1000 pages from Zomato.
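
The last two fields (contact-us URL and email id) can be sketched with the Python standard library alone. The class and function names below are hypothetical, the HTML snippets are made up, and the email pattern is a deliberately simple approximation rather than a full address validator. Downloading the pages is left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Simplified email pattern (an assumption; it will miss unusual addresses).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags whose href or link text mentions 'contact'."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text) + " " + self._href
            if "contact" in label.lower():
                self.links.append(self._href)
            self._href = None


def find_contact_urls(home_url, html):
    """Return absolute URLs of likely contact pages linked from the home page."""
    finder = ContactLinkFinder()
    finder.feed(html)
    return [urljoin(home_url, href) for href in finder.links]


def extract_emails(html):
    """Return the distinct email-like strings found in a page, sorted."""
    return sorted(set(EMAIL_RE.findall(html)))


# Made-up pages standing in for a business website.
home_html = '<a href="/about">About</a> <a href="/contact-us">Contact Us</a>'
contact_html = '<p>Write to <a href="mailto:info@example.com">info@example.com</a></p>'
print(find_contact_urls("https://example.com/", home_html))
print(extract_emails(contact_html))
```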

Wikipedia Website Traffic Forecasting

This project focuses on forecasting the future values of multiple time series, which has long been one of the most challenging problems in the field. More specifically, the project tests state-of-the-art methods designed by the participants on the problem of forecasting future web traffic for approximately 145,000 Wikipedia articles.

The training dataset consists of approximately 145k time series. Each time series represents the number of daily views of a different Wikipedia article, from July 1, 2015 to December 31, 2016. Divide the data into train and test sets, and validate your approaches.
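
For time series, the train/test division is usually chronological rather than random, so that the model is never evaluated on days earlier than those it was trained on. The sketch below shows one way to do this for a single series, with a seasonal-naive baseline to compare real models against. The function names, the 7-day season, the mean-absolute-error metric, and the toy view counts are assumptions made for the example.

```python
from datetime import date

# Number of daily observations per article in the stated date range.
n_days = (date(2016, 12, 31) - date(2015, 7, 1)).days + 1


def time_split(series, holdout_days):
    """Split one daily series chronologically; the last days form the test set."""
    if not 0 < holdout_days < len(series):
        raise ValueError("holdout_days must be between 1 and len(series) - 1")
    return series[:-holdout_days], series[-holdout_days:]


def seasonal_naive(train, horizon, season=7):
    """Forecast by repeating the last `season` observed values."""
    last = train[-season:]
    return [last[i % len(last)] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy series with a perfectly repeating weekly pattern (made-up numbers).
views = [10, 12, 11, 13, 15, 30, 28] * 4
train, test = time_split(views, holdout_days=7)
forecast = seasonal_naive(train, horizon=len(test))
print(n_days, mean_absolute_error(test, forecast))
```

Any model worth keeping should beat this baseline on the held-out days.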

Customer Churn Prediction

A big international bank is investigating a very high rate of customers leaving the bank (Exited). Here is a dataset of 10,000 records to investigate and to predict which customers are more likely to leave the bank soon.

Use various classifiers to find which one provides the best accuracy. Identify the most important features. Also try out various feature selection techniques.
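
Comparing classifiers fairly means scoring each one on data it was not trained on. The sketch below shows the comparison loop with k-fold cross-validation in plain Python. It is only an illustration under stated assumptions: the two "classifiers" are a majority-class baseline and a one-feature threshold rule, and the data is synthetic with hypothetical features. In practice you would plug library classifiers into the same loop.

```python
import random


def majority_fit(X, y):
    """Baseline: always predict the most common label in the training data."""
    label = max(set(y), key=y.count)
    return lambda row: label


def stump_fit(X, y):
    """One-feature threshold rule chosen to maximise training accuracy."""
    best = None  # (accuracy, feature index, threshold, label above threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for hi in (0, 1):
                hits = sum((hi if row[j] > t else 1 - hi) == label
                           for row, label in zip(X, y))
                acc = hits / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, hi)
    _, j, t, hi = best
    return lambda row: hi if row[j] > t else 1 - hi


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k shuffled folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(sum(model(X[i]) == y[i] for i in fold) / len(fold))
    return sum(scores) / k


# Synthetic stand-in for the bank data: (age, hypothetical second feature) -> Exited.
X = [(age, age % 3) for age in range(20, 60)]
y = [1 if age > 45 else 0 for age, _ in X]
for name, fit in [("majority", majority_fit), ("stump", stump_fit)]:
    print(name, round(cross_val_accuracy(fit, X, y), 3))
```

The majority baseline is worth keeping in any comparison: on an imbalanced target such as Exited, a classifier that only matches it has learned nothing.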

Student Dropout Prediction

Students’ high dropout rate on MOOC platforms has been heavily criticized, and predicting their likelihood of dropout would be useful for maintaining and encouraging students’ learning activities.

In this competition, you are challenged to build a predictor that can predict the chance that a student will drop out of an enrollment after observing his/her early course activities.

In particular, you have access to statistics on each student’s course-relevant activities during the first 10 days after the course launches, such as working on course assignments, watching course videos, accessing the course wiki, etc.

Further, not many students drop out overall, but their performance could suffer. The second part of the project concerns predicting student performance in secondary education (high school).

Diabetic Retinopathy Detection

Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is estimated to affect over 93 million people. Currently, detecting DR is a time-consuming and manual process that requires a trained clinician to examine and evaluate digital color fundus photographs of the retina.

By the time human readers submit their reviews, often a day or two later, the delayed results lead to lost follow-up, miscommunication, and delayed treatment. With color fundus photography as input, the goal of this project is to build an automated detection system. You are provided with a large set of high-resolution retina images taken under a variety of imaging conditions.

A left and right field is provided for every subject. Images are labeled with a subject id as well as either left or right (e.g. 1_left.jpeg is the left eye of patient id 1). A clinician has rated the presence of diabetic retinopathy in each image on a scale of 0 to 4, according to the following scale: 0 – No DR, 1 – Mild, 2 – Moderate, 3 – Severe, 4 – Proliferative DR. Your task is to create an automated analysis system capable of assigning a score based on this scale.
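
The naming convention and grading scale above can be captured in a few lines before any modelling starts. The sketch below parses names of the form `1_left.jpeg` and groups the images by subject. The helper names are hypothetical, and grouping by subject is a suggestion rather than a stated requirement: it makes it easy to keep both eyes of a patient in the same train or validation split.

```python
import re

# Grading scale as given in the project description.
GRADES = {0: "No DR", 1: "Mild", 2: "Moderate", 3: "Severe", 4: "Proliferative DR"}

NAME_RE = re.compile(r"^(\d+)_(left|right)\.jpeg$")


def parse_image_name(filename):
    """Return (subject_id, side) for a name such as '1_left.jpeg'."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected image name: {filename!r}")
    return int(match.group(1)), match.group(2)


def pair_by_subject(filenames):
    """Group image names as {subject_id: {'left': name, 'right': name}}."""
    subjects = {}
    for name in filenames:
        subject_id, side = parse_image_name(name)
        subjects.setdefault(subject_id, {})[side] = name
    return subjects


print(parse_image_name("1_left.jpeg"))
print(pair_by_subject(["1_left.jpeg", "1_right.jpeg", "2_left.jpeg"]))
print(GRADES[4])
```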

Manish Gupta


He is also an adjunct faculty member at the International Institute of Information Technology, Hyderabad and a visiting faculty member at the Indian School of Business, Hyderabad. He received his Master's in Computer Science from IIT Bombay in 2007 and his Ph.D. from the University of Illinois at Urbana-Champaign in 2013.


INR 24000

Length: 200+ Hours

Validity: 1 year (365 days)

Course Features


Please note that the videos are not downloadable. Sharing your access, or trying to sell or distribute the videos, is a legally punishable offence. We have previously caught people doing this; legal action was taken against them and a heavy penalty was imposed.

Rohini kumar. M

I had read a lot of DS books before joining this course, but I had difficulty understanding the intuition behind some algorithms. After watching Manish sir’s teaching of those complex algorithms, I have a clear understanding of them. Thanks to Ravi sir for bringing this course to students.


The support from the team was very quick; questions were answered within 24 hrs through mails/phone calls. This course helped in cracking many interviews in the DS field. Most of the questions asked during interviews were taught in this course.


I have not seen another course which teaches both the Python and the R required for DS. The mathematical explanations given for algorithms were simply awesome. Thanks to Manish sir for making concepts clear.


The course content is vast and the best, and is more than a fresher needs to start a career in the DS field. Thanks to Ravindra babu Ravula sir and Manish sir for providing so much content. Manish sir explained most of the complex concepts, from some history behind them to their cutting-edge use cases in industry.


Manish sir covered each and every concept from scratch. I have attended many interviews, and all the questions asked in them were covered in this course in the simplest way possible.

Priya Basu

The best part about this course was the customer support and the lack of prerequisites. I feel anyone who is interested in a data science course can take this course. Manish sir’s way of teaching complex and advanced concepts will simply blow you away.

Anjali Thakur

I have taken many courses for DS/ML, but this course is like heaven to me. They covered complete end-to-end concepts in DS, from web scraping to building optimal ML models. My queries regarding concepts were solved within 24 hours. Thanks to the team for making my concepts much clearer.


I am very happy with the course content and customer support provided by MLminds. The course videos connected all the dots. Thanks to Ravindrababu Ravula sir and Manish sir for providing such great lectures.

Seema Sen

After finishing the course content, I can now confidently say that I can give a first-cut solution to most ML problems. The mathematical explanations given for ML algorithms were simply awesome. A huge thanks to Manish sir.


I am addicted to Manish sir’s way of teaching difficult concepts in the simplest way possible. Thanks to the team for resolving all the queries.


In my personal opinion, for those who are looking to change their careers to the data science field, MLminds is a one-stop solution. Completely impressed by Manish sir’s teaching.


I finished the course last month and now I am able to crack most data science interviews very easily. Thanks to Manish sir for combining his industry experience and knowledge and delivering it.


The best part about the course is that Manish sir has explained every intricate detail of the ML algorithms with code, and the customer support provided by the team was super.


This course will definitely change the way you think about the new deep learning algorithms that are evolving nowadays. Thanks to Manish sir for explaining each and every algorithm at a very deep level.


Thanks to Manish sir for making my foundations strong in maths. Now I’m confident to learn any new ml algorithm through research papers. Thanks to the team of MLminds for patiently explaining even my tiny doubts regarding the videos.


All the old-school maths I learned during my schooling and college were just dots in my mind. Thanks a lot to Manish sir for connecting all those dots.