Problem Solving with Data Science
This course teaches you to apply programming languages and tools, such as UNIX commands, MySQL, Python, and R, to collect, clean, process, and analyze data.
Prof. Chirag Shah
I'm an Associate Professor of Information Science and an affiliate member of Computer Science at Rutgers University. I am currently also a Visiting Research Scientist at Spotify.
My research interests include studies of interactive information retrieval/seeking, especially those involving social and collaborative aspects. I study social media and data generated by wearable devices as kinds of signals that can help us understand and impact human behaviors. I apply them to various problems related to search, personalization, and recommendation. My work falls under and uniquely connects Computer Science, Data Science, and Information Science.
As industries and research environments create and use rapidly growing amounts of data, there is increasing demand for people who can think and make decisions in a data-driven way, using meaningful insights derived from large and diverse data. This course offers students a practical introduction to the field of "Data Science" and to common methods for quantitative and computational analytics, giving them an overview of the key concepts, skills, and technologies used by data scientists. While the course covers several programming languages and tools, the focus is on solving problems, or "hacking." "Hacking," in this context, means finding ways to address a problem with anything and everything at one's disposal. Students will be introduced to several real-life problems that involve collecting and analyzing data, and it is in this context of solving problems that an appropriate set of tools and programming languages, including Python, PHP, R, and MySQL, will be taught.
Before we begin...
1.1 Structured vs unstructured data
1.2 Tools for programming
1.3 UNIX Basics
1.4 Problem Solving with UNIX
2.1 Introduction to Python
2.2 Statistical Essentials with Python
3.1 Analyzing Structured Data
3.2 Statistical Analysis with Python
4.1 ML with Python - Introduction and Classification
4.2 ML with Python - Clustering
5.1 Introduction to R
5.2 ML with R - Introduction and Regression
6.1 ML with R - Classification
6.2 ML with R - Clustering
7.1 Introduction to MySQL
7.2 Accessing MySQL with Python
7.3 Accessing MySQL with R
8.1 Using Python to collect Twitter data
8.2 Sentiment analysis with Twitter data
9.1 Using Python to collect YouTube data
9.2 Analyzing YouTube data using Python
10.1 Using R to analyze Yelp data
11.1 Time series analysis
12.1 Case Study: Titanic