Data Science for Professionals - 7 days
Data Science

This workshop is geared towards business executives and managers who are interested in finding out how data science and machine learning can help their business take the next step. You will be introduced to technologies like product/service recommendation systems, natural language processing, Hadoop, Spark and Splunk. As an executive or manager, you will learn how to understand, interpret and visualize data using Python, while dealing with variables and missing values. We will teach you how to come to sound conclusions about your data, despite some real-world challenges.

By the end of this course, you will have an understanding of applied predictive modeling methods, and the know-how to use existing machine learning algorithms in Python. This will allow you to lead and work with team members on data science projects, find problems, and come up with solutions. This is a 7-day intensive course with hands-on work. The course finishes with a takeaway project.

Takeaways

In this workshop you’ll learn an in-depth data science process:

- Collect data from a variety of sources (e.g., Excel, web scraping, APIs and others)
- Explore large data sets
- Learn to understand and use Python for executing data science projects
- Understand recommendation systems and natural language processing
- Know how to create data visualizations to communicate your message

This is a very practical and hands-on workshop that has lots of class exercises. Through this course, we strive to make you fully equipped to become a leader who can execute full-fledged data science projects.

Day 1 The Basics

In our first class, we will go over some basic data collection and data exploration fundamentals while introducing the packages that will be covered over the course and how to install them.

Day 2 Fundamental Modeling Techniques

We will start by introducing NumPy and Pandas and showcasing how to clean, manipulate, and analyze data.
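As a rough illustration of the kind of cleaning step described here, the sketch below uses Pandas to count and handle missing values. The table and its column names are made up for illustration and are not the real Titanic dataset; filling with the median and dropping incomplete rows are only two of several reasonable choices.

```python
import numpy as np
import pandas as pd

# Hypothetical passenger records (NOT the real Titanic data).
# np.nan / None mark the missing values we want to handle.
passengers = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 58.0],
    "fare": [7.25, 71.28, 8.05, 26.55],
    "embarked": ["S", "C", None, "S"],
})

# Count how many values are missing in each column.
missing_counts = passengers.isna().sum()

# Fill missing ages with the median of the known ages (22, 35, 58).
median_age = passengers["age"].median()
cleaned = passengers.assign(age=passengers["age"].fillna(median_age))

# Drop rows where the port of embarkation is unknown.
cleaned = cleaned.dropna(subset=["embarked"])

print(missing_counts)
print(cleaned)
```

Whether to fill a gap or drop the row depends on the question being asked; the median is used here only because it is less sensitive to outliers than the mean.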
Students will practice on the Titanic dataset before moving on to web scraping techniques and extracting data from APIs.

Day 3 Advanced Modeling Techniques and Analytics

We will begin by reviewing NumPy and Pandas before delving deeper into more advanced techniques to clean and munge data. Using the Matplotlib and Seaborn packages, students will learn to visualize data and identify trends.

Day 4 Data Mining and Machine Learning

We will introduce the Cross Industry Standard Process for Data Mining (CRISP-DM) and data mining with supervised and unsupervised learning. Afterwards, students will explore machine learning algorithms such as Regression (Linear, Multivariable, and Logistic), Naïve Bayes, Decision Trees, and Clustering.

Day 5 Recommendation Systems

Students will review machine learning concepts, then build their own recommendation system with a MovieLens dataset. They will also cover dimensionality reduction with Principal Component Analysis, explore Support Vector Machines, and learn A/B testing with t-tests and p-values.

Day 6 Natural Language Processing and Sentiment Analysis

Students will explore the Natural Language Toolkit to process and extract text data. Students will then start a Natural Language Processing project with Yelp data before we move on to Sentiment Analysis to predict positive versus negative Yelp reviews.

Day 7 Big Data with Spark

Students will be introduced to Big Data and data engineering with the Hadoop ecosystem, the MapReduce paradigm, and the up-and-coming Apache Spark.

Prereqs & Preparation

Bring a laptop and install Anaconda, a free distribution that includes Python and a number of tools that will be used in class (http://continuum.io/downloads). This course does not require any background in programming or data science.

Price

$4950 includes airfare (limited to a maximum of $500), accommodation, and training.