- Develop using IPython notebooks
- Understand statistical measures such as standard deviation
- Visualize data distributions, probability mass functions, and probability density functions
- Visualize data with matplotlib
- Use covariance and correlation metrics
- Apply conditional probability for finding correlated features
- Use Bayes’ Theorem to identify false positives
- Make predictions using linear regression, polynomial regression, and multivariate regression
- Understand complex multi-level models
- Use train/test and K-Fold cross validation to choose the right model
- Build a spam classifier using Naive Bayes
- Use decision trees to predict hiring decisions
- Cluster data using K-Means clustering, and classify data using Support Vector Machines (SVM)
- Build a movie recommender system using item-based and user-based collaborative filtering
- Predict classifications using K-Nearest-Neighbor (KNN)
- Apply dimensionality reduction with Principal Component Analysis (PCA) to classify flowers
- Understand reinforcement learning – and how to build a Pac-Man bot
- Clean your input data to remove outliers
- Implement machine learning, clustering, and search using TF/IDF at massive scale with Apache Spark’s MLlib
- Design and evaluate A/B tests using T-Tests and P-Values
- You’ll need a desktop computer (Windows, Mac, or Linux) capable of running Enthought Canopy 1.6.2 or newer. The course will walk you through installing the necessary free software.
- Some prior coding or scripting experience is required.
- At least high-school-level math skills are required.
- This course walks through getting set up on a Microsoft Windows based desktop PC. While the code in this course will run on other operating systems, we cannot provide OS-specific support for them.
Course Description by Instructor
Data scientists enjoy one of the top-paying jobs, with an average salary of $120,000 according to Glassdoor and Indeed. That’s just the average! And it’s not just about money – it’s interesting work too!
If you’ve got some programming or scripting experience, this course will teach you the techniques used by real data scientists and machine learning practitioners in the tech industry – and prepare you for a move into this hot career path. This comprehensive course includes over 80 lectures spanning 12 hours of video, and most topics include hands-on Python code examples you can use for reference and for practice. I’ll draw on my 9 years of experience at Amazon and IMDb to guide you through what matters, and what doesn’t.
Each concept is introduced in plain English, avoiding confusing mathematical notation and jargon. It’s then demonstrated using Python code you can experiment with and build upon, along with notes you can keep for future reference. You won’t find academic, deeply mathematical coverage of these algorithms in this course – the focus is on practical understanding and application of them. At the end, you’ll be given a final project to apply what you’ve learned!
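To give a flavor of that plain-English-then-code style, here is a minimal sketch of one listed topic: using Bayes’ Theorem to reason about false positives. It is illustrative only and not taken from the course materials; the function name `posterior_given_positive` and all of the numbers are made up for this example. In plain English: if a condition is rare, even an accurate test produces mostly false alarms, because the few true positives are outnumbered by false positives from the much larger healthy group.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' Theorem.

    prevalence:          P(condition), the base rate in the population
    sensitivity:         P(positive | condition)
    false_positive_rate: P(positive | no condition)
    """
    # Probability of a positive result that is a true positive
    true_pos = sensitivity * prevalence
    # Probability of a positive result that is a false positive
    false_pos = false_positive_rate * (1.0 - prevalence)
    # Bayes' Theorem: true positives as a share of all positives
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers: 1% of people have the condition, the test catches
# 99% of real cases, and it wrongly flags 5% of people without it.
p = posterior_given_positive(
    prevalence=0.01, sensitivity=0.99, false_positive_rate=0.05
)
print(f"P(condition | positive test) = {p:.3f}")
```

With these made-up inputs, only about one positive result in six reflects a real case, even though the test catches 99% of real cases.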
The topics in this course come from an analysis of real requirements in data scientist job listings from the biggest tech employers. We’ll cover the machine learning and data mining techniques real employers are looking for, including:
- Deep Learning / Neural Networks (MLPs, CNNs, RNNs)
- Regression analysis
- K-Means Clustering
- Principal Component Analysis
- Train/Test and cross validation
- Bayesian Methods
- Decision Trees and Random Forests
- Multivariate Regression
- Multi-Level Models
- Support Vector Machines
- Reinforcement Learning
- Collaborative Filtering
- K-Nearest Neighbor
- Bias/Variance Tradeoff
- Ensemble Learning
- Term Frequency / Inverse Document Frequency
- Experimental Design and A/B Tests
…and much more! There’s also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to “big data” analyzed on a computing cluster. And you’ll also get access to this course’s Facebook Group, where you can stay in touch with your classmates.
If you’re new to Python, don’t worry – the course starts with a crash course. If you’ve done some programming before, you should pick it up quickly. This course shows you how to get set up on Microsoft Windows-based PCs; the sample code will also run on macOS or Linux desktop systems, but I can’t provide OS-specific support for them.
If you’re a programmer looking to switch into an exciting new career track, or a data analyst looking to make the transition into the tech industry – this course will teach you the basic techniques used by real-world industry data scientists. I think you’ll enjoy it!
- Software developers or programmers who want to transition into the lucrative data science career path will learn a lot from this course.
- Data analysts in finance or other non-tech industries who want to transition into the tech industry can use this course to learn how to analyze data using code instead of tools. However, you’ll need some prior experience in coding or scripting to be successful.
- If you have no prior coding or scripting experience, you should NOT take this course – yet. Go take an introductory Python course first.