Python for Data Science & Machine Learning

Course by QcFinance.in

Skills You Will Gain

- Python Programming Language
- Statistical Hypothesis Testing
- IPython
- NetworkX
- Matplotlib
- NumPy
- Pandas
- SciPy
- Python Lambdas
- Python Regular Expressions

Python Basics

An introduction to the basic concepts of Python. Learn how to use Python both interactively and through a script. Create your first variables and acquaint yourself with Python's basic data types.

Learn to store, access and manipulate data in lists: the first step towards efficiently working with huge amounts of data.
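As a rough illustration of what storing, accessing and manipulating data in a list can look like, here is a minimal sketch. The variable names and sample values are invented for this example and are not taken from the course material.

```python
# Hypothetical sample data: heights in metres for a small group of people.
heights = [1.73, 1.68, 1.71, 1.89]

# Access: indexing is zero-based, and negative indices count from the end.
first = heights[0]
last = heights[-1]

# Subset: slicing returns a new list and excludes the end index.
middle = heights[1:3]

# Manipulate: overwrite an element, append a new one, and delete one.
heights[0] = 1.74
heights.append(1.79)
del heights[1]

print(first, last, middle, heights)
```

Slicing copies the selected elements, so changing `heights` afterwards does not alter `middle`.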

Functions and Packages

To leverage the code that brilliant Python developers have written, you'll learn about using functions, methods and packages. This will help you to reduce the amount of code you need to solve challenging problems!

NumPy

NumPy is a Python package for efficient numerical computing in data science. Learn to work with the NumPy array, a faster and more powerful alternative to the list, and take your first steps in data exploration.
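To make the contrast between a list and a NumPy array more concrete, the sketch below computes a body mass index for several people at once. The data values and variable names are assumptions made up for this illustration, and it assumes the `numpy` package is installed.

```python
import numpy as np

# Hypothetical sample data, invented for illustration only.
heights_m = [1.73, 1.68, 1.71, 1.89]
weights_kg = [65.4, 59.2, 63.6, 88.4]

# With plain lists, element-wise arithmetic needs an explicit loop.
bmi_list = [w / h ** 2 for w, h in zip(weights_kg, heights_m)]

# With NumPy arrays, the same arithmetic applies to every element at once.
np_heights = np.array(heights_m)
np_weights = np.array(weights_kg)
bmi = np_weights / np_heights ** 2

# A boolean mask selects only the elements that satisfy a condition.
above_23 = bmi[bmi > 23]

print(bmi)
print(above_23)
```

Both versions produce the same numbers. The array version avoids the explicit loop, which is typically where the speed advantage on large data sets comes from.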

Course Syllabus

Section 1: Python Basics

Take your first steps in the world of Python. Discover the different data types and create your first variable.

Section 2: Python Lists

Get to know the first way to store many different data points under a single name. Create, subset and manipulate lists in all sorts of ways.

Section 3: Functions and Packages, Control Flow, and Pandas

Learn how to get the most out of other people's efforts by importing Python packages and calling functions.

Write conditional constructs to tweak the execution of your scripts and get to know the Pandas DataFrame: the key data structure for Data Science in Python.
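The sketch below shows one way a conditional construct and a Pandas DataFrame might be combined. The column names, the sample values and the `size_label` helper are hypothetical, chosen only for this example, and it assumes the `pandas` package is installed.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame(
    {
        "customer": ["A", "B", "C", "D"],
        "amount": [120.0, 35.5, 80.0, 15.0],
    }
)


def size_label(amount):
    """Classify an order amount with an if/elif/else chain."""
    if amount >= 100:
        return "large"
    elif amount >= 50:
        return "medium"
    else:
        return "small"


# Apply the conditional logic to every row of one column.
orders["size"] = orders["amount"].apply(size_label)

# Boolean filtering keeps only the rows that satisfy a condition.
big_orders = orders[orders["amount"] >= 50]

print(orders)
print(big_orders)
```

The same `if`/`elif`/`else` pattern works in an ordinary script. Wrapping it in a function simply makes it easy to reuse across a whole column.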

Section 4: NumPy and Matplotlib

Write superfast code with Numerical Python, a package to efficiently store and do calculations with huge amounts of data.

Create different types of visualizations depending on the message you want to convey. Learn how to build complex and customized plots based on real data.

You will also work with a collection of powerful open-source tools needed to analyze data and conduct data science. Specifically, you'll learn how to use:

- Python
- Jupyter notebooks
- pandas
- NumPy
- Matplotlib
- Git
- and many other tools.

We'll cover the machine learning and data mining techniques real employers are looking for, including:

- Regression analysis
- K-Means Clustering
- Principal Component Analysis
- Train/Test and cross validation
- Bayesian Methods
- Decision Trees and Random Forests
- Multivariate Regression
- Multi-Level Models
- Support Vector Machines
- Reinforcement Learning
- Collaborative Filtering
- K-Nearest Neighbor
- Bias/Variance Tradeoff
- Ensemble Learning
- Term Frequency / Inverse Document Frequency
- Experimental Design and A/B Tests

Statistics and Probability Refresher, and Python

- Bayes' Theorem
- Predictive Models
- Linear Regression
- Polynomial Regression
- Multivariate Regression, and Predicting Car Prices
- Multi-Level Models
- Machine Learning with Python
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- Clustering people based on income and age
- Measuring Entropy
- Install GraphViz
- Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to cluster people using scikit-learn
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Movie Similarities
- Improving the Results of Movie Similarities
- Making Movie Recommendations to People
- Improve the recommender's results
- More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- Using KNN to predict a rating for a movie
- Dimensionality Reduction; Principal Component Analysis
- PCA Example with the Iris data set
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
- Dealing with Real-World Data
- Bias/Variance Tradeoff
- K-Fold Cross-Validation to avoid overfitting
- Data Cleaning and Normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
- Apache Spark: Machine Learning on Big Data
- Installing Spark - Part
- Spark Introduction
- Spark and the Resilient Distributed Dataset (RDD)
- Introducing MLlib
- Decision Trees in Spark
- K-Means Clustering in Spark
- TF / IDF
- Searching Wikipedia with Spark
- Using the Spark 2.0 DataFrame API for MLlib
- Experimental Design
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas

For more information, please email info@qcfinance.in.

Some links from online search:

www.skilledup [dot] com/articles/list-data-science-bootcamps

Generalassemb[dot]ly/education/data-science

Some suggested general videos:

https://www.youtube.com/playlist?list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y
