## Friday, March 15, 2019

### Machine Learning Topics

Error functions, how to minimize errors (gradient descent)

What is alpha
Gradient descent repeatedly adjusts the parameters to reduce the cost function gradually. With each iteration we should come closer to a minimum. In each iteration all parameters must be updated simultaneously! The size of a “step” per iteration is determined by the parameter alpha (the learning rate).
https://towardsdatascience.com/machine-learning-basics-part-1-concept-of-regression-31982e8d8ced
Partial Derivative Function
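
To make the role of alpha concrete, here is a minimal sketch of gradient descent for a straight-line fit. The function name `gradient_descent`, the toy data, and the chosen `alpha` and iteration count are assumptions for illustration, not taken from the linked article.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=2000):
    """Fit y = theta0 + theta1 * x by minimizing the mean squared error."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost J = (1 / 2m) * sum(error ** 2)
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (assumed for the example)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 4), round(theta1, 4))
```

A larger alpha takes bigger steps and can overshoot the minimum; a smaller alpha converges more slowly.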

How to tune algorithms
Add parameters - time series, lags

Regularization - ridge, lasso
minimization (shrinkage) of coefficients

When to use what
Process, steps, examples of data prep
https://scikit-learn.org/stable/modules/preprocessing.html

Scaling, one-hot encoding, outliers

One hot encoding is a process by which categorical variables are converted into a form that could be provided to ML algorithms to do a better job in prediction
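
As a rough illustration of the idea, the sketch below one-hot encodes a list of category labels in plain Python. The helper name `one_hot_encode` and the sample values are hypothetical; in practice a library encoder (for example the scikit-learn preprocessing module linked above) would normally be used.

```python
def one_hot_encode(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # sample data (assumed)
categories, encoded = one_hot_encode(colors)
print(categories)
for row in encoded:
    print(row)
```

Each row has exactly one 1, in the column of the category it belongs to.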

Demantra | Oracle Products - to predict demand using Time series modeling (Lags, Dummy Variables, Time Series)
Gradient descent - Derivation, partial differentiation
PCA analysis - Derivation
supply chain management machine learning
ARIMA Model for Time Series Forecasting in Python

Exploratory data analysis with Spark

Cholesky transform
https://en.wikipedia.org/wiki/Cholesky_decomposition
Lower triangular matrix
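
A minimal sketch of the decomposition in plain Python, assuming a symmetric positive-definite input. The function name `cholesky` is illustrative, and the 3x3 matrix is the worked example from the Wikipedia page linked above.

```python
import math


def cholesky(a):
    """Return the lower triangular L with L * L^T = a (a must be SPD)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                # Diagonal entry
                lower[i][j] = math.sqrt(a[i][i] - partial)
            else:
                # Entry below the diagonal
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


a = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
factor = cholesky(a)
for row in factor:
    print(row)
```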

We are going to calculate a matrix that summarizes how our variables all relate to one another.
We’ll then break this matrix down into two separate components: direction and magnitude.

https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c

https://www.kaggle.com/nishantbigdata/exploratory-data-analysis-with-spark

## Thursday, February 14, 2019

### Amazon Interview Questions

Maven GIT

RDD vs Dataframe

Spark Partition

Data Lakes on Amazon

http://geeks-plug-in.blogspot.com/2016/09/cloudera-manager-installation-on-ubuntu.html
http://cloudera-cluster.blogspot.com/

## Wednesday, February 6, 2019

### CCA Spark and Hadoop Developer Exam (CCA175)

https://github.com/xuezhizeng/CCA175-Exam-Preparation/
https://github.com/Anoosha-Shetty/CCA-175
https://github.com/Sailendra-R-D/Prep-Resource-CCA175
https://github.com/smartlin5228/CCA175
https://github.com/Roshan4u/cca175
https://github.com/write2sivakumar/cca175
https://github.com/okmich/cca175notes

## Wednesday, January 30, 2019

### Sales Force Exam Prep NYC

http://salesforce-401-dumps.blogspot.com/2015/06/starting-salesforce-development.html

http://docshare.tips/salesforce-dumps_58dcbfdfee3435da3e991693.html

## Wednesday, January 2, 2019

### Python Analytics Individual Looking in NYC (Freelancing, tutoring)

I have developed various industry-based courses in Data Analytics and Data Science, and I have links to a good network of tutors in the USA and India, for the best outcome for global learners and efficient project delivery in Python Data Science and Big Data Hadoop projects.

My training is centered around projects, for example your portfolio or topics you are working on at your job.
This is very different from the repetitive, common courses given by other tutors with a fixed syllabus.
The outcome of such an engagement is a product you can use at work or in a real-life implementation.

I am an Electrical Engineering graduate, GARP-FRM certified, with a PG Diploma in Financial Analysis and Risk Management and a PG Diploma in IT. I have cleared CFA L1 and hold an International MBA (15-16) and an MS in Information Systems (NY).

I am working as a senior instructor and consultant with three boot camps in NYC.
I have experience tutoring analytics using SQL, Excel, VBA, Python, R and MATLAB for financial, risk, trading, and other applications. MATLAB for Finance was my flagship course with my last organization.

Earlier in my career I worked on and developed several passive, recorded courses in analytics.
I have developed courses on Excel optimization to make Excel programming more robust and to get the most out of Excel.

Eight years of tutoring experience with various organizations, for audiences based in the US, India and East Asia.
Have taught Analytics to learners from various banks and companies.
Have a satellite team based in India to support questions that I might not be able to answer.

Developed several courses on Udemy - 6 courses with around 20k learners.

Available only in NYC (no remote option)

## Friday, September 7, 2018

### Python / SQL / VBA-Excel Private 1 on 1 Tutor NYC Analytics

New York Python SQL Bootcamp Coding Classes (Affordable & Cost-effective Machine Learning). Best free classes in NYC. SQL 101 & Python 101 classes. Big Data Science classes for beginners interested in Analytics & Data Science. Weekend part-time and full-time classes in Manhattan & Queens. 1 on 1 tutoring also available. Free weekend 2 hr class. Small group courses (2-3 attendees), free retakes and 1 on 1: Python 101, Python Data Science Immersive, Python for Data Analytics, VBA Macros Immersive, SQL 1 day class. Project and portfolio oriented, on weekends, plus free evening classes in NYC. Upload your portfolio to get a better job. Best Python class in NYC. FREE RETAKES.

--------------------------------------------------------------------------------------------------------
Python 101 Intro to Python
Create Azure Notebook Account (15 minutes)
Intro to common terminology (AWS, Jupyter, Azure Notebook)
Hello World Practice, Variables, data types, functions, loops
Questions (Intro to adv features Part 2)
Python 102
Numpy & Pandas
Importing from different sources, Cleaning Data & Handling Missing Data
Data Wrangling - Group by, Joins and Pivot
Python 103
Intro to common Datasets (15 minutes)
Visualization from Matplotlib
User defined Functions and in built commands for Wrangling
Applying all to your dummy dataset / online dataset
* Easy Revision from the notebook
-----------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Python 101 and Object Oriented Python Advanced Python 102 SQL Basics
Machine Learning Fundamentals Scikit Learn 101
EDA Charting using Matplotlib, Seaborn and Pyplot 101
Pandas for Analytics (SQL and Excel equivalence) 101
Regression and Logistic Regression - Python and the Math Behind 101
SVM, Stochastic Gradient Descent, Naive Bayes Classification
Decision Trees and Random Forest Ensemble Models 101
Unsupervised Learning Clustering K-means Neural Network 101
Dimension Reduction using PCA, Lasso and Ridge 101
Big Data Hadoop Spark MapReduce 101
Natural Language Processing 101
Web Scraping Python using BeautifulSoup and Selenium WebDriver 101
TensorFlow and Keras 101

Full day Course schedule:
Sunday Python Part 1 & Part 2
Monday: Blockchain
Tuesday: Pandas Data Analytics & Wrangling
Wed: Hadoop Big Data / SQL
Thursday: VBA Macro
Friday: Machine Learning
Saturday: Hadoop Big Data / SQL
https://www.meetup.com/New-York-Python-SQL-Bootcamp-Data-Science-Analytics/

http://blockchainainyc.com/

BAINYC Math Matters Python / SQL / VBA Classes in Queens, NYC
www.bainyc.com
1 on 1 Private Tutoring at \$29 per hour

Tutoring Venue:
Mathmatter Tutoring
3707 74 Street 3rd Fl Suite 7 · Jackson Heights, NY 11372
How to find us: Roosevelt Ave, Jackson Heights subway

Small group courses:

Python 101 \$49
3 hours
Print Hello World Azure Notebooks & Anaconda Book and Content Functions (Arguments and Return) Loops (For While) If else List/Dictionary

Intro to Python (Intermediate) \$174
9 hours in total, 3 sessions/days of 3 hrs
Print Hello World Azure Notebooks & Anaconda Book and Content Functions (Arguments and Return) Loops (For While) If else List/Dictionary
Nested Loops with if else List/Dictionary (JSON) Class Lambda Functions List Comprehension

Python Immersive \$499
35 hours in total, 5 sessions/days of 7 hours each

Print Hello World Azure Notebooks & Anaconda Book and Content Functions (Arguments and Return) Loops (For While) If else List/Dictionary
Nested Loops with if else List/Dictionary (JSON) Class Lambda Functions List Comprehension
File Handling Web Scraping Exception handling SQLite Python
Capstone Project for Github Portfolio

Python for Data Science \$499
35 hours in total 5 sessions/days of 7 hours each
Print Hello World Azure Notebooks & Anaconda Book and Content Functions (Arguments and Return) Loops (For While) If else List/Dictionary
Nested Loops with if else List/Dictionary (JSON) Class Lambda Functions List Comprehension
Pandas Plotting Time Series Data Cleaning
Capstone Project for Github Portfolio

Excel VBA 101 / SQL 101 \$49
3 hrs / 1 Session / Days

Introduction to Excel VBA / Introduction to SQL (Intermediate) \$174
9 hours, 3 sessions / days

Excel VBA Immersive / SQL Immersive (Advanced) \$499
35 Total hours, 5 Sessions / Days

Also available* at:
Practical Programming
Address: 115 W 30th St 5th fl, New York, NY 10001
Email me for prices.

To register for classes: https://www.meetup.com/New-York-Python-SQL-Bootcamp-Data-Science-Analytics/

Notebook:
https://notebooks.azure.com/shivgan3/libraries/PythonClassesNYCBootcamp

PPT:

Instructor
Shivgan Joshi
929 356 5046

## Tuesday, September 4, 2018

### QcFinance Indore India

How many times have you seen problems or opportunities for improvement in your workflow and promised to fix them if only you had the time, or if only you could train your employees in such expertise?

Unforeseen, undocumented, forgiven, hidden costs of issues like those below are bringing your valuations down:
1.    Poorly understood legacy frameworks, mostly in Excel, resting on unknown, unstated assumptions
2.    Slow process of calculations
3.    Unscalable Excel models that are difficult to maintain, improve, replicate or extend
4.    Increased pressure, requiring added reporting and enhanced auditing
5.    Lack of in-house research resources for SQL, Python and R (Python and R are both open source: free, with no extra burden)
6.    Over-reliance on novel, untested external software from vendors
QCFin provides solutions for such hurdles, working quickly, properly, and thoroughly to create more value in your employees and, ultimately, an increased valuation for your firm.

Our company, qcfinance.in, Indore, specializes in developing advanced modeling, simulation, data analysis and visualization software for clients. Using MATLAB (the numerical analysis suite from The MathWorks), R and Python, together with other technologies such as C++ and VBA, allows us to solve a wide range of design and implementation problems for our clients, including risk analysis in active and passive strategies.

Leveraging expert knowledge in areas such as computational finance, we have extensive experience in delivering customer-specific requirements for clients in several countries.

Get support in Research Writing, Software Development & Backtesting.

Besides this, we also provide training for CFA, FRM, CQF, BAT and other finance exams. We also provide consultancy on MATLAB, R, VBA, SQL, Python, MongoDB and SAS.

Sub topics in Excel VBA Bootcamp:
Basics of Excel: Vlookups, Hlookups, Index Match, Dependents, Data tables (1 way and 2 ways), Pivots, Charting, Filters, SQL integration, VBA coding, Address and indirect, Offset, Array functions, etc.
Applications: Regression, Histograms, Monte carlo simulation, rank correlation, dashboards, more.
Automation using VBA: Loops (for, do while, case), Recorder, Arrays and Matrices, if else, indexing, etc.

Automate and Debug Office Tasks using Excel VBA / Python

Tired of Excel and Access? Move to Python / SQL and, if needed, a little VBA.

Check out our portfolio on the website and YouTube channel.

PYTHON PANDAS FLASK MATPLOTLIB / VBA / SQL FOCUSED
*** Automatic emails sent using Excel, with the ability to send 100s at a time.
*** Python Pandas code for working with CSV files, to reduce the time spent on repetitive daily tasks.
*** Customized, automated, & streamlined reporting from small and large data sets using Pandas, Matplotlib and Flask websites in Python

FINANCE FOCUSED
*** Financial modeling and analysis
*** Quoting and pricing tools that incorporate the use of forms

SQL Tables
SQL Queries
SQL Power Pivots Reports
VBA (Visual Basic for Applications)
Importing and Exporting Data in SQL, VBA, Python seamlessly
Merging Data To MS Word from Python, SQL or Flask
Linked SQL Server Data / Python / PostGRESQL to Excel
Data Migration

Office in NYC and India.
Analytics company operating in India for over 8 years.
Consultant with experience in banks and hedge funds in India, Israel and the USA.

## Monday, August 27, 2018

### SQL Installation MySQL Workbench

----Start of MYSQL---
#switch to main user
su - joshi
sudo apt-get update
cd
sudo apt-get install mysql-server

sudo mysql_secure_installation

----Installing Workbench---

sudo mysql

\$sudo apt update && sudo apt upgrade
\$sudo apt install mysql-workbench
\$sudo mysql-workbench

----------------------------------

To run anytime

\$sudo mysql-workbench

## Saturday, July 21, 2018

### Python Data Science 101 Bootcamp (Beginners and Non Programmers) 6 hrs PAID \$65 Python Data Science Machine Learning Bootcamp NYC

https://www.meetup.com/New-York-Python-SQL-Bootcamp-Data-Science-Analytics/events/251782354/

Python Data Science Machine Learning Bootcamp NYC

The course is developed for an audience of non-programmers without a statistics background.
It consists of games, graphics, and examples to familiarize you with the terms used in Data Science.

Check out our PPT and Jupyter Notebook for 1st Session:

https://notebooks.azure.com/shivgan3/libraries/PythonClassesNYCBootcamp

Group size is 5.

This course is prerequisite for Part 2.

Part 1 / 2
Two-day intensive boot camp for Python Data Science enthusiasts.

Topics:
Introduction to Python
Foundations of programming: Python built-in Data types
Concept of mutability and theory of different Data structures
Control flow statements: If, Elif and Else
Definite and Indefinite loops: For and While loops
Writing user-defined functions in Python
Classes in Python
Read and write Text and CSV files with python
List comprehensions and Lambda
How to start using Python
Parsing information with Python
Practice Python to solve the real-world tasks

Skills that you will GAIN while working on the course are:

Python Programming Language
Statistical Hypothesis Testing
IPython
Hypothesis-testing
Matplotlib
Numpy
Pandas
Scipy
Python Lambdas
Python Regular Expressions

Collection of powerful, open-source, tools needed to analyze data and to conduct data science. Specifically, you’ll learn how to use:

python
jupyter anaconda notebooks
pandas
numpy
matplotlib
git
and many other tools.

We’ll cover what the following machine learning and data mining techniques are used for, each with a simple example in Python:

Regression analysis
K-Means Clustering
Principal Component Analysis
Train/Test and cross validation
Bayesian Methods
Decision Trees and Random Forests
Multivariate Regression
Multi-Level Models
Support Vector Machines
Reinforcement Learning
Collaborative Filtering
K-Nearest Neighbor
Ensemble Learning
Experimental Design and A/B Tests

Joshi

## Thursday, June 14, 2018

https://staff.brighton.ac.uk/is/Published%20Documents/Excel%20Formulae%20and%20Functions%20(PC)%20QRC.pdf

Takeaways
The workshop will include the following topics:
Build SQL Database in the Cloud (using Amazon Web Services)
Import and Export of Data
Create and Manage Users
Advanced Querying Techniques (e.g., case statements, extract, union)

Meta Classes

Abstract Class

Unit Testing Python

global optimization for spark

Decorators

Inheritance in Python

Object oriented techniques.
- Iterators, Generators and Closures.
- Exception handling.

__new__ vs __init__
dunder

https://docs.python.org/3/reference/datamodel.html

https://www.learnpython.org/en/Closures

decorators and functional programming

james powell python

david beazley generators

Below are all the questions that one should study for the pedantic interviews

Functional Programming

Notice that you can create functions at runtime, more or less like lambdas in C++. Here we iterate over a list, letting n take the values 1, 2 and 3:

    functions = []
    for n in [1, 2, 3]:
        def func(x, n=n):  # n=n binds the current value of n
            return n * x
        functions.append(func)

On each iteration we build a function named func, which takes a value and multiplies it by n, and append it to the functions list. The default argument n=n matters: without it, every stored function would look up n only when called and would use its final value, 3. With the functions stored, we can iterate over the list to call them:

    [function(2) for function in functions]  # [2, 4, 6]

For decorators:
higher order function in Python

Callback:

Pass-by reference vs Pass by value:

Q3. What is the difference between deep and shallow copy?
Ans: A shallow copy creates a new object but fills it with references to the same nested objects as the original. Only the outer container is duplicated, so a change made to a mutable nested object through either copy is visible through the other. A shallow copy is faster and uses less memory, and the difference grows with the size of the data.

A deep copy creates a new object and then recursively copies every object it contains, so no nested objects are shared. Changes made to the original won’t affect the deep copy, and vice versa. Deep copying makes execution slower because a separate copy is made of every nested object.
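
A short sketch with the standard `copy` module shows the difference. The dictionary contents are made up for the example.

```python
import copy

original = {"name": "portfolio", "prices": [100, 101]}
shallow = copy.copy(original)    # new dict, same nested list object
deep = copy.deepcopy(original)   # new dict and a new nested list

original["prices"].append(102)   # mutate the shared nested list
original["name"] = "renamed"     # rebind a key in the original only

print(shallow["prices"])  # sees the appended value
print(deep["prices"])     # unaffected
print(shallow["name"])    # unaffected: rebinding is not shared
```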

How is Multithreading achieved in Python?
Ans:

Python has a multi-threading package but if you want to multi-thread to speed your code up, then it’s usually not a good idea to use it.
Python has a construct called the Global Interpreter Lock (GIL). The GIL makes sure that only one of your ‘threads’ can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread.
This happens very quickly so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.
All this GIL passing adds overhead to execution. This means that if you want to make your code run faster then using the threading package often isn’t a good idea.

How is memory managed in Python?
Ans:

Memory in Python is managed through a private heap. All Python objects and data structures are located in this private heap. The programmer does not access it directly; the Python interpreter takes care of it instead.
The allocation of heap space for Python objects is done by Python’s memory manager. The core API gives the programmer access to some tools for working with it.
Python also has a built-in garbage collector, which reclaims unused memory so that it can be made available to the heap again.

Explain Inheritance in Python with an example.
Ans: Inheritance allows one class to gain all the members (attributes and methods) of another class. Inheritance provides code reusability and makes it easier to create and maintain an application. The class we inherit from is called the super-class, and the class that inherits is called the derived / child class.

There are different types of inheritance supported by Python:

Single inheritance – where a derived class acquires the members of a single super class.
Multi-level inheritance – a derived class d1 inherits from base class base1, and d2 in turn inherits from d1.
Hierarchical inheritance – from one base class you can inherit any number of child classes.
Multiple inheritance – a derived class inherits from more than one base class.
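
The four styles can be sketched in a few lines. All class names here (`Base`, `Child`, `GrandChild`, `Sibling`, `Mixin`, `Combined`) are hypothetical and chosen only for illustration.

```python
class Base:
    def describe(self):
        return "base"


class Child(Base):                 # single inheritance
    pass


class GrandChild(Child):           # multi-level: GrandChild -> Child -> Base
    def describe(self):
        return "grandchild of " + super().describe()


class Sibling(Base):               # hierarchical: second child of Base
    pass


class Mixin:
    def extra(self):
        return "extra"


class Combined(Child, Mixin):      # multiple inheritance
    pass


print(Child().describe())
print(GrandChild().describe())
print(Combined().describe(), Combined().extra())
```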

List out the inheritance styles in Django.
Ans: In Django, there are three possible inheritance styles:

Abstract Base Classes: This style is used when you only want the parent class to hold information that you don’t want to type out for each child model.
Multi-table Inheritance: This style is used if you are sub-classing an existing model and need each model to have its own database table.
Proxy models: You can use this style if you only want to modify the Python-level behavior of the model, without changing the model’s fields.

Explain the use of decorators.
Ans: Decorators in Python are used to modify or inject code in functions or classes. Using decorators, you can wrap a class or function method call so that a piece of code can be executed before or after the execution of the original code. Decorators can be used to check for permissions, modify or track the arguments passed to a method, logging the calls to a specific method, etc.
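
A minimal sketch of a decorator that tracks the arguments passed to a function. The names `log_calls` and `add` are made up for the example.

```python
import functools


def log_calls(func):
    """Record the arguments of every call, then run the original function."""
    @functools.wraps(func)          # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))   # code that runs before the call
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper


@log_calls
def add(a, b):
    return a + b


result = add(2, b=3)
print(result, add.calls, add.__name__)
```

Writing `@log_calls` above `add` is shorthand for `add = log_calls(add)`.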

Question 4
Python and multi-threading. Is it a good idea? List some ways to get some Python code to run in a parallel way.

There are reasons to use Python's threading package. If you want to run some things simultaneously, and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some IO) then it could make a lot of sense. But the threading library won't let you use extra CPU cores.

Parallel execution can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (e.g. Spark or Hadoop), or to some code that your Python code calls (e.g. you could have your Python code call a C function that does the expensive multi-threaded stuff).
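
As a sketch of the case where threads do help, the example below overlaps simulated I/O waits using `concurrent.futures.ThreadPoolExecutor` from the standard library. The function `fake_io` and the 0.2 second delay are assumptions standing in for a real network or disk call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    """Stand-in for an I/O call; sleeping releases the GIL while waiting."""
    time.sleep(delay)
    return task_id * 2


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start

# Run one after another, the four calls would wait about 0.8 seconds in total
print(results, round(elapsed, 2))
```

For CPU-bound work the same module offers `ProcessPoolExecutor`, which uses separate processes and so can use extra CPU cores.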

## Question 10

Consider the following code, what will it output?
```python
class A(object):
    def go(self):
        print("go A go!")
    def stop(self):
        print("stop A stop!")
    def pause(self):
        raise Exception("Not Implemented")

class B(A):
    def go(self):
        super(B, self).go()
        print("go B go!")

class C(A):
    def go(self):
        super(C, self).go()
        print("go C go!")
    def stop(self):
        super(C, self).stop()
        print("stop C stop!")

class D(B,C):
    def go(self):
        super(D, self).go()
        print("go D go!")
    def stop(self):
        super(D, self).stop()
        print("stop D stop!")
    def pause(self):
        print("wait D wait!")

class E(B,C): pass

a = A()
b = B()
c = C()
d = D()
e = E()

# specify output from here onwards

a.go()
b.go()
c.go()
d.go()
e.go()

a.stop()
b.stop()
c.stop()
d.stop()
e.stop()

a.pause()
b.pause()
c.pause()
d.pause()
e.pause()
```

The output is specified in the comments in the segment below:
```python
a.go()
# go A go!

b.go()
# go A go!
# go B go!

c.go()
# go A go!
# go C go!

d.go()
# go A go!
# go C go!
# go B go!
# go D go!

e.go()
# go A go!
# go C go!
# go B go!

a.stop()
# stop A stop!

b.stop()
# stop A stop!

c.stop()
# stop A stop!
# stop C stop!

d.stop()
# stop A stop!
# stop C stop!
# stop D stop!

e.stop()
# stop A stop!
# stop C stop!

a.pause()
# ... Exception: Not Implemented

b.pause()
# ... Exception: Not Implemented

c.pause()
# ... Exception: Not Implemented

d.pause()
# wait D wait!

e.pause()
# ... Exception: Not Implemented
```

### Why do we care?

Because OO programming is really, really important. Really. Answering this question shows your understanding of inheritance and the use of Python's `super` function. Most of the time the order of resolution doesn't matter. Sometimes it does; that depends on your application.
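
The order of resolution can be inspected directly through the `__mro__` attribute. The sketch below uses stripped-down classes with the same inheritance shape as `A`, `B`, `C` and `D` in the question.

```python
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

# super() walks this list from left to right
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)  # ['D', 'B', 'C', 'A', 'object']
```

This is why a `super()` call made inside `B` reaches `C` before `A` when the object is a `D` or an `E`.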

## Question 11

Consider the following code, what will it output?
```python
class Node(object):
    def __init__(self,sName):
        self._lChildren = []
        self.sName = sName
    def __repr__(self):
        return "<Node '{}'>".format(self.sName)
    def append(self,*args,**kwargs):
        self._lChildren.append(*args,**kwargs)
    def print_all_1(self):
        print(self)
        for oChild in self._lChildren:
            oChild.print_all_1()
    def print_all_2(self):
        def gen(o):
            lAll = [o,]
            while lAll:
                oNext = lAll.pop(0)
                lAll.extend(oNext._lChildren)
                yield oNext
        for oNode in gen(self):
            print(oNode)

oRoot = Node("root")
oChild1 = Node("child1")
oChild2 = Node("child2")
oChild3 = Node("child3")
oChild4 = Node("child4")
oChild5 = Node("child5")
oChild6 = Node("child6")
oChild7 = Node("child7")
oChild8 = Node("child8")
oChild9 = Node("child9")
oChild10 = Node("child10")

oRoot.append(oChild1)
oRoot.append(oChild2)
oRoot.append(oChild3)
oChild1.append(oChild4)
oChild1.append(oChild5)
oChild2.append(oChild6)
oChild4.append(oChild7)
oChild3.append(oChild8)
oChild3.append(oChild9)
oChild6.append(oChild10)

# specify output from here onwards

oRoot.print_all_1()
oRoot.print_all_2()
```

`oRoot.print_all_1()` prints:
```
<Node 'root'>
<Node 'child1'>
<Node 'child4'>
<Node 'child7'>
<Node 'child5'>
<Node 'child2'>
<Node 'child6'>
<Node 'child10'>
<Node 'child3'>
<Node 'child8'>
<Node 'child9'>
```

`oRoot.print_all_2()` prints:
```
<Node 'root'>
<Node 'child1'>
<Node 'child2'>
<Node 'child3'>
<Node 'child4'>
<Node 'child5'>
<Node 'child6'>
<Node 'child8'>
<Node 'child9'>
<Node 'child7'>
<Node 'child10'>
```

### Why do we care?

Because composition and object construction is what objects are all about. Objects are composed of stuff and they need to be initialised somehow. This also ties together some ideas about recursion and the use of generators.
Generators are great. You could have achieved similar functionality to `print_all_2` by just constructing a big long list and then printing its contents. One of the nice things about generators is that they don't need to take up much space in memory.
It is also worth pointing out that `print_all_1` traverses the tree in a depth-first manner, while `print_all_2` is breadth-first. Make sure you understand those terms. Sometimes one kind of traversal is more appropriate than the other. But that depends very much on your application.

## Tuesday, June 12, 2018

### Big Data hadoop part 1

Bring something from hadoop fs and then do spark

https://spark.apache.org/examples.html

Spark SQL

Spark Streaming to process a live data stream

Python Map reduce program:

Hive create tables

Impala, Partitioning in Hive, re-doing partitioning local and external, file formats - tab

Sqoop and Hive

The Spark Streaming API can consume from sources like Kafka, Flume and Twitter, to name a few. It can then apply transformations to the data to get the desired result, which can be pushed further downstream.

Connecting Kafka with Streaming Spark API

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MaxTemperatureMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

      private static final int MISSING = 9999;

      @Override
      public void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {

        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
          airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
          airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        // Emit (year, temperature) only for valid readings
        if (airTemperature != MISSING && quality.matches("[01459]")) {
          context.write(new Text(year), new IntWritable(airTemperature));
        }
      }
    }
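
For the Python MapReduce item above, here is a rough pure-Python sketch of the same mapper logic together with a matching max-temperature reducer. It runs in a single process and only imitates the map and reduce phases; the names `map_record`, `reduce_max` and `make_record` are hypothetical, and the sample records are synthetic fixed-width lines built for the example.

```python
MISSING = 9999


def map_record(line):
    """Yield (year, temperature) for a valid fixed-width weather record."""
    year = line[15:19]
    if line[87] == "+":
        temperature = int(line[88:92])
    else:
        temperature = int(line[87:92])
    quality = line[92]
    if temperature != MISSING and quality in "01459":
        yield year, temperature


def reduce_max(pairs):
    """Keep the maximum temperature seen for each year."""
    best = {}
    for year, temperature in pairs:
        if year not in best or temperature > best[year]:
            best[year] = temperature
    return best


def make_record(year, temperature, quality="1"):
    """Hypothetical helper: build a synthetic line with the expected offsets."""
    return "0" * 15 + year + "0" * 68 + f"{temperature:+05d}" + quality


records = [
    make_record("1950", 22),
    make_record("1950", -11),
    make_record("1950", 9999),        # missing reading, skipped
    make_record("1949", 111),
    make_record("1949", 300, "2"),    # bad quality code, skipped
]
pairs = [pair for line in records for pair in map_record(line)]
max_by_year = reduce_max(pairs)
print(max_by_year)
```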

## Monday, May 21, 2018

### SQL Commands

GRANT SELECT ON OBJECT::dbo.Table1 TO Kalyan;
GRANT INSERT ON OBJECT::dbo.Table1 TO Kalyan;
GRANT UPDATE ON OBJECT::dbo.Table1 TO Kalyan;
GRANT DELETE ON OBJECT::dbo.Table1 TO Kalyan;

## Wednesday, April 18, 2018

### Advanced Topics in Python Programming (Not Scripting)

Object Oriented and Functional Programming Python

• Different type of Inheritance in Python
• Generators, Iterators, Decorators, and Context Managers
• __new__ vs __init__
• Dunder
• global optimization for spark
• Python Generators and Iterator Protocol
• Python Meta-programming
• Python object model (at least a passing understanding of metaclasses, slots, and descriptors, as well as how inheritance works), bonus points for recent additions like __prepare__ and __init_subclass__
• Python's ABCs and inferred types (i.e. Iterable, Iterator, Generator, etc.)
• Standard library: math, itertools, functools, random, collections, logging, sys, os, and threading/multiprocessing/asyncio

• Meta Classes
A metaclass is the class of a class. Just as a class defines how an instance of the class behaves, a metaclass defines how a class behaves. A class is an instance of a metaclass.
A metaclass is most commonly used as a class factory. Just as you create an instance of a class by calling the class, Python creates a new class (when it executes the 'class' statement) by calling the metaclass.

Ref: https://stackoverflow.com/questions/100003/what-are-metaclasses-in-python
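
A small sketch of both ideas: calling `type` directly as a class factory, and a custom metaclass that runs code each time a class is created. The names `RegistryMeta`, `Plugin`, `CsvPlugin` and `Dynamic` are made up for the example.

```python
class RegistryMeta(type):
    """Metaclass that records every class created with it."""
    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.registry[name] = cls      # runs when the 'class' statement executes
        return cls


class Plugin(metaclass=RegistryMeta):
    pass


class CsvPlugin(Plugin):               # subclasses inherit the metaclass
    pass


# The default metaclass is `type`; calling it builds a class at runtime
Dynamic = type("Dynamic", (object,), {"answer": 42})

print(type(Plugin).__name__, sorted(RegistryMeta.registry))
print(type(Dynamic).__name__, Dynamic().answer)
```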

• Abstract Class
Abstract classes: force subclasses to implement certain methods. An abstract class can contain abstract methods, i.e. methods declared without an implementation. Objects cannot be created from an abstract class directly; a subclass must implement all the abstract methods before it can be instantiated.
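
A minimal sketch using the standard `abc` module. The class names `Shape` and `Square` are illustrative only.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Subclasses must provide an implementation."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


try:
    Shape()                 # abstract class: instantiation fails
    instantiated = True
except TypeError:
    instantiated = False

print(instantiated, Square(3).area())
```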

• Generators, Iterators, Decorators, and Context Managers
A decorator is a function that takes a function as an argument and returns a function as a return value.
An iterator implements `__iter__()` and `__next__()`; calling an_iterator.__iter__() returns the iterator itself.
itertools is a collection of utilities that make it easy to build an iterator that iterates over sequences in various common ways.
Generators give you the iterator immediately: no access to the underlying data ... if it even exists.
Context Managers: You can encapsulate the setup, error handling and teardown of resources in a few simple steps. The key is to use the with statement.
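
The sketch below puts a generator and a context manager side by side. The names `countdown` and `managed_resource` and the `events` list are hypothetical; `contextlib.contextmanager` is the standard-library helper for writing a context manager as a generator function.

```python
from contextlib import contextmanager


def countdown(n):
    """Generator: values are produced lazily, one per next() call."""
    while n > 0:
        yield n
        n -= 1


events = []


@contextmanager
def managed_resource(name):
    events.append("open " + name)        # setup
    try:
        yield name.upper()               # value bound by 'as'
    finally:
        events.append("close " + name)   # teardown, runs even on error


gen = countdown(3)
is_own_iterator = iter(gen) is gen       # a generator is its own iterator
first = next(gen)
rest = list(gen)

with managed_resource("db") as handle:
    events.append("use " + handle)

print(first, rest, events)
```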

• Unit Testing Python
The unit testing framework in Python's standard library is called unittest. Its principles are easily portable to other frameworks. Common choices are:

1. unittest
2. nose or nose2
3. pytest

pytest supports execution of unittest test cases. The real advantage of pytest comes by writing pytest test cases. pytest test cases are a series of functions in a Python file starting with the name test_.
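
To compare the two styles, here is a small sketch with a made-up function `slugify`, a `unittest.TestCase` for it, and a pytest-style test function. The suite is run programmatically so the snippet also works outside a test runner; with pytest installed, both the class and the `test_` function would normally be collected from a file named like `test_*.py`.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lower-case, hyphen-separated words."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_extra_whitespace(self):
        self.assertEqual(slugify("  Python   101 "), "python-101")


def test_slugify_pytest_style():
    # pytest-style test: a plain function with a bare assert
    assert slugify("SQL Basics") == "sql-basics"


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```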

• How is Multithreading achieved in Python? Why is it a bad idea?

• Why a list comprehension is faster than a for loop (which really is to say understand how bytecode is generated, at high level)

https://stackoverflow.com/questions/22108488/are-list-comprehensions-and-functional-functions-faster-than-for-loops

Reference: Advanced Python 3 Programming Techniques By Mark Summerfield

----------------------------------------------------------------------------

Design Pattern - Using decorators, constructors, classes and data structures in Python
Using Flask framework in the same way as React using the same folder config and other settings. In place of JS we will use Python
Functional Programming in Python and passing on functions in a function. More list comprehensions.

__init__

single underscore vs double underscore
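
A short sketch of the naming conventions, with a made-up `Account` class. A single leading underscore is only a convention for "internal" names, a double leading underscore inside a class body triggers name mangling, and double underscores on both sides mark special methods such as `__init__`.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner        # public attribute
        self._balance = balance   # single underscore: internal by convention only
        self.__pin = "0000"       # double underscore: stored as _Account__pin

    def __repr__(self):           # underscores on both sides: special method
        return f"Account({self.owner!r})"


acct = Account("alice", 100)
print(acct._balance)              # still accessible; the underscore is a hint
print(hasattr(acct, "__pin"))     # the unmangled name does not exist
print(acct._Account__pin)         # the mangled name does
```

Mangling is mainly meant to avoid accidental name clashes in subclasses, not to provide real privacy.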

Python Generators and Iterator Protocol
Python Meta-programming
Python Descriptors
Python Decorators (class and method based)
Python Buffering Protocol
Python Comprehensions
Python GIL and multiprocessing and multithreading
Python WSGI protocol
Python Context Managers
Python Design Patterns
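
As one concrete item from the list above, a minimal sketch of the WSGI protocol using only the standard library's `wsgiref`. The path `/ping` and the local `start_response` recorder are illustrative. A WSGI application is a callable that takes `environ` and `start_response` and returns an iterable of bytes.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI app: report the requested path as plain text."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


# Call the app directly, the way a WSGI server would, with a fake request.
environ = {}
setup_testing_defaults(environ)  # fills in the required WSGI keys
environ["PATH_INFO"] = "/ping"

captured = {}


def start_response(status, response_headers, exc_info=None):
    """Record what the app reports instead of writing to a socket."""
    captured["status"] = status
    captured["headers"] = response_headers


response_body = b"".join(application(environ, start_response))
print(captured["status"], response_body)
```

To serve it for real, `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` is a development-only option; Flask applications are WSGI callables of the same shape.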

System Programming (pipes, threads, forks etc.)
Graph Theory (pygraph, Networkx etc)
Polynomial manipulation using python
Linguistics (FSM, Turing machines etc)
Numerical Computations with Python
Creating Musical Scores With Python
Databases with Python

Third party libraries aside here are some:
metaclasses
writing decorators, generators, iterators
writing context managers
C/c++ extensions
Multiprocessing

• Understand the python object model (at least a passing understanding of metaclasses, slots, and descriptors, as well as how inheritance works), bonus points for recent additions like `__prepare__` and `__init_subclass__`, but also simpler things like when `__new__` is useful
• Understand python's ABCs and inferred types (ie. Iterable, Iterator, Generator, etc.)
• Understand the c-level data model (ie. at a high level how an int, a list, and a dict are laid out in memory), bonus points if they are actually correct about the way a dict works in cpython, but simply understanding how an unoptimized dict would work is fine.
• Know why a list comprehension is faster than a for loop (which really is to say understand how bytecode is generated, at high level)
• Advanced unittesting. Mocks, patches, possibly a more advanced library like pytest
• Working knowledge of recent features (async/await, type hints)
• A decent knowledge of the important parts of the standard library: `math`, `itertools`, `functools`, `random`, `collections`, `logging`, `sys`, `os`, and `threading`/`multiprocessing`/`asyncio` (I realize these aren't the same, but still). That is, I'd expect a senior dev to know what `contextlib.contextmanager`, `functools.wraps`, and `itertools.chain` were, and when/why one might want to use them. No need to know every function, but where to look at least.
• A decent knowledge of some non-standard library modules in the domain. This would highly depend on the field, but scipy stack, django/flask/sqla/jinja2, etc.
• Know at least one sane way to manage environments. This could be a bare venv, or it could be a docker based solution, or a combination, or pipenv, but something
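
A compact sketch touching three object-model items from the list above: a descriptor with `__set_name__`, `__init_subclass__` as a lightweight alternative to a metaclass, and `__slots__`. The classes `Positive`, `Shape`, and `Square` are invented for the example.

```python
class Positive:
    """Descriptor: validates a numeric attribute on assignment."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage_name, value)


class Shape:
    __slots__ = ()  # needed so subclasses with __slots__ get no __dict__
    registry = {}

    def __init_subclass__(cls, **kwargs):
        # Runs for every subclass definition; registers it by name.
        super().__init_subclass__(**kwargs)
        Shape.registry[cls.__name__.lower()] = cls


class Square(Shape):
    __slots__ = ("_side",)  # fixed attribute set, no per-instance __dict__
    side = Positive()

    def __init__(self, side):
        self.side = side  # goes through Positive.__set__


sq = Square(3)
print(sq.side, Shape.registry)
```

Assigning `sq.side = -1` raises `ValueError` from the descriptor, and assigning an undeclared attribute such as `sq.color` raises `AttributeError` because of `__slots__`.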
Coroutines (not just generators)
Decorators
C/Cython extensions
Data structures
Ability to debug and profile code
Tests
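
For the mocks and patches item, a minimal sketch with `unittest.mock` from the standard library. `roll_dice` is a hypothetical function under test. `mock.patch` swaps `random.randint` for a mock only inside the `with` block, so the test is deterministic.

```python
import random
import unittest
from unittest import mock


def roll_dice():
    """Hypothetical code under test: depends on random.randint."""
    return random.randint(1, 6) + random.randint(1, 6)


class RollDiceTests(unittest.TestCase):
    def test_sum_of_two_rolls(self):
        # side_effect supplies one return value per call, in order.
        with mock.patch("random.randint", side_effect=[2, 5]) as fake_randint:
            self.assertEqual(roll_dice(), 7)
        self.assertEqual(fake_randint.call_count, 2)
        fake_randint.assert_called_with(1, 6)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RollDiceTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The patch target is the name as the code under test looks it up. Here `roll_dice` reads `random.randint` through the module at call time, so patching `"random.randint"` is enough.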

Section 1: What This Short Cut Covers
Section 2: Branching Using Dictionaries
Section 3: Generator Expressions and Functions
Section 4: Dynamic Code Execution
Section 5: Local and Recursive Functions
Section 6: Function and Method Decorators
Section 7: Function Annotations
Section 8: Controlling Attribute Access
Section 9: Functors
Section 10: Context Managers
Section 11: Descriptors
Section 12: Class Decorators
Section 13: Abstract Base Classes
Section 14: Multiple Inheritance
Section 15: Metaclasses
Section 16: Functional-Style Programming
Section 17: Descriptors with Class Decorators
Section 18: About the Author

https://python.swaroopch.com/oop.html
https://jakevdp.github.io/blog/2012/12/01/a-primer-on-python-metaclasses/
http://blog.thedigitalcatonline.com/blog/2014/10/14/decorators-and-metaclasses/

### SQL security

--authentication and is associated
--with a windows security group
--access views to verify that
SELECT * FROM sys.server_principals
--create a login for a specific windows user
--create database users and database roles
--first activate the database
USE touropharmacy
--find out who is connected now
SELECT SUSER_NAME()
--set up a user associated with
CREATE USER [TPDoctors] FOR login [TC\TP_Doctors]
--set up a user associated with
CREATE USER [MD1] FOR login [TC\md1]
--execute as user = 'MD1'
--create a server role
USE master
GO
--create server role
CREATE server role [dbOnlyCreator]
--view the types of permissions available on the server level
SELECT * FROM sys.fn_builtin_permissions('SERVER')
--view the permissions granted to dbcreator
EXEC sp_srvrolepermission @srvrolename = 'dbcreator'
--assign a server level permission to a login
GRANT CREATE any DATABASE TO dbonlycreator
--view the explicit permissions granted to a server login
SELECT *
FROM sys.server_principals pr
INNER JOIN sys.server_permissions per
ON pr.principal_id = per.grantee_principal_id
USE touropharmacy
--create a database role
CREATE role doctorrole
--assign database level permission to doctor role
SELECT * FROM sys.fn_builtin_permissions('DATABASE')
GRANT SELECT TO doctorrole

--assign schema level permission to doctor role
DENY SELECT ON SCHEMA::sales TO doctorrole
--assign table level permission
DENY SELECT ON hr.job TO doctorrole
--assign object level permission (here restricted to a single column)
DENY SELECT ON object::hr.physician(dr_licenseid) TO doctorrole
--add doctor user as a member of DoctorRole
ALTER role doctorrole ADD member tpdoctors

GO
DENY SELECT ON [Production].[ScrapReason] ([ModifiedDate]) TO [productionofficer.awuser]
GO
GRANT SELECT ON [Production].[ScrapReason] ([Name]) TO [productionofficer.awuser]
GO
DENY SELECT ON [Production].[ScrapReason] ([ScrapReasonID]) TO [productionofficer.awuser]
GO

-- list permissions granted to database roles (p.type = 'R')
SELECT DB_NAME() AS 'DBName'
,p.[name] AS 'PrincipalName'
,p.[type_desc] AS 'PrincipalType'
,dbp.permission_name as 'PermissionName'
,p2.[name] AS 'GrantedBy'
,dbp.[state_desc]
,so.[Name] AS 'ObjectName'
,so.[type_desc] AS 'ObjectType'
FROM [sys].[database_permissions] dbp
LEFT JOIN [sys].[objects] so
ON dbp.[major_id] = so.[object_id]
LEFT JOIN [sys].[database_principals] p
ON dbp.[grantee_principal_id] = p.[principal_id]
LEFT JOIN [sys].[database_principals] p2
ON dbp.[grantor_principal_id] = p2.[principal_id]
WHERE p.type = 'R'