Tuesday, December 26, 2017

Deep Learning Skills / Data Science

Check out my Data Science Bootcamp on: http://www.qcfinance.in/python-for-data-science-machine-learning/

PDF of Pricing and Outline: http://qcfinance.in/wp-content/uploads/2018/06/Data-Science-Course-Curriculum-v3.pdf

Programming languages (Python, R, Lua, Scala …) and multiple frameworks and technologies (TensorFlow, Torch, Hadoop, Spark, RDBMS …) to support the modeling requirements


Deep learning, other AI, natural language processing, data mining, information theory, and optimization


Python, R, Lua, Scala, C++


Major deep learning libraries: TensorFlow, Torch, DeepLearning4J


GPU (CUDA), ASIC, or FPGA


Distributed systems (e.g., Spark, Hadoop, Ignite …)


Big data visualization

Substantial programming experience with almost all of the following: SAS (STAT, macros, EM), R, H2O, Python, Spark, SQL, and other Hadoop tools. Exposure to GitHub.
Modeling techniques such as linear regression, logistic regression, survival analysis, GLM, tree models (Random Forests and GBM), cluster analysis, principal components, feature creation, and validation. Strong expertise in regularization techniques (Ridge, Lasso, elastic nets), variable selection techniques, feature creation (transformation, binning, high-level categorical reduction, etc.), and validation (hold-outs, CV, bootstrap).
Database systems (Oracle, Hadoop, etc.), ETL/data lineage software (Informatica, Talend, Ab Initio)

Data visualization (e.g. R Shiny, Spotfire, Tableau)

AWS ecosystem: experience with S3, EC2, EMR, Lambda, Redshift

Data pipelines: Airflow, Luigi, Talend, or AWS Data Pipeline

APIs: Google, YouTube, Facebook, Twitter, or OAuth

Version control (GitHub, Stash, etc.)

http://qcfinance.in/Data%20Science%20Course%20Curriculum%20(1).pdf


Sunday, December 24, 2017

MS Excel VBA Data Analytics-101 [2 hrs] Instructor Led, In-Person Training New York - VBA code for loops play

Details
# About This Meetup:
Have you ever felt limited to doing your calculations in Excel cells? Do you want to harness the power of Excel Visual Basic for Applications (VBA)? If this sounds like you, this course is for you. This meetup is for Excel beginners and non-programmers and is all about Excel VBA.

# Key Takeaways:
* Referencing
* Shortcuts
* Dragging
* IF command
* Locking
* SUMPRODUCT
* VLOOKUP
* HLOOKUP
* INDEX/MATCH
* OFFSET
* Find Dependents
* Complex example mixing all commands
* Matrix multiplication
* Array formulas
* Handling big files

Pre-requisites:
Please bring your own laptop.

About the Speaker:
Shivgan Joshi
Lead Instructor
# 929 256 5046

** Payment Policy: We only accept payment on site and before the class. We accept payment by cash, Venmo, and PayPal (+5). **

If you have already attended session 101, take a look at what we offer in session 102.
* Correlation
* Regression
* Linear modeling, Goal Seek
* Optimization
* Internal Rate of Return
* LINEST command
* Graphing
* Conditional formatting
* Pivot tables
* Monte Carlo intro
* Min, Max, Bin
* Histograms
* Rank
* Spearman correlation
* Frequency
* Error handling
* COUNTIFS
* ISSomething
* Calculation modes
* Data tables: sensitivity, 2-way, and 3-way data tables
* VLOOKUPs with data tables
* Monte Carlo using data tables






'Option Explicit

Sub Sample()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Range("G2").Value = Now

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    For i = 1 To 4: For j = 1 To 4
    For k = 1 To 8: For l = 1 To 12
        Range("G" & lastrow).Value = Range("A" & i).Value & "/" & _
                                     Range("B" & j).Value & "/" & _
                                     Range("C" & k).Value & "/" & _
                                     Range("D" & l).Value
        lastrow = lastrow + 1
        CountComb = CountComb + 1
    Next: Next
    Next: Next

    Range("G1").Value = CountComb
    Range("G3").Value = Now

    Application.ScreenUpdating = True
End Sub
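
Sample writes one cell per combination. A common alternative is to build the results in memory and write them to the sheet in a single assignment. The sketch below is not part of the original macro: the name SampleArray and its variables are made up for illustration, and it assumes the same layout as Sample (inputs in A1:A4, B1:B4, C1:C8 and D1:D12 of the active sheet, results from G6 down).

```vba
Sub SampleArray()
    ' Illustrative sketch: SampleArray and its variable names are hypothetical.
    ' Assumes the same input layout as Sample on the active sheet.
    Dim a As Variant, b As Variant, c As Variant, d As Variant
    Dim result() As Variant
    Dim i As Long, j As Long, k As Long, l As Long, n As Long

    ' Read each input column once; Range.Value returns a 1-based 2-D array.
    a = Range("A1:A4").Value
    b = Range("B1:B4").Value
    c = Range("C1:C8").Value
    d = Range("D1:D12").Value

    ' One row per combination: 4 * 4 * 8 * 12 rows, one column.
    ReDim result(1 To 4 * 4 * 8 * 12, 1 To 1)

    For i = 1 To 4
        For j = 1 To 4
            For k = 1 To 8
                For l = 1 To 12
                    n = n + 1
                    result(n, 1) = a(i, 1) & "/" & b(j, 1) & "/" & _
                                   c(k, 1) & "/" & d(l, 1)
                Next l
            Next k
        Next j
    Next i

    ' Write every combination to the sheet in a single assignment.
    Range("G6").Resize(n, 1).Value = result
    Range("G1").Value = n
End Sub
```

Comparing the two timestamps that Sample writes to G2 and G3 against a similar pair added around this version would show whether the single write helps on a given workbook.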



Sub Sample2()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    For i = 1 To 4: For j = 1 To 4
    For k = 1 To 8: For l = 1 To 12

        ' Write each loop counter into its own column (T to W)
        Cells(i, 20) = i
        Cells(j, 21) = j
        Cells(k, 22) = k
        Cells(l, 23) = l
        lastrow = lastrow + 1
        CountComb = CountComb + 1

        ' Track the running row and combination counters (columns X and Y)
        Cells(lastrow, 24) = lastrow
        Cells(CountComb, 25) = CountComb

    Next: Next
    Next: Next

    Application.ScreenUpdating = True
End Sub

Sub Sample3()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    ' Same loops as Sample2, written one per line with explicit Next counters
    For i = 1 To 4
        For j = 1 To 4
            For k = 1 To 8
                For l = 1 To 12

                    Cells(i, 20) = i
                    Cells(j, 21) = j
                    Cells(k, 22) = k
                    Cells(l, 23) = l
                    lastrow = lastrow + 1
                    CountComb = CountComb + 1

                    Cells(lastrow, 24) = lastrow
                    Cells(CountComb, 25) = CountComb

                Next l
            Next k
        Next j
    Next i

    Application.ScreenUpdating = True
End Sub

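Sub playarray()

    Dim myArray As Variant
    Dim myThirdColumn As Variant

    ' Assumption: the data sits in A1:E10 of the active sheet, as in Test below
    myArray = Range("A1:E10").Value

    ' Slice the third column out of the array without a loop
    myThirdColumn = Application.Index(myArray, , 3)

End Sub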

' https://usefulgyaan.wordpress.com/2013/06/12/vba-trick-of-the-week-slicing-an-array-without-loop-application-index/

Sub Test()

    Dim varArray() As Variant
    Dim varTemp() As Variant
    Dim myRng As Range

    'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

    Set myRng = Worksheets("SheetA").Range("A1:E10")
    varArray = myRng.Value
    varTemp = Application.Index(varArray, , 2)
    'varTemp = Application.Index(varArray, Array(2, 3), 0)
    'varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))

    MsgBox UBound(varTemp) - LBound(varTemp) + 1
    'MsgBox varArray(1, 1)

End Sub


Sub Test2()

    Dim varArray() As Variant
    Dim varTemp() As Variant
    Dim varTemp2 As Variant
    Dim myRng As Range

    'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

    Set myRng = Worksheets("SheetA").Range("A1:Z10")
    varArray = myRng.Value
    varTemp = Application.Index(varArray, 3)        ' third row
    varTemp2 = Application.Index(varArray, , 3)     ' third column
    'varTemp = Application.Index(varArray, Array(2, 3), 0)
    'varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))

    'MsgBox UBound(varTemp) - LBound(varTemp) + 1
    'MsgBox varArray(1, 1)
    'MsgBox UBound(varTemp2) - LBound(varTemp2) + 1
    MsgBox varTemp2(10, 1)
    ' Arrays read from a range or returned by Application.Index start at 1

End Sub



''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Sub Test3()

    Dim varArray() As Variant
    Dim varTemp() As Variant
    Dim myRng As Range

    'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

    Set myRng = Worksheets("SheetA").Range("A1:Z10")
    varArray = myRng.Value
    varTemp = Application.Index(varArray, Array(1, 2))

    'first two row elements
    'varTemp2 = Application.Index(varArray, , 3)
    'varTemp = Application.Index(varArray, Array(2, 3), 0)
    'varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))

    'MsgBox Array(1, 2)(0)
    MsgBox varTemp(1)
    ' Array(1, 2) itself is indexed from 0 (hence Array(1, 2)(0) above),
    ' but the array returned by Application.Index starts at 1, not 0,
    ' so varTemp(1) is its first element.
    'MsgBox UBound(varTemp) - LBound(varTemp) + 1
    'MsgBox varArray(1, 1)
    'MsgBox UBound(varTemp2) - LBound(varTemp2) + 1
    'MsgBox varTemp2(10, 1)

End Sub
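
The index bases noted in Test2 and Test3 can be checked without a worksheet. The sketch below is not from the original macros: IndexBaseDemo and its 3 x 3 sample array are made up for illustration, and it assumes the module does not declare Option Base 1. Output goes to the Immediate window.

```vba
Sub IndexBaseDemo()
    ' Hypothetical demo: the sub name and the sample data are illustrative.
    Dim src(1 To 3, 1 To 3) As Variant
    Dim r As Long, c As Long
    Dim col As Variant, picks As Variant

    ' Fill a 3 x 3 array with 11, 12, 13 / 21, 22, 23 / 31, 32, 33.
    For r = 1 To 3
        For c = 1 To 3
            src(r, c) = r * 10 + c
        Next c
    Next r

    ' Slice column 2 without a loop; 0 as the row argument means "all rows".
    col = Application.Index(src, 0, 2)

    ' Array() builds a 0-based array unless the module sets Option Base 1.
    picks = Array(1, 2)

    Debug.Print "Array() lower bound:"; LBound(picks)
    Debug.Print "Index slice row bounds:"; LBound(col, 1); "to"; UBound(col, 1)
    Debug.Print "Last element of the slice:"; col(3, 1)   ' taken from src(3, 2)
End Sub
```

The column slice comes back as a two-dimensional array with a single column, which is why the last line reads col(3, 1) in the same way Test2 reads varTemp2(10, 1).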

Thursday, December 14, 2017

Big Data in Data Science

Tools and plays



Kafka, Elastic MapReduce, Avro, Parquet, Storm, HBase


Node.js or Java
- Either Kafka, Storm, Neo4j, or HBase
- Mongoose
- Solr/Lucene

Cassandra, Spark



Deep working experience applying machine learning and statistics to real world problems
Solid understanding of a wide range of data mining / machine learning software packages (e.g., Spark ML, scikit-learn, H2O, Weka, Keras)
Experience with version control systems (git) and comfortable using command-line tools


Preferred:
Knowledge of semantic web technology (e.g., RDF, OWL, SPARQL)
Knowledge of search technologies (e.g., Solr, ElasticSearch)
A link to a portfolio and/or code samples demonstrating your work experience (GitHub, Kaggle, KDD contributions earn major props)



Data Analyst – BI - Training:

Coding data extraction, transformation and loading (ETL) routines.
APIs and databases to pull data together

Experience with Hadoop, SQL, and NoSQL technologies is required, as well as basic scripting experience in a dynamic language such as Python or R.
Tools like Jethro, Kyvos, Dremio, AtScale, etc.
BI tools like Tableau, Domo, QlikView, etc.
Data visualization
Relational Databases (e.g., Postgres, SQL Server, Oracle, MySQL)
Distributed Databases (e.g., Hive, Redshift, Greenplum)
NoSQL Data Frameworks (e.g., Spark, Mongo, Cassandra, HBase)
Data Analysis and Transformation (e.g., R, Matlab, Python)

Big Data providers: Cloudera CDH, Hortonworks HDP, and Amazon EC2/EMR for deploying and developing large-scale solutions.
Hadoop/Spark Big Data environment clusters using Foreman, Puppet, and Vagrant. Deploying Big Data platforms (including Hadoop and Spark) to multiple clusters using Cloudera CDH, on both CDH4 and CDH5.
Hadoop MapReduce, YARN, HBase, and Spark performance for large-scale data analysis.
Spark performance on Cloudera and Hortonworks HDP cluster setups on production servers.
Machine learning data models on terabytes of data using the Spark ML and MLlib libraries.
ETL systems using Python, Hive, and the Apache Spark SQL framework. Storing all the result files in Apache Parquet and mapping them to Hive for enterprise data warehousing.
Real-time data pipelines using Kafka and Python consumers to ingest data through the Adobe Real-time Firehose API into Elasticsearch, with real-time dashboards built in Kibana.
Airbnb's Airflow tool, to run the machine learning scripts as a DAG.
Test cases using the Python Nose framework.
Porting scikit-learn Python scripts to Spark ML/MLlib scripts, resulting in a scalable pipeline framework.
PySpark.
Data pipelines using Spark and Scala on the AWS EMR framework and S3.
Real-time data pipelines using Spark Streaming and Apache Kafka in Python.
Real-time data pipelines using the Apache Storm Java API for processing live streams of data and ingesting them into HBase.
Data pipelines on the Cloudera/Hortonworks Hadoop platform using Apache Pig, with workflows automated using Apache Oozie.

Technology: Hadoop Ecosystem/Spring Boot/Microservices/AWS/J2SE/J2EE/Oracle
DBMS/Databases: DB2, MySQL, SQL, PL/SQL
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive/Impala, Pig, Sqoop, ZooKeeper, HBase, Spark, Scala
NoSQL Databases: MongoDB, HBase
Version Control Tools: SVN, CVS, VSS, PVCS