  • Avail the JUMBO PASS (Limited-Period Offer)
  • Faculty are working data scientists, alumni of IIT, IIM, and ISB, and Ph.D. holders
  • Live Projects - Internship opportunity with US-based firm
  • 24/7 Support
  • Interview preparation assistance
  • Special discounts available

Data Science Training

Prerequisite

Computer Skills

Tools

Python and R

Course Duration

4 Months

Data Science Training

₹49000 ₹45000

What Will I Learn?

Syllabus

Business Problem Identification

  • Objectives
  • Constraints

Data Collection by Research on the Business Model

  • Traditional Database
  • Primary Data
  • Social Media
  • Web Extraction
  • Data Privacy

Data Pre-processing

  • EDA
  • Feature Engineering
  • Data Cleansing
  • Data Wrangling
  • Imputation
  • Outlier Treatment

Data Partitioning

  • Training
  • Validation
  • Test

Model Building

  • Unsupervised
  • Supervised

Model Evaluation

  • Cost / Loss / Error function
    • MSE/RMSE
    • Binary Cross-Entropy
    • Cross-Entropy
  • Accuracy Measure

Deployment of Final Model

  • RStudio
  • Flask
  • Installation
  • Basic Syntax
  • Understand Pros and Cons
  • Hands-On

Data types

Probability

Probability distribution

  • Continuous Probability Distribution
  • Discrete Probability Distribution

Sampling Variation

  • Inferential Statistics
  • Sampling Techniques
    • Probability Sampling
    • Non-Probability Sampling
  • Balanced / Unbalanced Data
  • Prediction and Inference of Parameters
  • What is Data Mining
  • Stages of Analytics
    • Descriptive Analytics
    • Diagnostic Analytics
    • Predictive Analytics
    • Prescriptive Analytics
  • Types of Machine Learning
    • Supervised Learning
    • Unsupervised Learning
    • Reinforcement Learning

Regression Analysis

a. Simple / Multiple Linear Regression

  • Correlation
  • Scatter plot & Correlation Coefficient
  • OLS
  • LINE assumptions (Linearity, Independence, Normality, Equal variance)
  • Correcting for violated model assumptions & stepwise AIC selection
  • Overfitting & Underfitting
  • Lasso regression
  • Ridge regression

b. Logistic Regression

  • Binomial distribution
  • Link/Probability function
  • Logit & Probit Analysis
  • Confusion Matrix
  • ROC curve and AUC
  • Model Validation
  • Imputation, Log-likelihood

c. Advanced Regression

  • Multilogit function
  • Multinomial Regression

d. Survival Analysis – Kaplan-Meier estimator

e. Data Mining Supervised learning (Classification Techniques)

  • Decision Tree (C5.0)
  • Ensemble models
    • Boosting
    • Bagging
    • Stacking
    • AdaBoost
    • Gradient Boosting
    • Random Forest
  • Lazy learner - k-NN classifier
  • Naive Bayes

f. Black-Box Techniques

  • SVM
  • Artificial Neural Network
  • Convolutional Neural Network
  • Recurrent Neural Network

a. Clustering

  • Hierarchical
  • Dendrogram
  • K-means
  • Within Sum of Squares
  • Between Sum of Squares
  • K-medoids
  • CLARA
  • DBSCAN

b. Dimension Reduction

  • PCA
  • SVD

c. Association Rules

  • Probabilistic if-then Pairs
  • Support
  • Confidence
  • Lift Ratio
  • Apriori Algorithm

d. Recommendation engine

e. Network Analysis

  • Web extraction
  • Bag of Words
  • DTM / TDM
  • TF-IDF
  • Sentiment Analysis using word clouds
  • Parsers / Lexicons
  • Emotion Mining
  • Natural Language Processing
  • Named Entity Recognition

I. Time Series Data

II. Components of Time Series

III. Graphical Representation of Time Series

IV. Data Partition for Prediction

V. Model Validation using Error Measures

VI. Forecasting Models

  • Regression models
  • Auto-Regressive
  • ARMA, ARIMA
  • Moving Average
  • Simple Exponential Smoothing
  • Holt's / Double Exponential Smoothing
  • Winters' / Holt-Winters

Why Take This Course?

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from data (per Wikipedia).

The Data Science certification program starts with the basics of statistics and inferential statistics for predictive analytics, moves on to graphical representation of data for descriptive analytics, and then covers the data mining modules, which are divided into supervised and unsupervised learning techniques for uncovering hidden patterns in data.

The Data Science course covers the life cycle of an analytics project, which the industry typically handles using agile methodology. Concepts are explained through real-world case studies drawn from a few domains. Students learn to define business problems within the given constraints and to devise solutions that ultimately benefit the organization.

Data collection, data cleansing, feature engineering, data wrangling, imputation, and related techniques are discussed to give students a broad understanding of the data collection and preparation steps.

Building machine learning algorithms from scratch gives students a clear understanding of their application in predictive analytics. Data mining concepts such as supervised and unsupervised learning techniques are applied to business data, and the best solution is deployed after evaluating the models using accuracy measures. Model optimization is explained in detail, including addressing overfitting and underfitting, tuning hyperparameters, and various other techniques. This helps participants gain the confidence to take up analytics roles such as Data Scientist, Data Analyst, or Data Engineer in data-rich companies and advance their careers.

The Python programming language is taught from the basics, alongside the statistical programming language R, to cover all of these concepts. Both languages are fun to learn and easy to grasp, even for a non-programmer or novice.

Copyright © Instilit.