Apache Spark MLlib Training Course

Course Code

spmllib

Duration

35 hours (usually 5 days including breaks)

Requirements

Knowledge of one of the following:

  • Java
  • Scala
  • Python
  • SparkR

Overview

MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. It consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering and dimensionality reduction, as well as lower-level optimization primitives and higher-level pipeline APIs.

It is divided into two packages:

  • spark.mllib contains the original API built on top of RDDs.

  • spark.ml provides a higher-level API built on top of DataFrames for constructing ML pipelines.

Audience

This course is aimed at engineers and developers who want to use MLlib, the machine learning library built into Apache Spark.

Course Outline

spark.mllib: data types, algorithms, and utilities

  • Data types
  • Basic statistics
    • summary statistics
    • correlations
    • stratified sampling
    • hypothesis testing
    • streaming significance testing
    • random data generation
  • Classification and regression
    • linear models (SVMs, logistic regression, linear regression)
    • naive Bayes
    • decision trees
    • ensembles of trees (Random Forests and Gradient-Boosted Trees)
    • isotonic regression
  • Collaborative filtering
    • alternating least squares (ALS)
  • Clustering
    • k-means
    • Gaussian mixture
    • power iteration clustering (PIC)
    • latent Dirichlet allocation (LDA)
    • bisecting k-means
    • streaming k-means
  • Dimensionality reduction
    • singular value decomposition (SVD)
    • principal component analysis (PCA)
  • Feature extraction and transformation
  • Frequent pattern mining
    • FP-growth
    • association rules
    • PrefixSpan
  • Evaluation metrics
  • PMML model export
  • Optimization (developer)
    • stochastic gradient descent
    • limited-memory BFGS (L-BFGS)
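
As a rough illustration of the stochastic gradient descent item listed under Optimization above, the following plain-Python sketch fits a one-feature linear model by SGD on squared error. It is not MLlib's implementation and does not use Spark. The function name `sgd_linear_fit`, the learning rate, the epoch count and the toy data are all assumptions chosen for this example.

```python
import random
from typing import List, Tuple


def sgd_linear_fit(
    points: List[Tuple[float, float]],
    lr: float = 0.05,      # assumed learning rate, small enough to be stable here
    epochs: int = 500,     # assumed number of passes over the data
    seed: int = 0,         # fixed seed so the shuffling is reproducible
) -> Tuple[float, float]:
    """Fit y ~ w * x + b by stochastic gradient descent on 0.5 * error**2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)              # visit samples in a random order each epoch
        for x, y in data:
            err = (w * x + b) - y      # prediction error for this single sample
            w -= lr * err * x          # gradient of 0.5 * err**2 with respect to w
            b -= lr * err              # gradient of 0.5 * err**2 with respect to b
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (hypothetical example data).
points = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = sgd_linear_fit(points)
print(f"w={w:.4f}, b={b:.4f}")
```

Each update uses one sample rather than the full data set, which is what distinguishes SGD from batch gradient descent. On real, noisy data a decaying step size is normally needed for the estimate to settle.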

spark.ml: high-level APIs for ML pipelines

  • Overview: estimators, transformers and pipelines
  • Extracting, transforming and selecting features
  • Classification and regression
  • Clustering
  • Advanced topics
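
The estimator, transformer and pipeline pattern named in the overview item above can be illustrated without a cluster. The following is a minimal plain-Python sketch of the idea only: `Scale`, `MeanCenterer`, `MeanCentererModel`, `ToyPipeline` and `ToyPipelineModel` are hypothetical names invented for this illustration, and the data is a plain list of numbers rather than a DataFrame. In spark.ml the corresponding abstractions are `Estimator` (with `fit()`), `Transformer` (with `transform()`) and `Pipeline`, all of which operate on DataFrames.

```python
from typing import List, Sequence


class Scale:
    """A transformer: maps data to data and needs no fitting step."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def transform(self, rows: Sequence[float]) -> List[float]:
        return [x * self.factor for x in rows]


class MeanCentererModel:
    """The fitted result of MeanCenterer; it is itself a transformer."""

    def __init__(self, mean: float) -> None:
        self.mean = mean

    def transform(self, rows: Sequence[float]) -> List[float]:
        return [x - self.mean for x in rows]


class MeanCenterer:
    """An estimator: fit() learns a parameter and returns a transformer."""

    def fit(self, rows: Sequence[float]) -> MeanCentererModel:
        return MeanCentererModel(sum(rows) / len(rows))


class ToyPipelineModel:
    """A fitted pipeline: applies every fitted stage in order."""

    def __init__(self, stages: list) -> None:
        self.stages = stages

    def transform(self, rows: Sequence[float]) -> List[float]:
        current = list(rows)
        for stage in self.stages:
            current = stage.transform(current)
        return current


class ToyPipeline:
    """Chains stages; fitting it fits each estimator on the data so far."""

    def __init__(self, stages: list) -> None:
        self.stages = stages

    def fit(self, rows: Sequence[float]) -> ToyPipelineModel:
        fitted = []
        current = list(rows)
        for stage in self.stages:
            # Estimators are fitted; plain transformers are used as they are.
            model = stage.fit(current) if hasattr(stage, "fit") else stage
            current = model.transform(current)
            fitted.append(model)
        return ToyPipelineModel(fitted)


pipeline = ToyPipeline([Scale(2.0), MeanCenterer()])
model = pipeline.fit([1.0, 2.0, 3.0])      # learns the mean of the scaled data
result = model.transform([4.0, 5.0])       # reuses the learned mean on new data
print(result)
```

The point of the design is that the fitted pipeline carries the parameters learned during training, so the same sequence of steps is applied unchanged to new data.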

Bookings, Prices and Enquiries

Private Classroom

Private Remote

From $11250 (482)

Public Classroom

Location                                  Date                 Course Price [Remote / Classroom]
FL, Fort Lauderdale - Corporate Center    2019-03-04 09:30:00  $11250 / $13400
CT, Hartford - Downtown                   2019-03-04 09:30:00  $11250 / $13800
FL, Jacksonville - Bank of America Tower  2019-03-04 09:30:00  $11250 / $13300
FL, Tallahassee – Alliance Center         2019-03-04 09:30:00  $11250 / $13150
FL, Miami Beach - Miami Beach             2019-03-04 09:30:00  $11250 / $13300
CA, Sacramento - Promenade Circle         2019-03-04 09:30:00  $11250 / $13300
WA, Seattle - Smith Tower                 2019-03-04 09:30:00  $11250 / $13550
FL, Fort Lauderdale - Plantation          2019-03-04 09:30:00  $11250 / $13150
CA, Sunnyvale - Downtown Sunnyvale        2019-03-04 09:30:00  $11250 / $12800
CT, Stamford - Downtown                   2019-03-04 09:30:00  $11250 / $13300
FL, Fort Lauderdale - Corporate Center    2019-03-04 09:30:00  $11250 / $14050
VA, Stafford - Quantico Corporate         2019-03-04 09:30:00  $11250 / $13050
NY, Brooklyn - One Pierrepont Plaza       2019-03-04 09:30:00  $11250 / $14050
FL, Delray Beach – The Arbors             2019-03-04 09:30:00  $11250 / $13050
IA, Des Moines - Hub Tower                2019-03-04 09:30:00  $11250 / $13050