
Certified Data Science Practitioner (CDSP)


Offered by CertNexus®, the Certified Data Science Practitioner™ (CDSP) credential is an industry-validated certification that helps professionals differentiate themselves from other job candidates by demonstrating their ability to put data science concepts into practice. The Certified Data Science Practitioner™ (CDSP) training program offered by Multimatics is designed to help participants use data science principles to address business issues, apply multiple techniques to prepare and analyze data, evaluate datasets to extract valuable insights, and design a machine learning approach. The training material is prepared based on the latest CertNexus® CDSP curriculum and is accompanied by discussions and exercises on practice questions.

Multimatics is an Authorized Training Partner for the Certified Data Science Practitioner™ (CDSP) training and certification program accredited by CertNexus®.


By the end of the program, participants will be able to:

  • Use data science principles to address business issues
  • Apply the extract, transform, and load (ETL) process to prepare datasets
  • Use multiple techniques to analyze data and extract valuable insights
  • Design a machine learning approach to address business issues
  • Train, tune, and evaluate classification models
  • Train, tune, and evaluate regression and forecasting models
  • Train, tune, and evaluate clustering models
  • Finalize a data science project by presenting models to an audience, putting models into production, and monitoring model performance

This program is designed for professionals across different industries seeking to demonstrate the ability to gain insights and build predictive models from data.


This program is delivered as a 5-day intensive training class.


The program provided by Multimatics is delivered by professional instructor(s) through interactive presentations, group debriefs, individual and team exercises, behavior modelling and role plays, one-to-one and group discussions, case studies, and projects.


There is no specific requirement to join this program, although the following knowledge, skills, and abilities are recommended:

  • Working-level knowledge of programming languages such as Python® and R
  • Proficiency with a querying language
  • Strong communication skills
  • Proficiency with statistics and linear algebra
  • Awareness of the ethical implications of sharing data sources
  • Familiarity with data visualization

Participants will take the CDSP exam, which consists of 100 multiple-choice questions to be completed within 2 hours. Participants who pass the exam will receive an official Certified Data Science Practitioner™ (CDSP) certification from CertNexus®.


Addressing Business Issues with Data Science

  1. Identify the project scope

    • Identify project specifications, including objectives (metrics/KPIs) and stakeholder requirements
    • Identify mandatory and optional deliverables
    • Identify project limitations (time, technical, resource, data, risks)
  2. Understand stakeholder challenges

    • Understand stakeholder terminology
    • Become aware of data privacy, security, and governance policies
    • Obtain permission/access to data
  3. Classify a question into a known data science problem

    • Access references
    • Identify data sources and type
    • Select modeling type
Extracting, Transforming, and Loading Data

  1. Gather relevant datasets

    • Read data
    • Research third-party data availability
    • Collect open-source data
  2. Clean datasets

    • Identify and eliminate irregularities in data
    • Parse the data
    • Check for corrupted data
    • Correct the data format for storing/querying purposes
    • Deduplicate data
  3. Merge datasets

    • Join data from different sources
  4. Apply problem-specific transformations to datasets

    • Apply word embeddings
    • Generate latent representations for image data
  5. Load data

    • Load into DB
    • Load into DataFrame
    • Export to CSV files
    • Load into visualization tool
    • Make an endpoint
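
A minimal pandas sketch of the ETL steps in this module follows; the file names, column names, and SQLite database are illustrative assumptions rather than part of the official courseware.

```python
import sqlite3

import pandas as pd

# Gather: read two illustrative datasets (file and column names are assumed).
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
customers = pd.read_csv("customers.csv")

# Clean: correct formats, drop corrupted rows, and deduplicate.
orders["amount"] = pd.to_numeric(orders["amount"], errors="coerce")  # bad values become NaN
orders = orders.dropna(subset=["amount", "customer_id"])             # drop corrupted rows
orders = orders.drop_duplicates()

# Merge: join data from the two sources on a shared key.
df = orders.merge(customers, on="customer_id", how="left")

# Load: into a DataFrame (df above), into a database, and out to a CSV export.
with sqlite3.connect("warehouse.db") as conn:
    df.to_sql("orders_enriched", conn, if_exists="replace", index=False)
df.to_csv("orders_enriched.csv", index=False)
```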

Analyzing Data

  1. Examine data

    • Generate summary statistics
    • Examine feature types
    • Visualize distributions
    • Identify outliers
    • Find correlations
    • Identify target feature(s)
  2. Preprocess data

    • Identify missing values
    • Make decisions about missing values (e.g., imputing method, record removal)
    • Normalize, standardize, or scale data
  3. Carry out feature engineering

    • Apply encoding to categorical data
    • Assign feature values to bins or groups
    • Split features
    • Convert dates to useful features
    • Apply feature reduction methods
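
A brief pandas and scikit-learn sketch of the examination, preprocessing, and feature-engineering steps in this module; it continues from the illustrative DataFrame `df` in the ETL sketch above, and the column names are assumptions.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Examine: summary statistics, feature types, correlations, and outliers.
print(df.describe())
print(df.dtypes)
print(df.corr(numeric_only=True))
outliers = df[(df["amount"] - df["amount"].mean()).abs() > 3 * df["amount"].std()]

# Preprocess: decide how to handle missing values, then scale numeric features.
df["amount"] = df["amount"].fillna(df["amount"].median())        # impute with the median
df[["amount"]] = StandardScaler().fit_transform(df[["amount"]])  # standardize

# Feature engineering: encode categoricals, bin values, and derive date features.
df = pd.get_dummies(df, columns=["region"], drop_first=True)  # one-hot encoding
df["amount_bin"] = pd.qcut(df["amount"], q=4, labels=False)   # quartile bins
df["order_month"] = df["order_date"].dt.month                 # date-derived feature
```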

Building and Evaluating Machine Learning Models

  1. Prepare datasets for modeling

    • Decide proportion of dataset to use for training, testing, and (if applicable) validation
    • Split data into train, test, and (if applicable) validation sets
  2. Build training models

    • Define algorithms to try
    • Train model
    • Tune hyperparameters, if applicable
  3. Evaluate models

    • Define evaluation metric
    • Compare model outputs
    • Select best performing model
    • Store model for operational use
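
A scikit-learn sketch of the split / train / tune / evaluate / store workflow in this module; the feature matrix `X`, target `y`, candidate algorithms, and evaluation metric are illustrative assumptions.

```python
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Split: hold out a test set; validation here is handled by cross-validation.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and tune: fit candidate algorithms, tuning hyperparameters by grid search.
candidates = {
    "logreg": GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1, 10]}, scoring="f1"),
    "forest": GridSearchCV(RandomForestClassifier(), {"n_estimators": [100, 300]}, scoring="f1"),
}
for search in candidates.values():
    search.fit(X_train, y_train)

# Evaluate: compare models on the held-out test set and keep the best performer.
scores = {name: f1_score(y_test, s.predict(X_test)) for name, s in candidates.items()}
best = candidates[max(scores, key=scores.get)].best_estimator_

# Store the selected model for operational use.
joblib.dump(best, "best_model.joblib")
```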

Testing Hypotheses and Pipelines

  1. Test hypotheses

    • Design A/B tests
    • Define success criteria for test
    • Evaluate test results
  2. Test pipelines

    • Put model into production
    • Ensure model works operationally
    • Monitor pipeline for performance of model over time
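
A small sketch of evaluating an A/B test against a predefined success criterion; the conversion counts, sample sizes, and 5% significance level are assumed for illustration.

```python
from statsmodels.stats.proportion import proportions_ztest

# Illustrative results: conversions and visitors for control (A) and variant (B).
conversions = [120, 152]
visitors = [2400, 2380]

# Success criterion defined up front: variant B converts better at the 5% level.
stat, p_value = proportions_ztest(conversions, visitors, alternative="smaller")
print(f"z = {stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Ship variant B: the uplift is statistically significant.")
else:
    print("Keep variant A: no significant uplift detected.")
```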

Reporting Findings

  1. Report findings

    • Implement model in a basic web application for demonstration (POC implementation)
    • Derive insights from findings
    • Identify features that drive outcomes (e.g., explainability, variable importance plot)
    • Show model results
    • Generate lift or gain chart
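
A short sketch of two reporting artifacts from this module, a variable-importance view and a cumulative gain chart; it assumes the fitted `best` model (tree-based) and the train/test split from the modeling sketch above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Variable importance: which features drive the model's predictions.
importances = pd.Series(best.feature_importances_, index=X_train.columns).sort_values()
importances.plot.barh(title="Variable importance")

# Cumulative gain: share of positives captured as the highest-scored records are targeted.
scores = best.predict_proba(X_test)[:, 1]
order = np.argsort(-scores)
gains = np.cumsum(np.asarray(y_test)[order]) / np.asarray(y_test).sum()

plt.figure()
plt.plot(np.arange(1, len(gains) + 1) / len(gains), gains, label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="baseline")
plt.xlabel("Fraction of records targeted")
plt.ylabel("Fraction of positives captured")
plt.title("Cumulative gain chart")
plt.legend()
plt.show()
```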
