Tallinn Machine Learning - Introduction to Machine Learning
This 3-day course will help to familiarise participants with common concepts and terminology used in data science and machine learning. It will also provide a hands-on introduction to Tallinn and demonstrate its use on a range of problems involving predictive analytics.
Learning Objectives: On successful completion of this course, you should be able to:
- Describe the concept of machine learning and the stages involved in the typical data analytics project lifecycle/machine learning workflow;
- Recognise differences between common algorithms for predictive data analytics;
- Explain steps commonly involved in
- Describe stages in the Tallinn workflow and how they relate to machine learning;
- Use the Tallinn application to perform predictive data analytics.
Public, Private & Bespoke Training available on-demand
Tallinn is a browser-based platform and service for automated machine learning from Peak Indicators. Tallinn simplifies and automates many of the standard steps within the machine learning lifecycle. These include feature enrichment, selection and engineering; model selection and parameter tuning; and evaluation and in-depth analysis of outcomes. The tool enables machine learning tasks to be carried out using a graphical front-end and guided workflow.
Each day of the course will consist of four 90-minute sessions following the suggested schedule and content outlined below. However, the level of technical knowledge required, particularly for Day 1, can be tailored to the individual needs of clients.
Day 1 – An introduction to data science and machine learning
The aim of the first day is to provide enough background in concepts and terminology to be able to use Tallinn. Training can be carried out on-site or at Peak Indicators in Chesterfield. The software used in the course is Tallinn, a browser-based application that requires only internet access.
Introduction to machine learning
- The motivation for being data-driven and analytical
- What is machine learning?
- The machine learning project lifecycle (CRISP-DM)
- Understanding business needs and example use cases
- Considering wider issues (e.g. ethics, privacy and implementation)
Data understanding and preparation
- Working with different types of data (e.g. nominal/numeric, raw/derived, etc.)
- How to explore and describe data (e.g. data summaries, quality issues, etc.)
- Gathering and collecting data
- Data pre-processing, also known as ‘feature engineering’ (e.g. aggregating and creating new features, scaling, binning, dealing with missing values, one-hot encoding, etc.)
- Structuring data for machine learning (i.e. the Analytics Base Table)
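As a rough illustration of the pre-processing steps listed above, the following sketch shows mean imputation of missing values, min-max scaling and one-hot encoding in plain Python. The function names and sample values are hypothetical and chosen only for illustration; this is not how Tallinn itself implements these steps.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a nominal feature as one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample data
ages = [20, None, 40]
ages_filled = impute_mean(ages)            # missing value becomes 30
ages_scaled = min_max_scale(ages_filled)   # [0.0, 0.5, 1.0]
categories, encoded = one_hot(["red", "blue", "red"])
print(ages_scaled, categories, encoded)
```

Each function returns a new list rather than modifying its input, so the steps can be chained column by column when building a table for modelling.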
Performing predictive data analytics
- Problems in machine learning (e.g. categorisation, clustering and rule mining)
- Categories of machine learning algorithms (e.g. supervised and unsupervised)
- Common algorithms for predictive data analysis (regression, decision trees, and nearest neighbour approaches)
- Assumptions and limitations of machine learning algorithms
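To make the nearest neighbour approach mentioned above more concrete, here is a minimal sketch of a k-nearest-neighbour classifier in plain Python. The function name, the churn-style labels and the data points are all hypothetical; a real project would normally rely on an established library rather than this hand-rolled version.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    # Pair each training point with its label and sort by Euclidean distance.
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: two numeric features per customer
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_y = ["stay", "stay", "churn", "churn"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))  # stay
print(knn_predict(train_X, train_y, (5.1, 5.0)))  # churn
```

The sketch also hints at one limitation of the method: because it depends on raw distances, features on very different scales can dominate the vote unless they are scaled first.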
Evaluation, improvement and deployment
- Training, validation and test sets (including cross-validation)
- Commonly used performance metrics (e.g. accuracy, RMSE)
- Improving model performance (e.g. ensembles, hyperparameter tuning, etc.)
- Common tools and technologies for machine learning
- Further areas of interest (e.g. natural language processing)
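The evaluation ideas above can be sketched in a few lines of plain Python: accuracy for classification, root mean squared error for regression, and a simple way of generating cross-validation folds. The function names are hypothetical, and the fold splitter assumes the rows have already been shuffled.

```python
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted numeric values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
print(rmse([3.0, 5.0], [1.0, 5.0]))
for train, test in k_fold_indices(5, 2):
    print(train, test)
```

In cross-validation, a model is trained on each `train` split and scored on the matching `test` split, and the scores are then averaged to give a more stable performance estimate than a single hold-out set.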
Day 2 – An introduction to Tallinn
Tallinn and predictive data analytics
- An overview of the Tallinn application (motivation, architecture etc.)
- The concept of automated machine learning (or data science)
- Tallinn features and use within the machine learning pipeline
- Introduction of worked examples (e.g. Titanic, Boston house prices, churn prediction etc.)
Tallinn and feature engineering
Tallinn and training models
Tallinn and evaluating models
Day 3 – Client-specific workshop
The aim of Day 3 is to start working through client-specific examples using Tallinn and to put into practice what you have learned in the first two days.