How to win with Machine Learning – for the Citizen Data Scientist

If machine learning were quick and easy then everyone would be doing it!

Whilst there are many tools and platforms that facilitate the process, machine learning still requires a lot of labour-intensive work, with many rounds of trial and error. For example:

    • Data preparation (fixing NULL values, standardising hierarchies and so on) 
    • Feature engineering (building extra metrics to improve model accuracy) 
    • Feature selection (selecting only the features that will improve model accuracy) 
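To give a feel for what these three steps involve when done by hand, here is a minimal Python sketch. It is not how Tallinn ML works internally; the passenger rows, the column names, the `family_size` feature and the 0.3 correlation threshold are all assumptions made up for illustration.

```python
from statistics import median

# Tiny made-up passenger table (illustrative values, not the Kaggle data).
rows = [
    {"age": 22.0, "sibsp": 1, "parch": 0, "fare": 7.25, "survived": 0},
    {"age": 38.0, "sibsp": 1, "parch": 0, "fare": 71.28, "survived": 1},
    {"age": None, "sibsp": 0, "parch": 0, "fare": 7.92, "survived": 1},
    {"age": 35.0, "sibsp": 1, "parch": 0, "fare": 53.10, "survived": 1},
    {"age": None, "sibsp": 0, "parch": 0, "fare": 8.05, "survived": 0},
    {"age": 54.0, "sibsp": 0, "parch": 0, "fare": 51.86, "survived": 0},
]


def prepare(rows):
    """Data preparation: replace missing ages with the median known age."""
    fill = median(r["age"] for r in rows if r["age"] is not None)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]


def engineer(rows):
    """Feature engineering: derive a family size from the two relative counts."""
    return [{**r, "family_size": r["sibsp"] + r["parch"] + 1} for r in rows]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select(rows, target, threshold):
    """Feature selection: keep features whose |correlation| with the target
    is at least `threshold` (a simple filter, one of many possible methods)."""
    labels = [r[target] for r in rows]
    features = [name for name in rows[0] if name != target]
    return [
        name
        for name in features
        if abs(pearson([r[name] for r in rows], labels)) >= threshold
    ]


clean = engineer(prepare(rows))
selected = select(clean, "survived", threshold=0.3)
print(selected)
```

Even in this toy form, each step involves choices (which fill value, which extra metric, which threshold) that usually have to be revisited over several rounds of trial and error.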

There are also various advanced machine learning techniques that are often overlooked or take a fair amount of time to implement (categorising numerical data, one-hot encoding, imputing NULL values, oversampling, hyperparameter tuning, ensembles, etc.).
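As a rough illustration of three of these techniques (categorising numerical data, one-hot encoding and oversampling), the following Python sketch uses only the standard library. The age bands, the `is_` column prefix and the sample values are assumptions chosen for the example; hyperparameter tuning and ensembles are not covered here.

```python
import random


def bin_age(age):
    """Categorise a numeric age into a coarse band (band edges are arbitrary)."""
    if age < 16:
        return "child"
    if age < 60:
        return "adult"
    return "senior"


def one_hot(values):
    """One-hot encode category labels into 0/1 indicator columns."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def oversample(rows, label_key, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    has as many rows as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


ages = [4, 29, 71, 35]
bands = [bin_age(a) for a in ages]
encoded = one_hot(bands)

labelled = [{"id": i, "survived": s} for i, s in enumerate([0, 0, 0, 0, 1, 1])]
balanced = oversample(labelled, "survived")
```

Oversampling would normally be applied to the training rows only, so that duplicated rows do not leak into the data used to evaluate the model.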

Download the article below to see in depth how the Tallinn Machine Learning platform is an ideal way for citizen data scientists to quickly build and deploy machine learning models without having to write any code or have any in-depth knowledge of machine learning.

To demonstrate this, we will use Tallinn ML to enter the infamous Kaggle Titanic competition, showing the high-level steps involved in producing a machine learning model that performs well enough to be placed inside the top 5% of data scientists (out of more than 10,000 who have entered).

In this competition, we have to predict which passengers survived. We are given 891 rows, which are used for training our predictive model, and a further 418 rows, which are used for testing: we predict the outcome for each of them and upload the results back to Kaggle.
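For readers curious about what gets uploaded, the sketch below builds a submission in the two-column `PassengerId,Survived` layout that the competition expects. The three test rows are made up, and the `predict` rule (predict that women survived) is only a placeholder standing in for a trained model; it is not the model built with Tallinn ML.

```python
import csv
import io

# Made-up test rows; the real test file has 418 rows and more columns.
test_rows = [
    {"PassengerId": 892, "Sex": "male"},
    {"PassengerId": 893, "Sex": "female"},
    {"PassengerId": 894, "Sex": "male"},
]


def predict(row):
    """Placeholder rule standing in for a trained model."""
    return 1 if row["Sex"] == "female" else 0


def build_submission(rows):
    """Return CSV text with a header and one PassengerId,Survived line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for row in rows:
        writer.writerow([row["PassengerId"], predict(row)])
    return buffer.getvalue()


submission = build_submission(test_rows)
print(submission)
```

Kaggle scores the uploaded file by comparing the `Survived` column against the true outcomes, which are withheld from entrants.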



If you wish to see things in more detail then please contact us and we will be happy to arrange a live demo.

For an in-depth look at how we used Tallinn Machine Learning in the infamous Kaggle Titanic competition, download the article below.

