Tallinn Machine Learning: Now With ‘Deep Learning’ Capabilities
- 2nd October 2019
- Big Data & Advanced Analytics
- Tony Heljulia
We are pleased to announce that our Tallinn Machine Learning platform now ships with extensive “deep learning” capabilities!
Deep learning is quite different from traditional machine learning. Classical machine learning algorithms are highly mathematical and typically work best with structured or relational data (e.g. customers, products and sales revenue).
Deep learning algorithms act more like the human brain. By using layers of artificial neural networks (ANNs), they are powerful enough to process unstructured data such as images, video, text and speech in real time.
With deep learning, you can perform some truly exciting techniques, such as:
- Image and face recognition
- Object detection
- Motion tracking
- Gesture detection
- Speech recognition
The use cases for this technology are wide-ranging, but here are some simple examples:
- Engineering quality: automatically identify defects that have occurred on the production line
- Workplace capacity: optimise the use of desk space and other resources by analysing busy areas within the workplace
- Footfall heatmaps: retailers can analyse where customers focus their attention when wandering around their stores
Example Deep Learning Exercise
Consider this deep learning example where we wish to perform real-time object detection and gesture detection during a tennis match:
Our aim is twofold: outline the tennis ball on every frame, and detect when a player hits the ball. But how do we achieve this?
1. Image Pre-Processing
The first step is to perform some basic image processing on each frame of the video, which can be done in real time. In this instance, we reduce image complexity by filtering out noise and applying blurring, resulting in a smoother image:
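To make the blurring step concrete, here is a minimal sketch using NumPy only. In practice you would reach for an optimised library routine such as OpenCV's `cv2.GaussianBlur`; the 3×3 box blur below is an illustrative stand-in, not our production pipeline:

```python
import numpy as np

def box_blur(frame: np.ndarray) -> np.ndarray:
    """Smooth a greyscale frame by averaging each pixel with its 3x3 neighbourhood."""
    padded = np.pad(frame.astype(float), 1, mode="edge")
    # Sum the nine shifted copies of the padded image, then divide by 9.
    out = sum(
        padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
        for dy in range(3) for dx in range(3)
    ) / 9.0
    return out

# A tiny "noisy" frame: one bright outlier pixel in a dark image.
frame = np.zeros((5, 5))
frame[2, 2] = 90
smoothed = box_blur(frame)
print(smoothed[2, 2])  # the spike is spread across its neighbourhood: 10.0
```

The noise spike no longer dominates its neighbours, which is exactly the smoothing effect we want before detection.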
2. Object Detection
With so many different colours and shapes (contours) in the image, training a deep learning model to recognise a tennis ball can be extremely challenging.
To help matters, we apply a sophisticated “background removal” algorithm that strips away the static background areas of a video in real time, so that only the moving parts of the video remain:
By removing the background “noise”, we greatly simplify the process of detecting and tracking a tennis ball in the video.
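The idea behind background removal can be sketched in a few lines. Real systems use adaptive algorithms (OpenCV ships several, e.g. `cv2.createBackgroundSubtractorMOG2`); the simple frame-differencing below, with an assumed static background and an arbitrary threshold, is purely illustrative:

```python
import numpy as np

def remove_background(frame: np.ndarray, background: np.ndarray,
                      threshold: float = 25.0) -> np.ndarray:
    """Return a mask of pixels that differ noticeably from the static background."""
    diff = np.abs(frame.astype(float) - background.astype(float))
    return (diff > threshold).astype(np.uint8)  # 1 = moving foreground, 0 = background

# A static "court" background, then a frame where a bright moving ball has appeared.
background = np.full((6, 6), 50.0)
frame = background.copy()
frame[2, 3] = 200.0  # the moving ball
mask = remove_background(frame, background)
print(mask.sum())  # only the ball pixel survives: 1
```

Everything static cancels out, leaving a clean mask for the detector to work on.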
Images such as the one above will be used to train a deep learning model to recognise a tennis ball. You typically need somewhere between 100 and 1,000 images to achieve acceptable accuracy when performing object detection. We naturally have to “label” each image to record the precise position and outline of each ball.
NOTE: The process of obtaining and labelling images can be extremely time consuming!
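To show what labelling involves, here is one way a single training label might be recorded. The field names, filename and pixel coordinates are purely illustrative; real projects usually adopt a tool-specific format such as Pascal VOC XML or COCO JSON:

```python
import json

# A hypothetical label for one training image: the ball's bounding box in pixels.
label = {
    "image": "match_frame_0412.png",   # illustrative filename
    "object": "tennis_ball",
    "bbox": {"x_min": 318, "y_min": 112, "x_max": 342, "y_max": 136},
}

# Labels are commonly stored as JSON files alongside the images.
print(json.dumps(label, indent=2))
```

Multiply this by several hundred images, each annotated by hand, and the note above about the time cost becomes clear.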
Once the model has been trained with a sufficient number of images, you can apply it to the video to detect and track the ball in real time. The Python scripting language is ideal for this purpose.
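A real-time tracking loop in Python might be structured as below. The trained network is stood in for by `find_ball`, a hypothetical placeholder that simply takes the centroid of bright pixels; the loop structure, not the detector, is the point of the sketch:

```python
import numpy as np

def find_ball(frame: np.ndarray, threshold: float = 150.0):
    """Placeholder for the trained model: centroid of bright pixels, or None."""
    ys, xs = np.nonzero(frame > threshold)
    if len(xs) == 0:
        return None
    return (int(xs.mean()), int(ys.mean()))

def track(frames):
    """Run detection on every frame, mimicking the real-time loop."""
    return [find_ball(f) for f in frames]

# Two synthetic frames: the "ball" moves from x=4 to x=6 along row 2.
f1 = np.zeros((8, 8)); f1[2, 4] = 255
f2 = np.zeros((8, 8)); f2[2, 6] = 255
positions = track([f1, f2])
print(positions)  # [(4, 2), (6, 2)]
```

In a live system, each detected position would be drawn onto the frame as it is displayed.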
3. Gesture Detection
We will use gesture detection to identify when a player hits the ball.
Gesture detection is in fact extremely similar to object detection. The only real difference is that we use binary images instead of colour:
Binary images are more effective for gesture detection since we are trying to pick out the general shape of the player’s arm and racket without being concerned with their skin, hair or clothing.
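Converting a frame to a binary image is a single thresholding step (OpenCV's `cv2.threshold` is the usual tool); the cut-off value below is an arbitrary illustrative choice:

```python
import numpy as np

def to_binary(frame: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Reduce a greyscale frame to black and white: shape survives, texture does not."""
    return np.where(frame > threshold, 255, 0).astype(np.uint8)

# Greyscale "arm and racket" pixels of varying intensity all collapse to pure white.
frame = np.array([[ 30, 200, 180],
                  [ 40, 220,  60],
                  [ 10, 240,  50]], dtype=float)
binary = to_binary(frame)
print(binary)
```

After thresholding, only the silhouette remains, so skin tone, hair and clothing can no longer distract the model.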
As before, we will need somewhere between 100 and 1,000 images to train a deep learning model to recognise a gesture.
For a tennis player, we actually need to train the deep learning model to recognise a variety of shots such as forehand, backhand, serve, volley and smash!
4. The Result
And here is how the final result will look! As the video plays, we can overlay shapes to show the position of the tennis ball and identify when a player is about to hit:
If you’re keen to discover more about Tallinn Machine Learning, or to discuss the opportunities AI technology can bring to your business, call us on 01246389000 or email us at firstname.lastname@example.org