Spotlight On Microsoft Azure: Does It Live Up To Its Reputation & What Is It Like To Work With?
- 2nd October 2019
- Paul Clough
Debbie Edwards is a BI Consultant at Peak Indicators and uses Microsoft Azure as part of her day-to-day work. Paul, our Data Science Lead, interviewed her to find out more about what she’s been doing with Azure and what her experiences with the technology have been to date.
Can you briefly describe your career in BI and Analytics to date?
Debbie: I have been working with Microsoft BI for 16 years. I previously worked as an NHS Data Consultant and as a Senior BI Officer for Derbyshire County Council, mostly working within the Children and Younger Adults department. Within both these roles my sole focus was the on-premise Microsoft stack. I worked with Microsoft SQL Server Enterprise Editions (2000 through to 2012), Reporting Services, Integration Services and Analysis Services.
In 2017, I moved to Peak Indicators and began working with Power BI and Azure. These were starting to become major buzzwords just before I left the council and it’s been so exciting to transfer my knowledge to the cloud. It’s astonishing to see how quickly the business intelligence environment has moved on from my days working with SQL Server 2000. I have given talks on Power BI for the Chamber of Commerce and I also run training workshops.
So how much are you using cloud-based BI&A solutions?
Debbie: Every day. I use Power BI and Azure for all the projects I work on. I would say I have been fully utilising cloud-based solutions for just over a year now and it’s freed me up to do things I could never have done when working with on-premise technologies.
Can you provide any examples of specific projects you’ve worked on involving the use of Azure? Why did you select the Azure platform?
Debbie: Power BI was one of the main drivers for using Azure, but it was also the obvious choice for me as someone who has used Microsoft technology throughout my career.
There are many tools to choose from, such as Tableau, QlikView, IBM Cognos, Oracle OBI EE Plus, etc. I believe Power BI is now one of the best-known, and best, solutions.
You can basically download Power BI Desktop, set yourself up with a free Power BI account and get cracking on your very own high-end reports and dashboard solutions. You can become an expert at creating beautiful and useable dashboards without spending any money.
Obviously, as soon as you want to share your results you need to start thinking about moving to Power BI Pro, the solution with a monthly cost, but the free setup above is a great way of gaining all the knowledge you need to work with Power BI.
One of my first tasks was to attempt to create a solution to look at social media feeds for a Northern utilities company. Power BI was selected, not only because I had an interest in it but also because the company in question was keen to see what Power BI could do.
Azure Cloud solutions are scalable and offer pricing tiers that charge only for the resources you use. It was the obvious choice when working with Power BI. My first question was, how can I bring in social media and analyse it so I can find positive, negative and neutral tweets?
The setup was two text-based cognitive services provided by Azure. The first looks at the text within your data to pick out the sentiment and key phrases (over which we can build a word cloud in Power BI). The second text API looks for profanity within the text and sets a flag. Both services are free (up to 5,000 transactions per month) unless your needs are greater, in which case there is a small charge.
Once these were set up, I moved to Azure Logic Apps to set up a trigger workflow. If a keyword is mentioned on Twitter, the application pulls in the data, analyses it to produce the sentiment score and key phrases, and writes the results through to Azure Table Storage. Logic Apps is an example of serverless computing within Azure Cloud. It helps you schedule, automate and orchestrate tasks, and set up workflows. You create your workflows within the Azure web-based designer or in a Microsoft Visual Studio module, if you prefer that way of working.
Azure Table Storage is a NoSQL store that hosts unstructured data independent of any schema. This was the logical choice in which to store the social media information, including any media attached to the tweet.
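As a rough illustration of the flow described above, the sketch below turns one analysed tweet into a row ready for Azure Table Storage. It assumes the sentiment service returns a score between 0 and 1, with higher meaning more positive. The 0.4 and 0.6 cut-offs, the function names and every field name other than `PartitionKey` and `RowKey` (which Table Storage requires on each entity) are hypothetical and not taken from the project.

```python
def label_sentiment(score, low=0.4, high=0.6):
    """Bucket a 0-1 sentiment score into a label (cut-offs are illustrative)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


def build_tweet_entity(keyword, tweet_id, text, score, key_phrases, has_profanity):
    """Shape one analysed tweet as a flat, Table Storage-style entity."""
    return {
        "PartitionKey": keyword,   # group rows by the keyword that triggered the workflow
        "RowKey": str(tweet_id),   # must be unique within the partition
        "Text": text,
        "SentimentScore": score,
        "SentimentLabel": label_sentiment(score),
        "KeyPhrases": ", ".join(key_phrases),  # flattened into a single string property
        "HasProfanity": has_profanity,
    }
```

Keeping the label alongside the raw score means a report can filter on positive, negative and neutral tweets directly, while the score stays available if the cut-offs need to change later.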
Because I needed to do a bit of work on the table structures before importing the data into Power BI, I decided to pull it into a SQL Database using Data Factory, the Azure ETL orchestration tool. There were several issues with the raw data. For example, when a tweet is retweeted, the full text exceeds Twitter’s character limit, so these tweets come through truncated. I made the decision to find the retweets and replace the text with the original, non-truncated tweet.
I do this in a SQL Database with a stored procedure. There were a few issues I needed to iron out before loading, so this seemed to be the best place for the job. It also made it easier to merge the social media data with other data stored within the SQL Database.
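The retweet fix itself was done in a SQL stored procedure. The Python sketch below only illustrates the same idea. The field names (`id`, `text`, `retweet_of`) are hypothetical, and it assumes each retweet carries the ID of the tweet it repeats.

```python
def restore_retweet_text(tweets):
    """Return copies of the tweets with retweet text replaced by the original's full text."""
    # Index the full text of every original (non-retweet) tweet by its ID.
    originals = {t["id"]: t["text"] for t in tweets if t.get("retweet_of") is None}

    restored = []
    for tweet in tweets:
        fixed = dict(tweet)  # copy so the input rows are left untouched
        source_id = tweet.get("retweet_of")
        if source_id in originals:
            fixed["text"] = originals[source_id]
        # If the original tweet was never captured, the truncated text is kept as-is.
        restored.append(fixed)
    return restored
```

Retweets whose original was never collected are left unchanged, so the step never invents text that is not already in the data.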
Data Factory loads the data on a schedule, so the company has a great resource for looking at their social media feeds in more detail. Have they replied to the negative tweets? What other influences are at play with negative tweets? What volume of tweets are they getting compared with the number of customers? Are any of the people tweeting considered vulnerable?
So far, for this specific task I have used very low-cost text APIs, Logic Apps, Azure Table Storage, Azure SQL Server and SQL DB, Data Factory and Power BI. That is another great reason to go with Azure: everything you need is under one roof (or within the one cloud solution).
Why do you think Azure is a leading public cloud-based platform? What makes it a leader in the market?
Debbie: There are of course other cloud-based providers. According to TechRadar [15] (2019), Microsoft Azure sits in 2nd place with Amazon Web Services in 1st. When you are choosing a provider, there are certain questions you need to ask:
- How does the provider deal with compliance when handling sensitive data?
- How compliant are the service offerings?
- How do we deploy cloud-based solutions that have specific compliance requirements?
- What terms are in the provider’s privacy statement?
Microsoft provides the most comprehensive set of compliance offerings of any cloud service provider. The list is long and includes the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), Cloud Security Alliance (CSA) STAR Certification and many others. Azure also has a more exact pricing model, charging by the minute rather than by the hour.
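To make the per-minute point concrete, here is a small worked comparison. The 60p hourly rate is a made-up figure for illustration only, not a real Azure price, and the per-hour function assumes a provider that rounds usage up to whole hours.

```python
import math


def cost_billed_per_hour(minutes_used, hourly_rate_pence):
    """Usage is rounded up to whole hours before charging."""
    return math.ceil(minutes_used / 60) * hourly_rate_pence


def cost_billed_per_minute(minutes_used, hourly_rate_pence):
    """Only the minutes actually consumed are charged."""
    return minutes_used * hourly_rate_pence / 60


# 61 minutes of use on a resource priced at a hypothetical 60p per hour
print(cost_billed_per_hour(61, 60))    # 120
print(cost_billed_per_minute(61, 60))  # 61.0
```

The gap matters most for short or irregular workloads, where rounding up to the next hour can nearly double the bill for a job that only just runs over.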
Another big advantage of the Azure Cloud is that it excels in hybrid cloud solutions. On-premise and public cloud solutions can integrate easily. This is one of the biggest plus points of Azure. So many companies work in a Microsoft environment already that it’s the obvious choice. It may have appeared 4 years after AWS, but it has certainly caught up.
These are just a few examples of why I think Azure is the best solution to go for. Considering the complexity on offer, it’s incredibly easy to use. Start-up is straightforward and, with the initial free trial, you can get to work on your projects in no time. Many services are free or low-cost, so you can start as small as you like.
What are the benefits to businesses of using Azure?
Debbie: When I worked for the council on on-premise BI projects, we had an on-site data centre with a finite amount of resources. Whenever we started on a project, the DBA team would ask you to fill in a form to establish the peak needs of the server and database. I was working with big data sets in an environment that couldn’t really cope with big data because there was a very rigid boundary set for us by a service used to dealing with transactional systems.
This meant that I faced many obstacles and had to try to create as small a data warehouse as possible. In one way this is good practice, but in another, we could never fully achieve what we needed to do. Something always had to give in order to provide a service.
This is a real-world example of the CapEx vs the OpEx model. In my previous role I had to attempt to figure out the full costs of the entire project. Once established it was basically set in stone. Growth is unpredictable and this is where Azure Cloud excels.
Within my new role, we initially create proof of concept areas with basic services. The users are always happy with the initial concepts and are eager to move forward. So long as everyone is kept aware of possibilities and ways to keep costs down, the BI world is as big as your budgets allow.
When you do a rigorous analysis of cloud costs against on-premise costs, the cloud will always be a good cost-based solution because you don’t have to pay for your own data centre consisting of servers, storage, networking, backup, disaster recovery, infrastructure and personnel.
What challenges/problems have you faced when using Azure services?
Debbie: One of the key issues is that it’s quite difficult to establish costings for large-scale projects within development and production environments without first having established a few projects to get an understanding of best practice.
For example, when starting to work on a project, we run a 10-day pilot to produce reports and dashboards within a proof-of-concept project. With this in mind, all services are created at a base level. For example, I will use an Azure SQL Database, probably on the Basic tier with 5 DTUs (Database Transaction Units), and move up to Standard with more storage and DTUs as required. This is in development mode. Once you move through into production you have many more costings to think about. What purchase model, service tier and compute do you require? SQL Databases are available in managed instance, single database and elastic pool deployment options [16].
You also need to think about your disaster recovery options. Azure provides some great services for this, like geo-restore (backup to a separate region) or geo-replication (up to 4 continuous copies in a separate region). There are also options to think about like long-term retention and backup storage.
Along with all the above options, should you move to a SQL Data Warehouse? Data Warehouses give you further considerations to think about and can be more expensive than a SQL Database. A SQL Database currently has a storage limit (it should be increasing to 10 TB in the near future). However, the Data Warehouse has no limit apart from costs. You can also stop a data warehouse during non-peak hours to bring costs down (for example, in dev mode). Basically, costs can start from £4.50 a month and run up to thousands of pounds a month.
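The effect of pausing a warehouse outside working hours can be estimated with simple arithmetic. In the sketch below the hourly rate of 1.0 is a placeholder rather than a real price, the 10-hour, 22-day working pattern is an assumption, and only compute is modelled, since storage is typically still charged while compute is paused.

```python
def monthly_compute_cost(hourly_rate, hours_per_day, days_per_month):
    """Compute cost for a service that is billed only while it is running."""
    return hourly_rate * hours_per_day * days_per_month


# Hypothetical rate of 1.0 currency unit per hour
always_on = monthly_compute_cost(hourly_rate=1.0, hours_per_day=24, days_per_month=30)
office_hours = monthly_compute_cost(hourly_rate=1.0, hours_per_day=10, days_per_month=22)
saving = 1 - office_hours / always_on

print(f"Always on: {always_on:.0f}, office hours only: {office_hours:.0f}")
print(f"Compute saving: {saving:.0%}")
```

Under these assumptions the development warehouse runs for 220 hours instead of 720, which is why pausing is worth building into the schedule from the start.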
The many options can be bewildering, and it can be difficult to cost a large project within this OpEx model, or come up with the best strategy for it, when you first start working in the cloud.
However, once you start working with it and realise how many possibilities all these options give you, it’s an incredibly scalable service to have at your fingertips. You have to think differently in terms of how you initiate a project. Expect growth and change rather than setting a solid boundary around your project.
In your opinion what is the future of Azure in BI and Analytics?
Debbie: Power BI and Azure are moving forward at an exponential rate. It’s such an exciting place to be at the moment. I personally want to take advantage of the Gen2 Azure Data Lake and Databricks. I have already had a lot of success running Azure Data Lake Analytics over Gen1 storage, pulling through files and then transforming them for analysis purposes without the need for a SQL Database.
I’m always interested in advances in real-time analytics. Power BI offers real-time streaming visuals within the Power BI Service and I’m very certain that these will start improving as we go on. Real-time analytics again involves the use of Azure HDInsight, Databricks, PolyBase and Azure Cosmos DB.
Databricks [17] is especially noteworthy as it is an Apache Spark-based analytics platform that allows collaboration between data scientists, data engineers, developers and business analysts. I’ve spent most of my time working within traditional BI and structured data and now I have all these tools to hand to extend my knowledge across into data science and unstructured data. So many more opportunities to deliver fantastic BI solutions and knowledge are now available.
Automated Machine Learning (AutoML) is another great service that I believe will get better and better. There are certain visuals that allow you to do some forecasting in Power BI already but AutoML will take prediction to the next level. You can also integrate Python and R into Power BI via automated machine learning.
I think that Azure Cognitive Services are also going to become increasingly powerful and varied. I use the standard text API to establish sentiment, but new ones are appearing. For example, cognitive services can also analyse images and perform speech recognition.
Power BI has recently had an updated visual called Key Influencers. It is the first AI visual available and allows you to add factors to a visual and get them ranked. For instance, what makes a negative tweet? Why are these products not selling? There are advancements every month with the influencers tile and I’m sure we will see huge advancements to AI visuals in the coming months like distribution changes and decomposition trees.
There are so many options to choose from, and so many in-depth areas to train up in, that you really can find everything you need within Azure and Power BI.
Thank you Debbie!
If you're interested in learning more, download our Azure Fact Sheet below, which takes an in-depth technical look at the Azure offering. If you would like to discuss how Azure could support your business, or need technical support or training around Azure and Power BI, why not call us on 01246389000 or email us on firstname.lastname@example.org?