Performing Real-Time Sentiment Analysis in Oracle BI 12c
- 3rd January 2016
- Big Data & Advanced Analytics
- Antony Heljula
"Embedded R" is one of the great new features of Oracle BI 12c. This article shows how it can be used to perform real-time sentiment analysis within your BI dashboards, using completely open source software.
In this example, the sentiment analysis function classifies sentences or paragraphs of text as "negative", "neutral" or "positive".
This capability demonstrates the versatility and value behind the new R integration feature, since we would normally have to turn to other products such as Oracle Endeca or Oracle Big Data Discovery (BDD) to perform sentiment analysis.
In this case, we are using the open source "tm.plugin.sentiment" R package to perform the sentiment analysis function. Other packages are available.
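To make the three-way classification concrete, here is a minimal sketch in Python of a word-list scorer. It does not reproduce how the R package works internally: the word lists and the function name `classify_sentiment` are assumptions made up for illustration, and real packages rely on far larger lexicons or trained models.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "happy", "reliable"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "broken", "slow"}


def classify_sentiment(text: str) -> str:
    """Label text as 'positive', 'negative' or 'neutral' by counting word-list hits."""
    # Lower-case the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No hits at all, or an equal number of each, counts as neutral.
    return "neutral"


if __name__ == "__main__":
    for comment in ["Great product, I love it", "It arrived on Tuesday"]:
        print(comment, "->", classify_sentiment(comment))
```

A scorer this simple misreads negation and sarcasm, which is one reason a dedicated package is worth using.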
Here are the required steps:
2) Install the additional R packages to perform sentiment analysis
Run the following from the R command line executable:
install.packages("Rstem_0.4-1.tar.gz", repos=NULL, type="source")
install.packages("sentiment.tar.gz", repos=NULL, type="source")
3) Deploy file peak.Sentiment.xml
At the bottom of this article you can find a link to download the "peak.Sentiment.xml" file. This is a custom R script we have developed that will invoke the tm.plugin.sentiment package and score your text as positive/negative/neutral.
Download this file and then copy it to the following "script_repository" location on your OBIEE server:
4) Create your Analysis within Oracle BI
For the R script to work, you simply need an analysis with at least two columns:
- A column that uniquely identifies the records
- The text on which to perform sentiment analysis
5) Add your Sentiment Analysis calculation
Add a new column to your analysis with the following formula:
EVALUATE_SCRIPT('filerepo://peak.Sentiment.xml', 'sentiment', 'id=%1;text=%2', [ID Column], [Text Column])
You need to replace the parts in square brackets with the two corresponding columns from your analysis (you can supply expressions such as CASE..WHEN and also string literals instead of column names). For example:
EVALUATE_SCRIPT('filerepo://peak.Sentiment.xml', 'sentiment', 'id=%1;text=%2', "Products"."Product Number", "Products"."Feedback")
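If you need the same formula for several analyses, it can help to see how the pieces fit together. The sketch below is a hypothetical Python helper (the name `build_sentiment_formula` is not part of Oracle BI) that simply assembles the formula text shown above. It assumes, as the formula suggests, that `%1` and `%2` are positional placeholders bound to the first and second expressions that follow the options string.

```python
def build_sentiment_formula(
    id_column: str,
    text_column: str,
    script: str = "filerepo://peak.Sentiment.xml",
) -> str:
    """Return the EVALUATE_SCRIPT column formula used in this article.

    id_column is substituted for %1 and text_column for %2 (assumed
    positional binding). Both are passed through unchanged, so they may
    be column references, expressions or string literals.
    """
    return (
        f"EVALUATE_SCRIPT('{script}', 'sentiment', "
        f"'id=%1;text=%2', {id_column}, {text_column})"
    )


if __name__ == "__main__":
    print(build_sentiment_formula('"Products"."Product Number"', '"Products"."Feedback"'))
```

The helper only builds text; the resulting formula still has to be pasted into the column formula editor of your analysis.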
6) That's It!
If you run your analysis you can see the sentiment analysis now being performed:
NOTE: Sentiment analysis is never 100% accurate. You can customise the R package to improve accuracy, or consider alternative packages.
7) Summarise your Analysis
You may like to summarise your analysis by bringing in additional attributes. In the example below we have summarised by Product Brand and presented the results on a 100% stacked bar graph. You can of course configure a navigation to enable the user to drill down on the graph to see the underlying comments:
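A 100% stacked bar graph plots, for each brand, the share of comments that fall into each sentiment class. Oracle BI performs this calculation for you, but a minimal Python sketch shows the arithmetic behind it. The function name `sentiment_share_by_brand` and the sample brand names are hypothetical.

```python
from collections import Counter, defaultdict

SENTIMENT_LABELS = ("negative", "neutral", "positive")


def sentiment_share_by_brand(rows):
    """Turn (brand, sentiment) pairs into percentage shares per brand.

    Returns {brand: {label: percentage}}; each brand's shares sum to 100.
    """
    counts = defaultdict(Counter)
    for brand, sentiment in rows:
        counts[brand][sentiment] += 1

    shares = {}
    for brand, counter in counts.items():
        total = sum(counter.values())
        # A Counter returns 0 for labels that never occurred for this brand.
        shares[brand] = {
            label: 100.0 * counter[label] / total for label in SENTIMENT_LABELS
        }
    return shares


if __name__ == "__main__":
    # Invented sample rows, standing in for the scored analysis results.
    sample_rows = [
        ("BrandA", "positive"),
        ("BrandA", "positive"),
        ("BrandA", "negative"),
        ("BrandA", "neutral"),
        ("BrandB", "negative"),
    ]
    for brand, share in sentiment_share_by_brand(sample_rows).items():
        print(brand, share)
```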