We recently attended Qonnections 2012 event in Miami, the annual QlikView partner summit, where we were invited to present in 4 of the sessions and, more importantly, had the great opportunity to meet face to face with a large number of partners and QlikView consultants who have been working with QVSource - many since the very early beta releases.
It was quite evident given the active community of QlikView users on Twitter that there was likely to be a lot of Tweeting going on at the conference and so, on arriving at the Fontainbleau in Miami on the Sunday - rather than going out to the beach only a few metres away and enjoying the sun - we decided it would be fun to see if we could put together a QVSource powered QlikView application to track the traffic.
In this post I hope to give an overview as to how we were able to quickly achieve this in just a couple of hours!
The brief specs we had for the application were:
- Capture Tweets containing the word qonnections.
- Score the tweets for sentiment.
- Retrieve Klout score for all the Tweeters.
- Display a simple 1 page dashboard with some key metrics such as volume of tweets and average sentiment versus day, top tweeters, common hash tags and keywords used.
- Regular refreshing - preferably every 5 minutes so the dashboard is up to date with the state of tweets.
Twitter Search
After creating a blank QlikView application, the first step was to add the Twitter Search load script. To do this we fire up QVSource, open the Twitter Connector and enter the search term we are interested in:
We then move to the 'Connector Data' tab and on the Search table click the option to copy the load script:
And paste this into our QlikView application - the code will look something like this (the field names have been removed for compactness as there are a lot of them):
TwitterConnector_Search:
LOAD
... [long list of fields removed for clarity]
FROM
[http://localhost:5555/QVSource/TwitterConnector/?table=Search&searchTerm=qonnections&searchmaxpages=15&
fixedcolumns=false&format=html]
(html, utf8, embedded labels, table is @1);
Note the highlighted text above illustrating where the search term is. In general, any useful configuration parameters in the QVSource connector are carried through into the QlikView script so that they can easily be updated and manipulated in your load script.
Because the Twitter search will only return a limited number of Tweets (usually up to 1500 or 1 weeks worth of Tweets per query) we then edit this script so that the Tweets are accumulated into a QVD file each time the reload script is run.
TwitterConnector_Search:
LOAD
*
FROM 'Twitter.qvd' (qvd);
TwitterConnector_Search:
LOAD
... [long list of fields removed for clarity]
FROM
[http://localhost:5555/QVSource/TwitterConnector/?table=Search&searchTerm=qonnections&searchmaxpages=15&
fixedcolumns=false&format=html]
(html, utf8, embedded labels, table is @1)
where not exists ('Search_id', id);
STORE TwitterConnector_Search INTO 'Twitter.qvd';
Note that we had to modify this script in two stages, just adding the line to create the QVD during the initial reload and then adding in the line to load from the QVD in future reloads.
Klout Score
The next step is to introduce the Klout score. We can use the Klout Connector UI to generate the template QlikView load script for this (similar to how we did for the Twitter search above) - what we end up with is the following which loops through all the users and calculates the score.
LET noRows = fieldvaluecount('Search_from_user');
for i=1 to $(noRows)
let user = FieldValue('Search_from_user',$(i));
Klout:
LOAD
'$(user)' as Search_from_user,
status as KloutScore_status,
score as KloutScore_score,
fromcache as KloutScore_fromcache
FROM
[http://localhost:5555/QVSource/KloutConnectorV2/?table=UserScore&TwitterId=$(id)&format=html]
(html, utf8, embedded labels, table is @1);
next
Sentiment Analysis/Text Analytics
The final piece is to add in sentiment analysis and keyword extraction. As above we can use QVSource to generate the outline QlikView script for this - we then make a few additions and edits to this to loop through every Tweet in the application and for each we score it for sentiment using two different APIs and extract keywords using one API.
LET noRows = NoOfRows('TwitterConnector_Search');
for i=0 to $(noRows)-1
let id = peek('Search_id',$(i),'TwitterConnector_Search');
let text = peek('Search_text',$(i),'TwitterConnector_Search');
let textEncoded = text;
call urlEncode(textEncoded);
Sentiment:
LOAD
'Repustate' as SentimentEngine,
'$(id)' as Search_id,
status as sentiment_status,
score as sentiment_score
FROM
[http://localhost:5555/QVSource/SentimentAnalyser/?table=Sentiment&SentimentEngine=Repustate&Message=$(textEncoded)&format=html] // specify url param
(html, utf8, embedded labels, table is @1);
Sentiment:
LOAD
'AlchemyAPI' as SentimentEngine,
'$(id)' as Search_id,
status as sentiment_status,
score as sentiment_score
FROM
[http://localhost:5555/QVSource/SentimentAnalyser/?table=Sentiment&SentimentEngine=Alchemy&Message=$(textEncoded)&format=html] // specify url param
(html, utf8, embedded labels, table is @1);
set errormode=0;
SentimentForKeywords:
LOAD
'$(id)' as Search_id,
text as KeywordsWithSentiment_text,
relevance as KeywordsWithSentiment_relevance,
sentiment_type as KeywordsWithSentiment_sentiment_type,
sentiment_score as KeywordsWithSentiment_sentiment_score
FROM
[http://localhost:5555/QVSource/SentimentAnalyser/?table=KeywordsWithSentiment&Message=$(textEncoded)&
SentimentEngine=Alchemy&format=html]
(html, utf8, embedded labels, table is @1);
set errormode=1;
next
Note that the urlEncode method is a function which we must also include at the start of our load script - this function is available on our wiki.
Schema
We are now able to reload the application - which results in the following schema.
We now have the data model and a QlikView application that we can reload at regular intervals to load in new Tweets. Note that QVSource will take care of caching the results of sentiment analysis, keyword extraction and Klout score lookups which will help speed everything up. However, if this were a production application we could refine the script further to improve load performance (for example storing the sentiment results in a QVD and never even requesting the sentiment results from QVSource more than once). However as we only have a couple of hours before the conference starts we figure we can live with this load script!
The Dashboard
We are now into standard QlikView territory, we just need to add a few charts to our application - for example:
Tweets vs day:
Sentiment vs day:
Top Tweeters:
Tweet Feed:
Keywords:
A few more tweaks and additions and an hour later and we have a functional dashboard:
We then upload this to our server and set the reload schedule to reload the application to every 5 minutes:
And we're good to go!
Summary
You can see the application here - what we have actually created is a solution which could easily be used to monitor any event, simply by changing the Twitter search term from 'qonnections' to a hashtag or other word which is likely to be used by attendees at the event - of course we could also search for multiple terms and enhance the application in a hundred other ways - but for only two hours work we were really pleased with the results and feedback we got, including a blog post from Visual Data Group on how it helped them feel like they were at the conference!
If you would like more information about QVSource - The QlikView API Connector - please visit our website.