All the media hype surrounding the use of social networking during the London riots left me wondering what was actually being said on Twitter in the UK. It was also a good opportunity to test out the new MapTube map creation software, which can handle 35,000 clickable points on a map with ease (we tested with 500,000). So the aim was to create a map of UK tweets that I could explore around the areas where the riots were happening, to see what people were saying about them on Twitter.
In order to do this, I used the Twitter client that Steven Gray wrote to collect geocoded tweets from Twitter. This has been used for things like the real-time heatmap of London 2012 #1yeartogo tweets: http://bigdatatoolkit.org/
The resulting map can be seen below:
http://www.maptube.org/map.aspx?s=DHxSpVYxbLGkFyNsERbBwcCnVsChF9 (link to live map)
Once the data had been collected as a CSV file containing “UserId”, “Time”, “Tweet”, “lat” and “lon”, the processing was done using Excel. I will cover this in more detail in a separate blog post, but I created columns of riot-related hashtags and cleanup-related hashtags. These were then combined into a colour code based on whether the tweet was a general tweet (blue), contained a riot tag (red), contained a cleanup tag (green) or contained both riot and cleanup tags (yellow).
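For anyone who would rather script this step than use Excel, the colour coding can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the spreadsheet I actually used: the tag lists RIOT_TAGS and CLEANUP_TAGS and the function name colour_for are made up for the example, and the real lists of hashtags were different.

```python
import re

# Hypothetical tag lists for illustration only; the hashtags actually
# used in the Excel processing are not listed in this post.
RIOT_TAGS = {"#londonriots", "#ukriots"}
CLEANUP_TAGS = {"#riotcleanup"}

HASHTAG_RE = re.compile(r"#\w+")


def colour_for(tweet_text):
    """Return the map colour for one tweet, following the four-way rule."""
    # Compare hashtags case-insensitively.
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet_text)}
    has_riot = bool(tags & RIOT_TAGS)
    has_cleanup = bool(tags & CLEANUP_TAGS)
    if has_riot and has_cleanup:
        return "yellow"
    if has_riot:
        return "red"
    if has_cleanup:
        return "green"
    return "blue"
```

Applied to each value in the “Tweet” column of the CSV, this gives one colour per row, which is all the map needs alongside “lat” and “lon”.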
There are full details in the “more information” link on the live map, but I collected 34,314 geocoded tweets in the period, of which 1,330 contained riot hashtags and 87 contained cleanup hashtags.
What’s interesting about this map is that I was expecting more tweeting about the riots. The 1,330 riot-tagged tweets are less than 4% of all geocoded tweets.
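As a quick check of the proportions, using the counts reported above (34,314 geocoded tweets, 1,330 with riot hashtags and 87 with cleanup hashtags), the arithmetic looks like this. The variable names are just for the example.

```python
# Counts taken from the figures quoted in this post.
total_tweets = 34314
riot_tweets = 1330
cleanup_tweets = 87

# Share of all geocoded tweets, as percentages.
riot_pct = 100 * riot_tweets / total_tweets
cleanup_pct = 100 * cleanup_tweets / total_tweets

print(f"riot-tagged: {riot_pct:.2f}%")
print(f"cleanup-tagged: {cleanup_pct:.2f}%")
```

Riot-tagged tweets come out a little under 4% of the total, and cleanup-tagged tweets at roughly a quarter of one percent.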
Where this map really comes into its own is the ability to click on messages around the riot areas and see what people are saying. I think what this highlights is that you need some sort of natural language processing, as there is obviously a lot of discussion about the riots that does not use hashtags. People are tweeting that they or their children are scared because it sounds really close, or tweeting to let people know they have arrived home safely.
The other interesting thing for natural language processing researchers is that Twitter has a language all of its own. When you start reading some of the tweets, it’s obvious that the contractions and slang being used will be a challenge to understand.
One thing I need to fix in MapTube is that it returns too much information on the popup when you click on a location. The point and click functionality returns all points covering the area of your click. If you are zoomed out a long way, then this can be hundreds of points on the speech bubble popup which causes the client browser a number of problems. I think limiting to around 20 returned points would be a safer option.
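A rough sketch of what that limit might look like is below. This is not MapTube's actual code: the function name points_for_popup, the dictionary layout of each point and the radius_deg parameter are all assumptions for the example, and it measures distance in plain degrees of latitude and longitude rather than on a proper map projection. The idea is simply to keep the points nearest the click and drop the rest.

```python
import math

MAX_POPUP_POINTS = 20  # the cap suggested above


def points_for_popup(click_lat, click_lon, points, radius_deg,
                     limit=MAX_POPUP_POINTS):
    """Return at most `limit` points within `radius_deg` of the click,
    nearest first. Each point is a dict with "lat" and "lon" keys."""
    hits = []
    for point in points:
        # Simplification: straight-line distance in degrees.
        distance = math.hypot(point["lat"] - click_lat,
                              point["lon"] - click_lon)
        if distance <= radius_deg:
            hits.append((distance, point))
    # Sort on distance only, so the dicts themselves are never compared.
    hits.sort(key=lambda pair: pair[0])
    return [point for _, point in hits[:limit]]
```

Sorting by distance before truncating means that when you are zoomed out a long way, the popup still shows the tweets closest to where you actually clicked rather than an arbitrary 20.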