Trackernet: Where are all the Tube Trains?

This is starting to become obsessive, but I can’t help wondering how many trains are running on the London Underground and where they all are. The Trackernet web service released by TfL allows you to see all the running boards for stations on a line, but doesn’t tell you where all the trains are. I did an earlier post about just the Victoria line trains, but I’ve now built this into a web service that works out locations for trains on the whole network.


Trains on the London Underground network for 11:30am on 30th November 2011

The map colours follow the normal line colours: District (green), Victoria (light blue), Central (red), Northern (black), Bakerloo (brown), Jubilee (grey), Piccadilly (dark blue) and Waterloo and City (light green). Note that Circle and Hammersmith and City trains are both shown as yellow and there are no pink markers on the map. This is because the Trackernet API does not distinguish between Circle and Hammersmith and City trains and both lines are queried in a single web request, so they are difficult to separate.

The idea is to build this into a web service and publish it on MapTube as a real-time Tube map. Using the locations of trains and the time to station information we can build a model of whether a line is running normally and where delays are occurring.

The basic technique behind the position calculation uses the time to station information from the running boards at every station on the route to find the minimum time for every unique train. This is taken as the most accurate estimate, and the train's position is interpolated between the last and next stations based on that time. Working out which line a train is on is actually much harder, because multiple lines can share platforms at the same station. For example, query the Piccadilly line and the District line and the resulting data will contain Barons Court for both, so you have to separate the Piccadilly trains from the District trains and make sure you don't count the same ones twice.
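To make the interpolation step concrete, here is a minimal C# sketch. The method and parameter names are illustrative rather than the actual service code, and the typical run time for the link between the two stations is assumed to be known.

[csharp]//Minimal sketch of the interpolation step; names are illustrative, not the actual service code.
//linkSeconds is the typical run time between the two stations; secondsToNext comes from the running board.
public static void InterpolatePosition(
    double lastLat, double lastLon, double nextLat, double nextLon,
    double linkSeconds, double secondsToNext,
    out double lat, out double lon)
{
    //fraction of the link already covered: 0 = just left the last station, 1 = arriving at the next
    double f = 1.0 - Math.Min(secondsToNext / linkSeconds, 1.0);
    lat = lastLat + f * (nextLat - lastLat);
    lon = lastLon + f * (nextLon - lastLon);
}[/csharp]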

Now that the code can handle the Underground network, the next steps are to do the same for National Rail, London Buses and London River Services.

If you’re interested in live train data, it’s also worth looking at the following site that was created by Matthew Somerville: http://traintimes.org.uk/map/tube

FBX Exporters Part 4

In the previous three parts, I outlined the plan for getting geometry from MapTube via C# into an FBX file using the C++ SDK provided by Autodesk. This final part shows data in a world map exported from MapTube and imported into 3DS Max.

The above image shows the world countries 2010 outline from MapTube. There are a few countries missing, as there was no data for them in the original dataset, so they show as blank on a MapTube map and their geometry is not exported. This is more visible in the perspective view, where you can see the holes in Africa and the Middle East:

The exported geometry will eventually be coloured in the same way as MapTube, but for the moment all the geometry objects have a green material assigned to them.

The final export file can be downloaded from the following link: MyFBXExport


It’s worth pointing out that I’ve had a number of problems with the Quicktime FBX plugin that comes with the Autodesk FBX SDK. It seems to crash every time I close it, and when displaying the above file there are some significant problems with how it renders the geometry, most notably around the Hudson Bay area in Canada, parts of Europe and much of Russia. As the file displays fine in Max, I can only assume this is a limitation of the Quicktime FBX renderer. I’ve also had to rescale the geometry: it is exported in the Google Mercator projection in metres, which produces coordinate values too large for Max to handle.

To recap on how this process works, here are the development steps needed to achieve it:

1. A C# program, which is a modification of the MapTubeD tile rendering procedure, reads the geometry and data from the MapTube server and returns it as an enumeration of SqlGeometry objects.

2. Each SqlGeometry object is simplified using the Reduce operation as we don’t need the full level of detail in the output FBX file.

3. The C# program uses native methods in a C++ DLL, which I’ve written to control the FBX exporter in Autodesk’s SDK. A handle to the FBX document and scene that we want to create is obtained, then a native “AddGeometry” function is called on every geometry object until a final “WriteFile” function is called (see the sketch below). The geometry is passed using the OGC well known binary format, which is an efficient way of passing large blocks of complex geometry and is also independent of byte ordering.

The DLL which does the actual export is a 32-bit program with functions exported using C names rather than decorated C++ names, to make it easy to link to the C# function stubs. Internally, I’m using the GEOS library to parse the well known binary geometry, extract polygons and write the points to the FBX scene hierarchy.
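On the C# side, this boils down to a handful of P/Invoke stubs. The sketch below shows the general shape; the DLL name and the exact signatures are assumptions based on the description above, not the real exports.

[csharp]using System.Collections.Generic;
using System.Runtime.InteropServices;
using Microsoft.SqlServer.Types;

//Sketch of the C# stubs for the native exporter DLL; "FBXExport.dll" and the
//signatures are assumptions based on the description above, not the real exports.
public static class FBXExporter
{
    [DllImport("FBXExport.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void AddGeometry(byte[] wkb, int length);

    [DllImport("FBXExport.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void WriteFile(string filename);

    //Pass every simplified SqlGeometry to the native side as well known binary, then write the file.
    public static void Export(IEnumerable<SqlGeometry> geoms, string filename)
    {
        foreach (SqlGeometry g in geoms)
        {
            byte[] wkb = g.STAsBinary().Value; //OGC well known binary
            AddGeometry(wkb, wkb.Length);
        }
        WriteFile(filename);
    }
}[/csharp]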

That’s the proof of concept to demonstrate that this method works. The aim now is to see what we can do with geographic data once we have the ability to load it into art tools like Max and Maya, game engines like Unity, or frameworks like XNA.


FBX Exporters Part 3

The first two parts of the FBX export process dealt with getting the FBX SDK working and exporting some simple geometry. Now what’s required is the ability to pass complex geometry from the MapTube side using C# over to the FBX side using C++. The obvious way to do this is to pass the geometry in the OGC well known binary format (WKB), so I’ve been looking at GEOS which is a C++ port of the Java Topology Suite (JTS). I’ve managed to use this in conjunction with the FBX exporter to create simple geometry from WKT which I’ve loaded into 3DS Max.

One of the problems I had was building a debug version of GEOS version 3.3.1 as the instructions aren’t quite right. The make command for a debug build is:

nmake /f makefile.vc BUILD_DEBUG=YES

As I’m using Visual Studio 2008, I had to run “autogen.bat” first to create the required header files, and also make sure I do a clean between the release build and the debug build. Once this library was built successfully, I could use the WKT reader to read in some test geometry and build an FBX exporter around it.

[c language="++"]
#include <iostream>
#include <geos.h> //all-in-one GEOS C++ header (geometry and io classes)
using namespace std;
using namespace geos::geom;
using namespace geos::io;

string version = geos::geom::geosversion();
cout<<version<<endl;
//geom::Geometry* geos::io::WKBReader::read ( std::istream & is );
std::string poly("POLYGON ((30 10 0, 10 20 10, 20 40 20, 40 40 30, 30 10 0))");
cout<<poly<<endl;
WKTReader* reader = new WKTReader();
Geometry* geom = reader->read(poly);
delete reader;
[/c]

The entire FBX exporter is too big to replicate here, but the part that extracts the geometry from the GEOS geometry object and creates the FBX control points is as follows:

[c language="++"]
//create control points
int NumPoints = geom->getNumPoints();
lMesh->InitControlPoints(NumPoints);
KFbxVector4* lControlPoints = lMesh->GetControlPoints();
CoordinateSequence* coords = geom->getCoordinates();
for (int i=0; i<NumPoints; i++)
{
    //copy the x, y and z ordinates of each GEOS coordinate into an FBX control point
    lControlPoints[i] = KFbxVector4(coords->getOrdinate(i,0), coords->getOrdinate(i,1), coords->getOrdinate(i,2));
    cout<<coords->getOrdinate(i,0)<<","<<coords->getOrdinate(i,1)<<","<<coords->getOrdinate(i,2)<<endl;
}
[/c]

The only other thing I’ve done is to create a material for the polygon so it shows up as red in 3DS Max.

Now that I’ve demonstrated that all the component parts work, the final stage of getting geometry from MapTube into 3DS Max will be to write a C++ library on top of the FBX exporter and GEOS which can be used as a native library from C#.

Exporting Geographic Data in FBX Files

I’ve been looking at how to export the geographic information contained in a MapTube map into an art tool like 3DS Max or Maya. The reason for this is firstly to make it easier to produce high quality geographic presentations, but also, by employing a recognised art tool chain, to get the data into 3D visualisation systems built around XNA (Xbox) or Unity.

Originally, I was going to implement a 3DS exporter as this is a well-used format that would allow geometry to be imported by Google Sketchup, Blender, or a long list of professional art tools. After coming across Autodesk’s FBX SDK, I decided to create an FBX exporter instead. Although this is a format that can’t be loaded by either Sketchup or Blender, the SDK is quite flexible and can also export Collada (DAE) and Wavefront OBJ files which the free tools can import. In addition to this, it can be imported by both Unity and XNA.

Autodesk supply a viewer plugin for Quicktime, but I had some problems getting this to work with my first export attempts. The example below shows a simple screenshot:

Although a flat black plane on a grey background isn’t fantastic for a first attempt, it took a while to get this far, as the examples don’t tell you that the Quicktime viewer doesn’t like ASCII format FBX files and that you have to change the example format to BINARY.

[c language="++"]
//altered export from ASCII to Binary
int lFormatIndex, lFormatCount = pSdkManager->GetIOPluginRegistry()->GetWriterFormatCount();

for (lFormatIndex=0; lFormatIndex<lFormatCount; lFormatIndex++)
{
    if (pSdkManager->GetIOPluginRegistry()->WriterIsFBX(lFormatIndex))
    {
        KString lDesc = pSdkManager->GetIOPluginRegistry()->GetWriterFormatDescription(lFormatIndex);
        printf("%s\n", lDesc.Buffer()); //print out format strings
        //char *lASCII = "ascii";
        char *lBinary = "binary";
        if (lDesc.Find(/*lASCII*/lBinary) >= 0)
        {
            pFileFormat = lFormatIndex;
            break;
        }
    }
}
[/c]

This is a copy of the “ExportDocument” example that comes with the SDK, but with the type changed to binary to allow it to load.

The next problem is learning how to create my own geometry and figuring out a way of connecting the native C++ library to the managed C# code used by MapTube. My initial thought was to create a managed wrapper for the FBX SDK and use marshalling but, on further examination of the SDK, it’s much too complicated to do in any reasonable amount of time. So, plan B is to write the export code as a native C++ library, expose enough methods to allow it to be controlled through marshalling and interop from the C# code, and do the FBX export through that route. This only depends on being able to marshal the large amount of geometry data, but that should be possible to work out.

After these first experiments, it’s looking like the pattern will be something like a reader/writer object with a choice of export formats as FBX, Collada or OBJ to allow the assets to be loaded into as many art packages as possible.

The next post will cover the generation of the geometry and its export to FBX.

Trackernet: The Victoria Line

I’ve been meaning to look at TfL’s Trackernet API for a while now. It works through a REST-based web service which gives access to all the London Underground running boards on a line-by-line basis. You issue an HTTP request of the form:

http://cloud.tfl.gov.uk/TrackerNet/PredictionSummary/V

and the result is an XML file containing train information for every station on the Victoria Line. Substitute “B” for “V” and you get the Bakerloo line instead. I had managed to figure out a way to get approximate train locations when the Victoria Line got suspended one morning, so I couldn’t resist looking to see where all the trains had ended up:

According to my data, there are 25 trains on the line. The way the positions are calculated is quite complicated, as the original information comes from the running boards for every station and the time to platform estimates. Trains are uniquely identified by a train number and a set number, which form a composite key. I simply iterate through all the data for every station and take the lowest time to station for every train, which gives me the train’s next station. Then I use the location code provided by the API and the time to station estimate to interpolate between the last station and the next station.
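The “lowest time” step is essentially a group-by on the composite key. Here is a minimal LINQ sketch; the record and property names are illustrative rather than the actual code.

[csharp]//Sketch of picking each train's best estimate; names are illustrative.
//boardRecords holds one record per (station, approaching train) parsed from the prediction XML.
var bestEstimates =
    from r in boardRecords
    group r by new { r.SetNumber, r.TrainNumber } into train //composite key identifying a unique train
    select train.OrderBy(r => r.SecondsToStation).First();   //lowest time to station = next station
[/csharp]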

One feature worth noting is that, because the time to station is given for every station along the train’s whole route, you can use the data to build up a dataset of the time required to travel between any pair of stations. Also, because the information is processed from the running boards, the program should be able to process National Rail train locations from the information on their website.
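The station-pair idea follows from sorting one train’s predictions by time to station and differencing successive entries; a minimal sketch, again with illustrative names:

[csharp]//Sketch: derive inter-station run times from one train's ordered predictions (illustrative names).
var runTimes = new Dictionary<string, int>();
var route = trainRecords.OrderBy(r => r.SecondsToStation).ToList();
for (int i = 1; i < route.Count; i++)
{
    //the difference between successive predictions approximates the run time for that link
    string link = route[i - 1].StationCode + "->" + route[i].StationCode;
    runTimes[link] = route[i].SecondsToStation - route[i - 1].SecondsToStation;
}[/csharp]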

Using only the information provided in the XML response from the API means that I can construct a web service that doesn’t require any state information to be retained between calls. In addition, it doesn’t require any knowledge of the tube network or how the stations are connected together.

This is still very much a prototype, but once it’s working for all the lines, it will be released as a real-time feed on MapTube.

Twitter Maps and #ukriots

With all the media hype surrounding the use of social networking and the London riots, it left me wondering what was actually being said on Twitter in the UK. It was also a good opportunity to test out the new MapTube map creation software which can handle 35,000 clickable points on a map with ease (we tested with 500,000). So, the aim was to create a map of UK tweets which I could explore around the areas where the riots were happening to see what people were saying about them on Twitter.

In order to do this, I used the Twitter client that Steven Gray wrote to collect geocoded tweets from Twitter. This has been used for things like the real-time heatmap of London 2012 #1yeartogo tweets: http://bigdatatoolkit.org/

The resulting map can be seen below:

Tweets captured from Twitter between 15:00 and 22:00 on Tuesday 9th August 2011. See text for further details.

http://www.maptube.org/map.aspx?s=DHxSpVYxbLGkFyNsERbBwcCnVsChF9 (link to live map)

Once the data had been collected as a CSV file containing “UserId”, “Time”, “Tweet”, “lat” and “lon”, the processing was done in Excel. This will feature in a separate blog post in more detail, but I created columns of riot-related hashtags and cleanup-related hashtags. These were then combined into a colour code based on whether the tweet was a general tweet (blue), contained a riot tag (red), contained a cleanup tag (green) or contained both riot and cleanup tags (yellow).
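Expressed in code, the colour coding looks something like the following minimal C# sketch; the real processing was done with formula columns in Excel, and the tag lists here are examples only.

[csharp]//Sketch of the colour coding; the actual processing used Excel formula columns.
string[] riotTags = { "#ukriots", "#londonriots" };   //example riot-related tags
string[] cleanupTags = { "#riotcleanup" };            //example cleanup-related tags

string ColourCode(string tweet)
{
    string t = tweet.ToLowerInvariant();
    bool riot = riotTags.Any(tag => t.Contains(tag));
    bool cleanup = cleanupTags.Any(tag => t.Contains(tag));
    if (riot && cleanup) return "yellow";
    if (riot) return "red";
    if (cleanup) return "green";
    return "blue"; //general tweet
}[/csharp]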

There are full details in the “more information” link on the live map, but I collected 34,314 geocoded tweets in the period, of which 1,330 contained riot hashtags and 87 contained cleanup hashtags.

What’s interesting about this map is that I was expecting more tweeting about the riots: 1,330 is less than 4% of all geocoded tweets.

Where this map really comes into its own is the ability to click on messages around the riot areas and see what people are saying. I think what this highlights is that you need some sort of natural language processing, as there is obviously a lot of discussion about the riots that doesn’t use hashtags. People are tweeting that they or their children are scared because it sounds really close, or tweeting to people to tell them they have arrived home safely.

The other interesting thing for natural language processing researchers is that Twitter has a language all of its own. When you start reading some of the tweets, it’s obvious that the contractions and slang being used will be a challenge to understand.

One thing I need to fix in MapTube is that it returns too much information in the popup when you click on a location. The point and click functionality returns all points covering the area of your click. If you are zoomed out a long way, this can mean hundreds of points in the speech bubble popup, which causes the client browser a number of problems. I think limiting it to around 20 returned points would be a safer option.

Making Maps From Wikipedia

I’ve been looking at web-based sources of geographic data and Wikipedia links are something I’ve wanted to try out for a while. I found the following page containing Worldwide fossil sites:

List of fossil sites

This gives a list of sites, but with no locations:

The “Site” column contains href links which can be followed to pages like the following:

The coordinates can just be made out in the top right hand corner of the page.

As all Wikipedia pages follow a common theme, the coordinates are embedded in a <span> tag with class=”geo”. I already had a Java program for loading a web page and converting it into XHTML, so I used this to turn the original list page into a CSV file by extracting the data from the HTML tables. One of the columns in this file contained the links to the site-specific pages, so another program was written to follow all these links and extract the location from each site page.
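The coordinate scrape itself is straightforward. The original tool was written in Java, but a minimal C# equivalent might look like the following; the regular expression assumes the “lat; lon” layout of the geo span.

[csharp]//C# sketch of the coordinate extraction (the original tool was written in Java).
//Wikipedia's geo microformat stores coordinates as e.g. <span class="geo">51.48; -0.29</span>
using System.Text.RegularExpressions;

Match m = Regex.Match(html, "<span class=\"geo\">\\s*(-?[0-9.]+);\\s*(-?[0-9.]+)\\s*</span>");
if (m.Success)
{
    double lat = double.Parse(m.Groups[1].Value);
    double lon = double.Parse(m.Groups[2].Value);
}[/csharp]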

While the general technique works, only about 25% of the sites had links to pages with lat/lon coordinates, so the final map is somewhat incomplete, but this can be edited manually. The map itself was built using MapTube’s new map creation system, which works from the CSV file of data. The final map can be viewed at the following link:

Fossil Sites

New MapTube Map Creation Feature

We released a new feature on the MapTube website today which will make it easier to create new maps from data in CSV files. The underlying technology is used on the SurveyMapper site and for other real-time visualisations like http://bigdatatoolkit.org/2011/07/26/1yeartogo/ which shows tweets using the #1yeartogo hashtag for the London 2012 Olympics.

Creating a map of abandoned vehicles from the London Datastore using MapTube

The new update to MapTube adds a graphical user interface which allows the user to upload a data file, choose a colour scale and publish the map on MapTube directly. One of the driving forces behind this was the idea that creating a map should be simple enough that you could do it using an iPad. Data on the London Datastore  is in the correct format, so you can copy the CSV link directly from the site, which is exactly what has been done in the above image. I’ve created a YouTube clip showing the whole process, which can be viewed at the following link:

http://www.youtube.com/watch?v=naaSv7ihGOQ

This feature is still experimental, but at the moment it handles point data in lat/lon coordinates (WGS84) or OS coordinates for the UK (OSGB36). Point data can be drawn using markers, or as a heatmap showing point density. For area data, one column in the data is selected as a key field and this is joined with the geographic data stored in MapTube’s database to draw the map. For example, using the following data:

We have four columns: Constituency, Party, PartyCode and Change. In the CSV file the first line must be the column headings, then every subsequent line contains data. The CSV file would contain the following:

Constituency,Party,PartyCode,Change
Aberavon,LAB,1,LAB Hold
Aberconwy,CON,2,CON Gain
etc...

The “Constituency” column is the area key in this case, but MapTube determines this automatically when the CSV file is loaded, along with the type of geography, which is Parliamentary Constituencies. In order to colour the map, numeric data is required, so in this example a column labelled “PartyCode” has been added where “LAB”=1, “CON”=2, “LD”=3 etc.

The colour scale is then chosen and the finished map submitted to MapTube where it can be viewed along with any of the other maps. There are help pages accessible through the ‘i’ icon on each section which contain further information.

As mentioned before, this feature is still experimental and we will be gradually adding more geographic data to the MapTube database to allow maps to be built from additional geographies. The aim is for MapTube to be able to automatically detect the geography just by analysing the data and, at the moment, the following geographies can be used:

Government Office Regions (UK) (GOR)
Lower level super output areas (UK) (LSOA)
Medium level super output areas (UK) (MSOA)
Output Areas (UK) (OA)
Postcode Districts (UK) (PostcodeDistricts)
County and Unitary Authority (UK) (CountyUA and ONSCountyUA)
Districts (UK) (Districts and ONSDistricts)
Census Area Wards (UK) (CASWards)
World Borders 2010 (WorldBorders2010ISO2 and ISO3 using the ISO country codes)
Parliamentary Constituencies 2010 (UK) (PCON2010)

US States and Zip code areas will be added shortly, along with administrative and Census boundaries for other parts of the World.

Weather Underground

I’ve been looking at the Weather Underground API (http://wiki.wunderground.com/index.php/API_-_XML) which gives access to the observation stations and the data they are collecting.

All the stations returned from the Weather Underground XML API when using "London" as the search string. Colour indicates air temperature with blue=12.7C, green=13.9C and red=20.5C

The API uses simple commands to query for a list of stations, for example:

http://api.wunderground.com/auto/wui/geo/GeoLookupXML/index.xml?query=london,united+kingdom

Using C# and .NET, this is accomplished as follows:
[csharp]using System.Net;
using System.Xml;

//GeoLookupXML is the lookup URL above with the query value replaced by the {0} placeholder
WebRequest request = WebRequest.Create(string.Format(GeoLookupXML, @"london,united+kingdom"));
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
XmlDocument doc = new XmlDocument();
doc.Load(response.GetResponseStream());[/csharp]
Then the returned XML document is parsed using XPath to extract the station name, lat/lon coordinates and whether it is an ICAO station or a personal weather station.
[csharp]XmlNodeList Stations = doc.GetElementsByTagName("station");
foreach (XmlNode Station in Stations)
{
    XmlNode IdNode = Station.SelectSingleNode("id");
    XmlNode ICAONode = Station.SelectSingleNode("icao");
}[/csharp]
This gets us a list of station ids and ICAOs which can then be used to build individual queries to obtain real-time data from every station:
[csharp]foreach (string Id in PWSStations)
{
    XmlDocument ob = GetCurrentPWSOb(Id);
    XmlNode Ntime = ob.SelectSingleNode(@"current_observation/observation_time_rfc822");
    XmlNode Nlat = ob.SelectSingleNode(@"current_observation/location/latitude");
    XmlNode Nlon = ob.SelectSingleNode(@"current_observation/location/longitude");
    XmlNode NairtempC = ob.SelectSingleNode(@"current_observation/temp_c");
    string time = Ntime.FirstChild.Value;
    string airtempC = NairtempC.FirstChild.Value;
    string lat = Nlat.FirstChild.Value;
    string lon = Nlon.FirstChild.Value;

    //do something with the data…
}

//NOTE: only slight difference in xml format between PWS and ICAO
foreach (string ICAO in ICAOStations)
{
    XmlDocument ob = GetCurrentICAO(ICAO);
    XmlNode Ntime = ob.SelectSingleNode(@"current_observation/observation_time_rfc822");
    XmlNode Nlat = ob.SelectSingleNode(@"current_observation/observation_location/latitude");
    XmlNode Nlon = ob.SelectSingleNode(@"current_observation/observation_location/longitude");
    XmlNode NairtempC = ob.SelectSingleNode(@"current_observation/temp_c");
    string time = Ntime.FirstChild.Value;
    string airtempC = NairtempC.FirstChild.Value;
    string lat = Nlat.FirstChild.Value;
    string lon = Nlon.FirstChild.Value;

    //do something with the data…

}[/csharp]
After that it’s simply a matter of writing all the data to a CSV file so that you can do something with it.
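For completeness, here’s a minimal sketch of that last step; the Observation holder class and its properties are illustrative.

[csharp]using System.IO;

//Sketch: write the collected values to a CSV file; Observation is an illustrative holder class.
using (StreamWriter writer = new StreamWriter("observations.csv"))
{
    writer.WriteLine("time,lat,lon,airtempC");
    foreach (Observation o in observations)
        writer.WriteLine(o.Time + "," + o.Lat + "," + o.Lon + "," + o.AirTempC);
}[/csharp]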

Air temperature for London plotted using the MapTubeD heatmap tile renderer

A Week in the Life of a Tile Server

Recently, BBC Look East have been running a “Broadband Speed Survey”, asking people to use an online tester to check their broadband speed and then enter the value, along with their postcode, into SurveyMapper. This generated 16,311 responses to the survey, but each respondent also views the map containing the latest data, so the tile server drawing the data on the map receives about 100 times as many hits.

When the survey was advertised on the 18:30 news bulletin on the Tuesday that week, we started to get a huge number of hits in a very short space of time. The following graph shows the hits by hour of day for all five days that week.

The peaks tie in quite well with the 18:30 and 22:30 news bulletins, but it can be seen from the statistics that the tile server took over a million hits in the space of a couple of hours. The tile server itself is a single machine running Server 2008 R2 Core, virtualised with two processors assigned. Once it became apparent how many hits we were getting, this was increased to four processors and 4GB of RAM. This shows the main benefit of virtualisation for us: we could shut down non-operational machines used purely for research and divert the computing power to the operational web servers which were taking the high loads.

In order for the maps on SurveyMapper to work, we are also dependent on a database server and the dedicated web server which runs the MapTube and SurveyMapper sites, in addition to the tile server. What’s interesting about this experience is that it taught us that the database server is capable of handling a much higher load than this.

From the graph of the daily hits, it can be seen that most of the traffic was on Tuesday 22nd February, which is the first day it was advertised on the news. After this it tails off as the week progresses. One other interesting thing that was noticed when analysing the log files is the browser and operating system statistics.

Browsers used to access SurveyMapper


Operating Systems

So, from these statistics, it’s a three-way split between Windows XP, Vista and 7, with IE8 the most popular browser. Chrome, Firefox and Safari are lagging behind, which is surprising bearing in mind the proliferation of Macs.

Now that we’ve proved a single IIS 7.5 server can take a million hits, we’re looking into the possibility of creating multiple tile servers distributed across two virtualisation servers with load balancing.