More Dopplr Raumzeitgeist detail: Cluster Cities
We got a great reaction to our Raumzeitgeist post, including a lot of queries about the data-set and, unfortunately, some confusion over what we were showing in the main image.
Most people assumed (probably due to me omitting the details - my bad) that we had plotted every trip that people had made last year.
Here’s the amazing part - even with the coverage of the globe shown on that map, it’s not nearly everywhere Dopplr travellers went!
What we are showing are what we call ‘Cluster Cities’, which are the basis of some new functionality you might have already noticed if you’re a Dopplr user who’s been studying your journal feeds/emails closely.
Rewind to Reboot, Copenhagen at the end of May 2007.
We’re sitting on the grass in the sunshine with a bunch of early Dopplr users, including Stowe Boyd and Stephanie Booth - when Stephanie is the first to voice something we’ve heard a lot from Dopplr users since: “make my trips more ‘fuzzy’”.
By which she and others meant that they would like to see coincidences in the ‘social spacetime’ surrounding their trip - i.e. “show me if there are going to be people I know near the stated destination of my trip when I’m going to be there, as I’d probably like to change my plans a little to see them.”
This is a cornerstone of our goal to help optimise travel for Dopplr users - surfacing information about such near coincidences to let them judge whether to alter their plans to make their trip more worthwhile.
We’re going to be releasing a lot of functionality to exploit fuzzy, social spacetime through the early part of 2008, but the first part of it has leaked out into the journal.
Here’s an example.

Previously, if I’d added a trip to San Francisco, I’d have missed out on seeing coincidences with people who live in Oakland, or are visiting Berkeley for instance. Now we fuzz up the edges of your trip’s destination to include those nearby, so you can make that valuable visit on the BART…

Cluster Cities are the way we’ve made this happen. To explain them, here’s Matt B. with the science bit!
To make the database queries perform well enough to implement this feature, we needed to classify cities in densely populated areas into groups. By considering groups of cities as one, we cut down the work the database has to do when calculating who is affected by someone arriving in their area. We decided that these groups should be small enough that a traveller could reasonably expect to travel between any two cities in a group within a day.
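To illustrate why grouping helps, here is a minimal sketch in Python, not our production code. The `CLUSTER_OF` table and the `coincidences` function are hypothetical names made up for this example, and the sketch ignores trip dates entirely. Once every city maps to a cluster, finding who is "nearby" becomes a simple equality match on the cluster rather than a distance calculation per pair of travellers.

```python
from collections import defaultdict

# Hypothetical city -> cluster centre table (illustrative values only).
CLUSTER_OF = {
    "San Francisco": "San Francisco",
    "Oakland": "San Francisco",
    "Berkeley": "San Francisco",
    "London": "London",
}


def coincidences(trips):
    """Return clusters that contain more than one traveller.

    trips is an iterable of (traveller, city) pairs. Cities missing from
    CLUSTER_OF are treated as their own single-city cluster.
    """
    by_cluster = defaultdict(set)
    for traveller, city in trips:
        by_cluster[CLUSTER_OF.get(city, city)].add(traveller)
    # Keep only clusters where at least two travellers coincide.
    return {
        cluster: sorted(people)
        for cluster, people in by_cluster.items()
        if len(people) > 1
    }
```

With this, a trip to Oakland and a trip to San Francisco land in the same bucket, which is the fuzziness described above.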
Algorithms to cluster a spatial dataset are well known and not hard to implement. Unfortunately, they take a bit of tuning and experimentation to achieve satisfying results. Intuitively we expect cities like London, Tokyo and San Francisco to be at the centres of their clusters. In reality it’s rather hard to teach the cultural/social/economic conditions that cause this to an algorithm that’s only looking at latitude and longitude.
After some initially disappointing results, I stopped looking solely at the geographical data and considered what I could do if I incorporated the historical trip data that Dopplr has built up over our first year. I quickly came to the obvious conclusion: weight the clustering by the popularity of trip destination and let our travellers decide whether San Francisco or San Jose is the gravitational centre of Silicon Valley.
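The popularity weighting can be sketched very roughly as a ranking step. This is an assumption about the shape of the idea rather than the code we actually run: the `rank_destinations` name and the `(traveller, destination)` record format are invented for the example.

```python
from collections import Counter


def rank_destinations(trips):
    """Return (destination, trip_count) pairs, most visited first.

    trips is an iterable of (traveller, destination) pairs.
    """
    counts = Counter(destination for _, destination in trips)
    # most_common() orders by count, highest first.
    return counts.most_common()
```

The city at the top of the list for a region is the one travellers have effectively voted for as its centre.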
In analysing the top 2000 destinations I discovered that many of the top cities are very close together; for example, Glasgow and Edinburgh are only 40 miles apart. Again I used our trip data to eliminate overlaps. Within any 50-mile radius, only the more popular of two popular cities gets to be the cluster centre. This decision is one reason for the beauty of our central Raumzeitgeist visualisation. The layout has an appealing rhythm to it because the points in popular areas are a natural and fairly efficient circle packing.
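One way to express the overlap rule is a greedy pass over cities in descending order of popularity, keeping a city as a centre only if no already-chosen centre lies within the radius. The sketch below assumes great-circle (haversine) distance and an input of `(name, (lat, lon), trip_count)` tuples; the function names and that format are hypothetical, and the real implementation may well differ.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def pick_centres(cities, radius_miles=50.0):
    """Greedily choose cluster centres, most popular cities first.

    cities is a list of (name, (lat, lon), trip_count) tuples. A city
    becomes a centre only if every centre chosen so far is farther away
    than radius_miles.
    """
    centres = []
    for name, coords, _count in sorted(cities, key=lambda c: c[2], reverse=True):
        if all(haversine_miles(coords, c[1]) > radius_miles for c in centres):
            centres.append((name, coords))
    return [name for name, _ in centres]
```

Because centres can never sit closer than the radius to one another, the chosen points spread out evenly in busy regions, which fits the circle-packing look of the map.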
The map of Cluster Cities visited in 2007 is what we’re using in the Raumzeitgeist image. We’ve made a KML file that you can download here and explore in Google Earth, and we hope you create some images / visualisations of your own from the data.
Aaron Straup Cope, who helped me create the main image, also did some fun visualisations by mashing it up through our API with Dopplr’s city colour-coding (as described previously), and I’ve used them to illustrate this post.
If you make something shiny - tag it “raumzeitgeist”, or let us know and we’ll try to blog it here.
Have fun!



nice work, i especially like the visualization. what software generates these things and how would a map look, that shows the actual routes? is there a way to modify the bars to reflect the number of trips taken to?
thanks!