Re-coloring Illustrator graphics based on JSON data files

When working on choropleth maps or charts in Illustrator, sometimes the (final) data is not yet available by the time you’re designing the graphic.

The typical work-around is to re-import the updated part of the graphic and align it with the rest of the artwork. But this is tedious work, especially if you’re dealing with multiple maps.

To address this problem I wrote an Illustrator script that can re-color the artwork based on a JSON file. This blog post will walk you through how to use the script.
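
To make the idea concrete, here is a minimal Python sketch of the data side only: turning a JSON file into one fill color per region. It is not the Illustrator script itself. The JSON layout (a flat object mapping region IDs to numbers), the function names and the two end colors are all assumptions made for illustration.

```python
import json

def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

def colors_from_json(text, low=(255, 255, 204), high=(128, 0, 38)):
    """Map each ID in a {"id": value} JSON object to a hex color string."""
    values = json.loads(text)
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1  # avoid division by zero if all values are equal
    return {
        key: "#%02x%02x%02x" % lerp_color(low, high, (v - lo) / span)
        for key, v in values.items()
    }

# Hypothetical input: region IDs mapped to made-up values.
sample = '{"DE": 10, "FR": 20, "IT": 30}'
palette = colors_from_json(sample)
```

When the data changes, only the JSON file needs to be regenerated; the artwork and its alignment stay untouched.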


Analyzing bias in opinion polls with R

Never trust a statistic that you
haven’t visualized yourself.

It’s election time in Germany and, as usual, there are tons of opinion polls telling us who is going to win the election anyway. It is debatable whether or not pre-election polls are healthy for our democracy in general, but at least everybody agrees that the polls should be kind of neutral. And if they are not, the institutes publishing the polls should be blamed publicly.

But how do we know if an institute publishes ‘biased’ polls? You guessed it: with data. More precisely: with data and the unique power of data visualization.

Start using databases, today!

This post is written to welcome dataset, a new library to simplify working with databases in Python.

Let’s face it. Relational databases, such as MySQL, SQLite and PostgreSQL, are pretty cool – but nobody actually uses them. At least not in the day-to-day work with small to medium scale datasets. But why is that? Why do we see an awful lot of data stored in static files in CSV or JSON format, even though

  • they are hard to query (you need to write a custom script every time)
  • they are messy, as they cannot store metadata such as data types
  • it is a pain to update them incrementally, say if some record has changed
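
For contrast, the three pain points in this list are exactly what a relational database handles out of the box. The sketch below uses Python's built-in sqlite3 module rather than the dataset library this post introduces. The table name, columns and population figures are made up for illustration.

```python
import sqlite3

# An in-memory SQLite database; a file path would work the same way.
conn = sqlite3.connect(":memory:")

# Column types are stored with the data, unlike in a CSV file.
conn.execute(
    "CREATE TABLE cities (name TEXT PRIMARY KEY, country TEXT, population INTEGER)"
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [("Berlin", "DE", 3400000), ("Hamburg", "DE", 1800000), ("Paris", "FR", 2200000)],
)

# Incremental update: change one record without rewriting the whole file.
conn.execute("UPDATE cities SET population = ? WHERE name = ?", (3500000, "Berlin"))

# Querying: no custom script, just a SELECT.
german_total = conn.execute(
    "SELECT SUM(population) FROM cities WHERE country = ?", ("DE",)
).fetchone()[0]
stored_type = conn.execute("SELECT typeof(population) FROM cities").fetchone()[0]
```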

click to read the answer :)

Map Symbol Clustering —
k-Means vs. noverlap

While working on the soon-to-be-released map widget for Piwik (heck, it’s been over two years since the first sketches!) I implemented two map symbol clustering algorithms in Kartograph.js. Last year I wrote about why this is a good idea, and now I have turned that advice into re-usable code.

In this post I want to share my findings after experimenting with different clustering techniques.
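
For readers unfamiliar with k-means, here is a minimal Python sketch of the textbook algorithm applied to 2D symbol positions. It is not the Kartograph.js implementation. The function name, the naive seeding with the first k points and the sample coordinates are assumptions for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster 2D points with plain k-means (Lloyd's algorithm)."""
    centroids = list(points[:k])  # naive seeding: the first k points (an assumption)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]  # keep the old centroid for an empty cluster
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters

# Hypothetical symbol positions in screen coordinates.
symbols = [(0, 0), (10, 10), (1, 0), (0, 1), (11, 10), (10, 11)]
centroids, clusters = kmeans(symbols, k=2)

# One merged symbol per cluster: position plus number of symbols it replaces.
merged = [(cx, cy, len(c)) for (cx, cy), c in zip(centroids, clusters)]
```

Each merged symbol would then be drawn at its centroid, sized by the number (or summed value) of the symbols it replaces.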

Using Twitter as Social Bookmarking Service

A while ago I realized that I totally stopped using social bookmarking services since I started tweeting. Whenever I find an interesting link I share it on Twitter. If I find interesting links tweeted by the people I follow I’m most likely to favorite that tweet. I guess that’s the way most people use Twitter. How often did you check someone’s public links on delicious? I rarely did.


Why We Need Another Mapping Framework

Over the last two years, cartography has drawn my attention from time to time. In 2009 I started my work in the field by porting the PROJ.4 library to ActionScript. My first notable interactive map application was a world map widget for the Piwik Analytics project, which is still in use today. It was born from the need to have a simple world map that is lightweight, easy to use and completely independent from external map services like Google Maps.


How To Avoid Equidistant HSV Colors

As some of you pointed out in the comments of my last post, taking equidistant colors in the HSV color space is no solution for finding a set of colors that are perceived as equidistant. This post describes what’s wrong with HSV and what we can do about this. Note that since this post contains interactive elements built on the latest web technologies, you might need a modern browser to get the most out of it.
click here for ultimate color geekiness
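
A quick way to see the problem is to take six hues at equal steps around the HSV wheel and compare how bright they actually are. The Python sketch below uses the standard colorsys module and the usual sRGB relative luminance formula. Luminance is only a stand-in for perceived difference here, and the helper name is an assumption.

```python
import colorsys

def relative_luminance(rgb):
    """Relative luminance (Rec. 709 weights) of an sRGB color, channels in [0, 1]."""
    def linearize(c):
        # Undo the sRGB gamma curve before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# Six hues at equal steps, full saturation and value:
# red, yellow, green, cyan, blue, magenta.
hues = [i / 6 for i in range(6)]
luminances = [relative_luminance(colorsys.hsv_to_rgb(h, 1.0, 1.0)) for h in hues]
```

Although all six colors share the same saturation and value, yellow comes out with a luminance above 0.9 while blue stays below 0.1, so equal steps in hue are far from equal steps in perceived brightness.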

Matching Regions of GeoIP and Natural Earth

In this post I’ll describe the process of bringing together the region shapes from the Natural Earth dataset with the regions provided in the GeoCityLite database. In the GeoCityLite db, the regions are referenced by a two-letter ID (FIPS 10-4 for some countries, ISO 3166-2 for others). Initially I thought that those IDs would be the same as those used in the Geonames admin-level 1 region db, which brought me to the first idea of mapping the regions via name similarity.
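
As a rough illustration of what matching by name similarity can look like, here is a small Python sketch built on the standard difflib module. It is not the code used for the actual matching. The function name, the 0.8 threshold and the sample region names are assumptions.

```python
from difflib import SequenceMatcher

def best_match(name, candidates, threshold=0.8):
    """Return the candidate most similar to name, or None below the threshold."""
    def score(candidate):
        # Case-insensitive similarity ratio between 0 and 1.
        return SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
    best = max(candidates, key=score)
    return best if score(best) >= threshold else None

# Hypothetical region names as they might appear in the two sources.
geoip_regions = ["Bayern", "Nordrhein-Westfalen", "Sachsen"]
natural_earth_regions = ["Bayern", "Nordrhein Westfalen", "Sachsen", "Hessen"]

matches = {name: best_match(name, natural_earth_regions) for name in geoip_regions}
```

The threshold is the critical knob: set it too low and unrelated regions get paired, set it too high and spelling variants are missed.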
