Hi, I'm Gregor, welcome to my blog where I mostly write about data visualization, cartography, colors, data journalism and some of my open source software projects.

Why I Hate Coding With Processing – At Least Sometimes...


As all of you probably know, Processing is cool, easy to learn (especially for graphic designers and makers of all kinds) and, above all, fun to work with. Really? Actually, this is not quite the experience I have whenever I start sketching a visualization. Yesterday was such a day: I found myself running from one obstacle into the next. Here’s a short account of what went wrong and why.

Sketch names must be at least three characters long

Surprise, surprise, I’m a geek, and as such, I usually prefer shorter file names. So, yesterday I called the first of a planned series of sketches “p0”. I started coding a bit and then, after running my sketch, I got this handy little error message:

java.lang.IllegalArgumentException: Prefix string too short

After googling around (the first time), I found this thread in the Processing forum. Someone explained the behaviour as follows.

I noticed that if you use a short name for a script (i.e. “p3”), it won’t run.

Surprisingly, his example used almost the same sketch name, which clearly identifies him as a geek as well. This clearly looks like a bug, so I instantly filed a bug report.
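The exception looks like the one Java's own File.createTempFile throws when its prefix argument has fewer than three characters. Whether Processing really passes the sketch name to that method is an assumption, not something the forum thread confirms. The class name ShortPrefixDemo and the helper tryPrefix are made up for this sketch, which only reproduces the limit in plain Java:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ShortPrefixDemo {
    // Hypothetical helper: returns the exception message if the prefix is
    // rejected, or null if the temp file could be created.
    static String tryPrefix(String prefix) {
        try {
            File tmp = File.createTempFile(prefix, ".tmp");
            tmp.delete(); // clean up, we only care whether creation worked
            return null;
        } catch (IllegalArgumentException e) {
            // thrown by createTempFile for prefixes shorter than 3 characters
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("p0  -> " + tryPrefix("p0"));
        System.out.println("p00 -> " + tryPrefix("p00"));
    }
}
```

If that assumption holds, the obvious workaround is to give the sketch a name with at least three characters.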

No built in function for reading CSV files

The next thing I wanted to do was read in a CSV file, which is almost always the first step in visualizing non-random data. Unfortunately, I remembered that Processing has no built-in function for this. In R, for instance, all you need to type is:

data = read.csv('data.csv')

In Python, opening a CSV file is a two-liner:

import csv
data = csv.reader(open('data.csv', 'r'), dialect='excel-tab')

But in Processing you have to do the parsing yourself:

String[] lines = loadStrings("data.csv");
int i = 0;
for (String l : lines) {
    if (i++ == 0) {
        // special treatment for headers
    }
    // the file is tab-separated, so split each line on tabs
    String[] cols = split(l, '\t');
    // do something with the data
}

This makes it hard to keep the data parsing separate from the data processing. I know that Ben Fry wrote a Table class that handles CSV import, but what I don’t understand is why this is not core functionality of Processing. I mean, Ben Fry co-created Processing; he could simply have included it. The same is true for XML, JSON and many other data formats.
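One way to pull the two apart is to do the parsing in a small helper of its own. The following is only a sketch in plain Java, not Ben Fry's Table class: the name TsvParser and its parse method are invented here, it assumes the first line holds the column names, and it does not handle quoted fields. I have not tried it inside the Processing IDE.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TsvParser {
    // Turns raw lines into one map per data row, keyed by column name.
    // Assumes tab-separated values with a header in the first line.
    static List<Map<String, String>> parse(String[] lines) {
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        if (lines == null || lines.length == 0) {
            return rows;
        }
        // limit -1 keeps trailing empty columns
        String[] header = lines[0].split("\t", -1);
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = lines[i].split("\t", -1);
            Map<String, String> row = new LinkedHashMap<String, String>();
            for (int c = 0; c < header.length; c++) {
                // rows shorter than the header get empty strings
                row.put(header[c], c < cols.length ? cols[c] : "");
            }
            rows.add(row);
        }
        return rows;
    }
}
```

In a sketch, the result of loadStrings("data.csv") could be handed to TsvParser.parse, and the drawing code would then only ever see rows and column names.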

Charset handling while opening files

Now things get even worse. Actually, I didn’t even get to parse the CSV file, because loadStrings didn’t manage to read it. Instead I got this neat error message (at least without the useless Java stack trace):

The file “data.csv” is missing or inaccessible, make sure the URL is valid or that the file has been added to your sketch and is readable.

I checked that the file was located in the data folder inside my sketch root, and I even dragged it into the sketch window to make sure the IDE didn’t miss anything. I reduced the code to the loadStrings() call, but nothing worked. Googling this error (again) led me to this thread in the Processing forum. Somebody mentioned that this could have something to do with charset issues and suggested the following:

String[] lines = loadStrings("data.txt", "ISO-8859-1");

As I’m used to dealing with different charsets in Python (2.7), I was happy to discover the charset parameter. Unfortunately, this doesn’t work (anymore) in Processing 1.5:

The method loadStrings(String) in the type PApplet is not applicable for the arguments (String, String)

Ok, looks like somebody removed the second parameter. Later in the same thread, Ben Fry explained the current behaviour.

No, please read revisions.txt for changes. All files are now treated as UTF-8 by default, to deal with this issue.

I admit that I haven’t read revisions.txt for a while, but I did search the current version for the term “loadStrings” and found no match. I guess I’ll have to read it in more detail some time later. For now, I just wanted to read in the file. I looked up the encoding and found out that my text editor had stored the file as Western (Mac OS Roman), which, for some strange reason, seems to be the editor’s default behaviour. Converting the file to Unicode (UTF-8) finally fixed the bug. One could argue that treating all files as UTF-8 by default is not as smart as allowing users to specify the encoding of their files. But the crucial point about this error is the error message. A friendly message like this would have saved me lots of time and nerves:

There was a charset error while reading “data.csv”. Please check that the file is encoded in UTF-8 – and please read the revisions.txt!
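To make the wish a bit more concrete, here is a rough sketch of how such a message could be produced in plain Java. It is not how Processing's loadStrings works internally, and the names StrictLines, decode and load are invented for illustration. The idea is to decode the bytes strictly, so that input that is not valid in the chosen charset raises an error instead of being silently replaced, and to let the caller pick the charset:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Path;

final class StrictLines {
    // Decodes the bytes with the given charset and splits them into lines.
    // Malformed input is reported as an IOException with a readable message.
    static String[] decode(byte[] bytes, Charset charset, String name) throws IOException {
        try {
            String text = charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
            // accept Unix, Windows and old Mac line endings
            return text.split("\\r?\\n|\\r");
        } catch (CharacterCodingException e) {
            throw new IOException("There was a charset error while reading \""
                    + name + "\". Please check that the file is encoded in "
                    + charset.name() + ".", e);
        }
    }

    // Convenience wrapper that reads the bytes from disk first.
    static String[] load(Path file, Charset charset) throws IOException {
        return decode(Files.readAllBytes(file), charset, file.getFileName().toString());
    }
}
```

Note that single-byte charsets such as ISO-8859-1 accept any byte sequence, so this check mainly helps when the expected encoding is UTF-8.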

I hope you don’t read this post as a pure rant. I just wanted to document a user experience that many users will probably run into. I still love Processing and all the things it makes possible, and I will continue using it, although I really hate Java. This is maybe the worst design decision of all: basing an amazing project like Processing on a language like Java. Update: I just read that Processing 2.0 will include built-in support for JSON and Tables, which is pretty cool.