Archive for the ‘Data Visualization’ Category

Global Agenda Survey 2011

Friday, November 11th, 2011

I’ve had the honor to collaborate with the fabulous Moritz Stefaner on a project for the World Economic Forum. We have created several interactive visualizations that allow users to explore the Global Agenda Survey 2011, a survey held each year by the World Economic Forum to discover the most important trends in the world according to its council members.

Except for the network graph, the visualization is compatible down to IE6. We achieved this by using raphael.js, underscore.js. The network has been created in d3.js.

Interview on Fell In Love With Data

Friday, October 14th, 2011

Enrico Bertini from Fell In Love With Data has been so kind to post an interview with me about Protovis and D3.

You can read the full interview here.

Thanks Enrico!

Nerds Unite recap

Monday, September 26th, 2011

Wednesday I had the honor to give a small presentation about data visualization at Nerds Unite in Utrecht (NL). It was a small group of people who are interested in data, open data, government and of course visualization. After a short introduction I showed some of my work, which was well received. After that Eugene Tjoa gave a talk about creating visualizations for the Central Bureau of Statistics (CBS) in The Netherlands.

It was a great evening with many interesting people. Thanks!

Photo by Sebastiaan Terburg

Guest posts about Eyeo

Wednesday, June 29th, 2011

I am writing 3 daily guest posts about Eyeo Festival on Infosthetics.com and 1 summary guest post on Visualizing.org. I’ll update this post with links to each of the individual posts.

Visualizing Europe

Wednesday, June 15th, 2011

Today was one of those days I won’t forget very soon: today was Visualizing Europe day, and many enthusiasts, practitioners, researchers and users of data visualization gathered in Brussels for an inspiring day of talks and meeting interesting and kind people from the data visualization community. The day was divided into 3 sessions:

  1. the power and potential of data visualization
  2. a vision for Europe
  3. where do we go from here?

The first sessions showed some of the best works currently created in data visualization: Santiago Ortiz from Bestiaro showed the power of the visual programming paradigm of Impure can be used to create sophisticated data visualizations in minutes (did I say minutes? seconds!)

Next, Moritz Stefaner showed two of his recent and impressive projects: the Better Life Index project that was recently launched by the OECD. And his previous famous project: Notabilia, which shows deletions on Wikipedia.

 

Enrico Bertini gave a fantastic talk from a research perspective and explained different approaches of making a data visualization for the public and for the tiny group of people who are actually solving real world problems with data visualization. A quote that was tweeted numerous times immediately: “data visualization is useless, it is indispensable”. He also highly recommends the book: “How Maps Work” (which of course is on my wishlist now!).

Last but not least of this first sessions was Dave McCandless from Information is Beautiful. Dave showed some of his work, and a remarkable quote was: “I disagree with Moritz, I’m not looking for 1000 stories, I’m looking for 1 story that’s interesting”.

After a short coffee break, session 2 started with Gregor Aisch showing how he creates data visualizations as the Open Spending project for the Open Knowledge Foundation. He proposed a new approach for data visualization, namely ‘open data visualization’, which is open source + open data + open to community. An fascinating idea I’d like to learn more about.

Assaf Biderman from the MIT Senseable City Lab impressed us with some of their cutting edge projects they do together with governments and cities, like the Trash Tag project which tracks and visualizes where trash is being transported all over the USA after people have emptied their trash bin. Another project that keeps impressing me is the Copenhagen Wheel, an augmentation to bicycles that allows bikers to track their own performance, and at the same time measures various air conditions of the city. This data is collected and visualized to understand more about the city’s air pollution.

Salvatore Iaconesi from Art is Open Source elaborated on how the artistic world uses data and visualization to change paradigms, for example in supermarkets: while in the supermarket data is visualized on your iPhone and shows the geographic origins of the chemical compounds of your products.

Last but not least, Peter Miller from ITO World had to rush through his slides where he showed some very compelling and sometimes fine-grained user contributions to the Open Streetmap project. It’s impressive to see how user contributions can lead to sometimes more correct maps than non-crowd-sourced maps.

The final session was a discussion between Franco Accordino and Jean-Claude Burgelman from the Europen Commission and Toby Green from the OECD. The main subject was: what did they take from today’s sessions, and what will they do with it. It was very good to see that the value of data visualization was recognized, and that the EU sees data visualization as one possible and valuable way to create new knowledge, which is very important.

Finally, the day was finished by meeting so many people from the data visualization community. It was amazing to meet so many people whom I’ve been in contact with for quite some time now. Thanks Visualizing.org for organizing this wonderful day, and everybody who has contributed. It was a memorable experience!

Gost Counties visualization wins DataViz challenge

Monday, June 6th, 2011

Just a short note that I am honored, thrilled and excited to announce that my Ghost Counties data visualization has won the Visualizing.org ‘Visualizing the 2010 US Census data visualization challenge‘. A great thank you to the jury from Visualizing.org and Eyeo Festival and all the kind messages I received!

Eyeo Data Visualization Challenge: Ghost Counties

Monday, May 16th, 2011

About a day before the deadline I have submitted my entry for the Eyeo Data Visualization Challenge by Visualizing.org where the grand prize is a ticket to the brilliant Eyeo Festival.

You can see the final result here: http://www.janwillemtulp.com/eyeo.

The visualization depicts the number of homes and vacant homes for all the counties for each state. The size of the outer bubble represents the total number of homes, the size of the inner bubble represents the number of vacant homes. The y-axis shows the population size (on a logarithmic scale) and the x-axis of the bubbels shows the number of vacant homes per population. Each bubble is also connected with a line to another axis: the population / home ratio. On the top right you can see some exact numbers for this data.

This time I built the visualization in Processing, mainly because I expected to work with large datasets from the US Census Bureau and I might had to use some OpenGL for better performance. Eventually I didn’t use OpenGL. Building the visualization in Processing was lots of fun. To get sense of the data I tried as many as 5 completely different approaches. Here are some of the sketches that eventually led to this visualization (view this selection on my Flickr stream).

The data itself was not very complex, but rather big, and the biggest challenge was to find a creative approach to visualize this data, but without using a map (which would be rather obvious since it’s about locations).

SEE#6 conference

Tuesday, April 12th, 2011

This weekend I have visited SEE#6 in Wiesbaden, Germany. Although this was the sixth SEE conference, it was my first visit. And I must say, it was overwhelming! Here’s a brief overview of my experience.

Day 1 – conference

The conference was situated in the beautiful Lutherkirche, which was one of the most beautiful and atmostpheric conference locations I’ve ever been to. After a friendly welcome from Micheal Volkmer from the hosting organization Scholz & Volkmer, prof. dr. Harald Welzer. He gave an inspirational talk about sustainability, the main subject of the conference, and he gave us his view on how to fight climate change and change human behavior to improve a sustainable society. 

After the keynote Carlo Ratti of the MIT Senseable City Lab in Boston showed us some of his recent projects on how ubiquitous computing is entering our society more and more, and how sensors can help cities and citizens to be more aware of the environment, and improve sustainability. On this image you see a visualization by the Lab that shows real-time whether and rain situation in combination with taxi locations: are taxi’s on places where they are needed the most?

After a break, the young and talented Alexander Lehman took over. He showed us some of his animated infographics, and how he uses satire to tell a story. He elaborated on his most successful project: “du bist terrorist” where he uses satire in an attempt to make people more aware about the increasing danger of the government collecting all kinds of data about its citizens.

Brendan Dawes had a very humorous and inspirational talk about how he is using his creativity to do very inspiring projects, both for customers as personal research projects. He had many great examples of his projects and one of them was a homebuilt digital/wooden weather indicator: not because it really solves a problem, but just because it’s fun and cool:

Wesley Grubbs from Pitch Interactive showed us some of his great projects. Projects where he has visualized many rows of data that resulted in high resolution (and long rendering time) images. One of his most compelling examples was a visualization where he shows some insight in how the US Defense is spending its money.

After the beautiful images from Wesley, Joshua-Prince Rasmus from REX architects gave his presentation. Before the conference I didn’t really know how architecture would fit in a conference about information visualization, but it turned out that this was one of the talks that impressed me most. At REX they’ve defined a very strong process of doing work for clients. Joshua talked us through it. He showed the beautiful images he uses in presentations to show his ideas, and at REX they’re brilliant in working within constraints, creating flexible buildings and, perhaps most importantly, really understand their clients, so that they eventually build something clients really understand and agree with. Great talk!

Final talk was by the talented Justin Manor from Sosolimited. Justin inspired us by showing a few of his great installations. One that he gave special attention was real-time analysis of political debates, where a very large number of different approaches were shown how to interpret, categorize and analyze the words and sentences politicians are saying. His visualization is built in Processing. This concluded the first day.

Day 2 – discussion and workshop

The second day was an extra day for data visualization die hards, to have some good discussions about data visualization. The discussion was led by 3 prominent people from the data visualization community: Moritz Stefaner, Andrew vandeMoere and Benjamin Wiederkehr. The day was basically split into 2 halves where the first part was a discussion, and the second part some of the presenters of the conference gave some insight in how they do their work.

The discussion took of immediately after the main topics had been presented:

  • how to engage people through visualization?
  • how to use visualization to change people?
  • “facts are useless, stories are everything!” (quote by prof. dr. Harald Welzer made during the keynote on the first day)
  • what is the impact of data.gov shutting down?
  • how can design critique push the community forward?

Although all of these topics were a used as a starting point for the discussion, the main topic of the discussion evolved around questions like:

  • as a (visualization) designer, should you help the user make his conclusion?
  • is linear storytelling, like the satire movies from Alexander Lehman, the right way of passing information to the user?
  • can you really be objective if you visualize data?
  • is showing war casualties, for example like the one from Stamen design, a good way to really engage people?

The discussion was really interesting, and people had some very good points. But at the same time I found it somewhat hard to really take a stand in this, because I really think it depends. Anyway, after a short break, some of the presenters from day 1 gave some insight into their work process, things they run into, etc.

  • Wesley Grubbs showed us some more details on how he approaches visualization work for both clients and as personal research projects, like handling large amounts of data.
  • Alexander Lehman showed us how he uses 3DS Max to create his infographic movies, and how the overkill on functionality of 3DS Max actually helps him to be very productive.
  • Moritz Stefaner gave a very short but great introduction to Protovis, and how you could use Google Docs cleverly to use it as a real-time data source for your visualization
  • Justin Manor showed some more of his great installations with water, pneumatics, LED, Processing, Arduino, etc., and various ways of how they were made.

That wraps it up.

The conference was a blast! It was so inspiring, a great location, fantastic speakers, and very good talks! I am looking forward to SEE#7 conference.

If you want to see the full talks of SEE#6, go to http://www.see-conference.org/video-stream/. On the conference website you can also see video registrations of previous talks, which are highly recommended to watch!

Tutorial: Line chart in D3

Friday, April 1st, 2011

One of the most common visualizations is a line chart. D3 is not a charting framework, but instead allows you to manipulate the document based on data. That’s what you’re actually doing with D3: adding elements to a document, removing them, updating them, etc. The advantage is that you are much more flexible in creating the visualization that you want. Some may consider it a slight disadvantage that you may do have to do some extra work to get things done.

For this tutorial we’re going to create a basic line chart with an x-axis and y-axis, tickmarks and labels.

First, we defined some variables:

var data = [3, 6, 2, 7, 5, 2, 1, 3, 8, 9, 2, 5, 7],
w = 400,
h = 200,
margin = 20,
y = d3.scale.linear().domain([0, d3.max(data)]).range([0 + margin, h - margin]),
x = d3.scale.linear().domain([0, data.length]).range([0 + margin, w - margin])

The data variable contains our dataset we want to display as a line chart. The w and h variables are used for the width and height of our chart and the margin will be used to create some room to display our labels. The y and x variables are our linear scale functions, and remember that you can use x and y as functions. The scales are used to convert our data values (which are defined in the domain of the scale) to x- and y-positions on the screen (which are defined in the range of the scale). Note the use of margin in the range.

Next we append an svg element to our document with the proper width and height, and then we append a g element to this svg element so that all the elements that will be appended to this g element will be grouped together. The transformation that we apply on the g element makes sure that our coordinate grid moves down 200 pixels.

var vis = d3.select("body")
    .append("svg:svg")
    .attr("width", w)
    .attr("height", h)

var g = vis.append("svg:g")
    .attr("transform", "translate(0, 200)");

In order to create a line chart, we’re going to add an path element to our visualization. This path element needs a d attribute that contains the data for the path. Now, you could write that yourself, but that may be a little hard to do, so we use the helper function d3.svg.line to do that for us (see Tutorial: Line interpolations in D3 for more details on this helper function). Now we add the line variable which is, just like the scales earlier, a function that you can call. So when we bind our data to the path, this d3.svg.line function will be called so that the value of the d attribute will be created:

var line = d3.svg.line()
    .x(function(d,i) { return x(i); })
    .y(function(d) { return -1 * y(d); })

The i parameter is the index of the current item, and the d is the current item itself. Note the usage of the scale functions (x() and y() to convert from i and d to x- and y-coordinates (note the -1. This is needed because right now the coordinate have a positive y-axis down, and since we have translated our g down 200 pixels, we need to use the negative values of the y-axis now). To use our line function we add a path element to our g element like this:

g.append("svg:path").attr("d", line(data));

Next up is adding the x-axis and y-axis. The following snippet adds them by appending a line to our g element.

g.append("svg:line")
    .attr("x1", x(0))
    .attr("y1", -1 * y(0))
    .attr("x2", x(w))
    .attr("y2", -1 * y(0))

g.append("svg:line")
    .attr("x1", x(0))
    .attr("y1", -1 * y(0))
    .attr("x2", x(0))
    .attr("y2", -1 * y(d3.max(data)))

Next are the labels. We can use a convenient function of the scales here: x.ticks() and y.ticks(). These functions will return the proper tickmarks, and you can provide how many you want. D3 does return nicely rounded numbers though. Also note the .text(String) usage for obtaining the string value of the current data item. The labels are aligned in the center by using .attr("text-anchor", "middle"). And for the y-labels I’ve used dy in order to vertically position the labels correctly.

g.selectAll(".xLabel")
	.data(x.ticks(5))
	.enter().append("svg:text")
	.attr("class", "xLabel")
	.text(String)
	.attr("x", function(d) { return x(d) })
	.attr("y", 0)
	.attr("text-anchor", "middle")

g.selectAll(".yLabel")
	.data(y.ticks(4))
	.enter().append("svg:text")
	.attr("class", "yLabel")
	.text(String)
	.attr("x", 0)
	.attr("y", function(d) { return -1 * y(d) })
	.attr("text-anchor", "right")
	.attr("dy", 4)

The last thing to do is to add the ticks themselves. These are actually just very short lines, so all you have to do is to provide the start x- and y position (x1 and y1) and the end x- and y-position (x2 and y2).

g.selectAll(".xTicks")
	.data(x.ticks(5))
	.enter().append("svg:line")
	.attr("class", "xTicks")
	.attr("x1", function(d) { return x(d); })
	.attr("y1", -1 * y(0))
	.attr("x2", function(d) { return x(d); })
	.attr("y2", -1 * y(-0.3))

g.selectAll(".yTicks")
	.data(y.ticks(4))
	.enter().append("svg:line")
	.attr("class", "yTicks")
	.attr("y1", function(d) { return -1 * y(d); })
	.attr("x1", x(-0.3))
	.attr("y2", function(d) { return -1 * y(d); })
	.attr("x2", x(0))

And that’s it. Finally, some of the styling of the objects was moved to a style section at the top of the page in order to get the final result:

path {
    stroke: steelblue;
    stroke-width: 2;
    fill: none;
}

line {
    stroke: black;
}

text {
    font-family: Arial;
    font-size: 9pt;
}

View a live version here.

Tutorial: Line interpolations in D3

Wednesday, March 23rd, 2011

In this tutorial we’re going to explore line interpolations in D3.

First we start with 2 scales that we will use to convert values to x- and y-coordinates on the screen:

var x = d3.scale.linear().domain([0,10]).range([0,400]),
y = d3.scale.linear().domain([0,1]).range([0,50]),
groupHeight = 60,
topMargin = 100

Next we generate some random data:

var data = []
d3.range(10).forEach(function(d) { data.push(Math.random()) })

We’re also creating an array which contains all the possible interpolations D3 supports. We’ll see the effects of every interpolation in a moment:

var interpolations = [
	"linear",
	"step-before",
	"step-after",
	"basis",
	"basis-closed",
	"cardinal",
	"cardinal-closed"]

In SVG there is a difference between a line and a path. A line is a straight line where you define the start and end position of the line: , whereas with a path you draw the outline of any arbitrary shape by specifying a series of connected lines, arcs, and curves. You do this by specifying the d attribute of the path. Every path must begin with a moveto command. The command letter is a capital M followed by an x- and y-coordinate, separated by commas or whitespace. This command sets the current location of the “pen” that’s drawing the outline. This is followed by one or more lineto commands, denoted by a capital L, also followed by x- and y-coordinates, and separated by commas or whitespace. You can see more of this specification here.

In our example we’re not actually creating an SVG line, but an SVG path, so we need to set the d attribute of the paht. Luckily D3 has a helper function to ease the burden to create this data: d3.svg.line. For this helper function you can set:

  • an accessor function for obtaining x values
  • an accessor function for obtaining y values
  • an interpolation type, which defaults to linear
  • a tension value which affects the cardinal interpolations only

In our example, we want to show the different kinds of interpolations for the same data, so we create a function that takes the name of an interpolation as an argument, and then returns the d3.svg.line function as a result. This is the code that does that (you can play with the out-commented tension property to see the effect):

function getLine(interpolation) {
    return d3.svg.line().x(function(d,i) {
        return x(i)
    }).y(function(d) {
        return y(d)
    }).interpolate(interpolation)
//.tension(0)
}

Note the following: the function for x has 2 arguments: d and i. The d is the current item in the dataset (which we will provide later), and i is the index of the current item in the dataset. Also note that we’re using the x-scale to convert i to an x-coordinate, and the y-scale to convert the data value to a y-coordinate.

Now we initialize the visualization:

var vis = d3.select("body")
	.append("svg:svg")
	.attr("class", "vis")
	.attr("width", window.width)
	.attr("height", window.height)

Next up is greating a group for each of the lines we want to show:

var lg = vis.selectAll(".lineGroup")
	.data(interpolations)
	.enter().append("svg:g")
	.attr("class", "lineGroup")
	.attr("transform", function(d,i) {
	return "translate(100," + (topMargin + i * groupHeight) + ")"
}).each(drawLine)

We set the interpolations array as data for this group, so that svg:g elements for each of the interpolations will be added to the visualization. The svg:g element can be used to group other elements together, so that if you apply a transformation to the group for instance, it will be applied to all of its members. Note that we add the class lineGroup in our selection to select all these elements. Next we set the transform attribute, and we use the index to position the groups based on their position in the interpolation array. For each of the group, we want to draw a line. We do that by calling the drawLine function in the .each(drawLine) statement. The drawLine function itself looks like this:

function drawLine(p,j) {
	d3.select(this)
		.selectAll(".lineGroup")
		.data(data)
		.enter().append("svg:path")
		.attr("d", getLine(p)(data))
		.attr("fill", "none")
		.attr("stroke", "steelblue")
		.attr("stroke-width", 3)
		//.attr("stroke-dasharray", "15 5")
}

The drawLine function itself has to parameters: p and j where p is the parent data item (the current interpolation name), and j is the parent index. First we select the current element with d3.select(this), and next we select all the .lineGroup elements. We assign the line data to the data property, and append a path to each lineGroup element. The d attribute calls the getLine function and provides the current interpolation name as an argument. The result of that is the d3.svg.line function with the that uses the interpolation we just provided. Next we assign the line data to this function so that D3 will calculate the data string that will be used by the d attribute of the svg:path element. Finally we set some basic properties. The final out-commented is one of the stroke properties you can set, where 15 is the dash length and 5 is the gap length. Just play around with those properties to see what else is possible.

This concludes this tutorial. All the lines you see are using the same data, but they use different interpolations.