- Day 1 at Eyeo: Converge to Inspire on Infosthetics.com
- Day 2 at Eyeo: Understanding Complexity on Infosthetics.com
- Day 3 at Eyeo: the Practice of Digital Information & Art on Infosthetics.com
- Jan Willem Tulp on Eyeo in Visualizing.org
Today was one of those days I won’t forget very soon: today was Visualizing Europe day, and many enthusiasts, practitioners, researchers and users of data visualization gathered in Brussels for an inspiring day of talks and meeting interesting and kind people from the data visualization community. The day was divided into 3 sessions:
The first session showed some of the best work currently being created in data visualization. Santiago Ortiz from Bestiario showed how the visual programming paradigm of Impure can be used to create sophisticated data visualizations in minutes (did I say minutes? seconds!).
Next, Moritz Stefaner showed two of his recent and impressive projects: the Better Life Index, which was recently launched by the OECD, and his earlier, well-known project Notabilia, which shows deletions on Wikipedia.
Enrico Bertini gave a fantastic talk from a research perspective and explained the different approaches to making a data visualization for the public and for the tiny group of people who are actually solving real-world problems with data visualization. A quote that was immediately tweeted numerous times: “data visualization is useless, it is indispensable”. He also highly recommends the book “How Maps Work” (which of course is on my wishlist now!).
Last but not least in this first session was Dave McCandless from Information is Beautiful. Dave showed some of his work, and a remarkable quote was: “I disagree with Moritz, I’m not looking for 1000 stories, I’m looking for 1 story that’s interesting”.
After a short coffee break, session 2 started with Gregor Aisch showing how he creates data visualizations for the Open Spending project of the Open Knowledge Foundation. He proposed a new approach to data visualization, namely ‘open data visualization’, which is open source + open data + open to community. A fascinating idea I’d like to learn more about.
Assaf Biderman from the MIT Senseable City Lab impressed us with some of the cutting-edge projects the lab does together with governments and cities, like the Trash Track project, which tracks and visualizes where trash is transported all over the USA after people have emptied their trash bins. Another project that keeps impressing me is the Copenhagen Wheel, an augmentation to bicycles that allows bikers to track their own performance and at the same time measures various air conditions in the city. This data is collected and visualized to understand more about the city’s air pollution.
Salvatore Iaconesi from Art is Open Source elaborated on how the artistic world uses data and visualization to change paradigms, for example in supermarkets: while you are shopping, data is visualized on your iPhone, showing the geographic origins of the chemical compounds in your products.
Last but not least, Peter Miller from ITO World had to rush through his slides, in which he showed some very compelling and sometimes fine-grained user contributions to the OpenStreetMap project. It’s impressive to see how user contributions can sometimes lead to more accurate maps than non-crowd-sourced maps.
The final session was a discussion between Franco Accordino and Jean-Claude Burgelman from the European Commission and Toby Green from the OECD. The main subject was: what did they take from today’s sessions, and what will they do with it? It was very good to see that the value of data visualization was recognized, and that the EU sees data visualization as one possible and valuable way to create new knowledge, which is very important.
Finally, the day was finished by meeting so many people from the data visualization community. It was amazing to meet so many people whom I’ve been in contact with for quite some time now. Thanks Visualizing.org for organizing this wonderful day, and everybody who has contributed. It was a memorable experience!
Just a short note that I am honored, thrilled and excited to announce that my Ghost Counties data visualization has won the Visualizing.org ‘Visualizing the 2010 US Census data visualization challenge’. A big thank you to the jury from Visualizing.org and Eyeo Festival, and for all the kind messages I received!
This weekend I visited SEE#6 in Wiesbaden, Germany. Although this was the sixth SEE conference, it was my first visit. And I must say, it was overwhelming! Here’s a brief overview of my experience.
The conference was held in the beautiful Lutherkirche, one of the most beautiful and atmospheric conference locations I’ve ever been to. After a friendly welcome from Michael Volkmer of the hosting organization Scholz & Volkmer, prof. dr. Harald Welzer took the stage for the keynote. He gave an inspirational talk about sustainability, the main subject of the conference, and gave us his view on how to fight climate change and change human behavior to move toward a more sustainable society.
After the keynote, Carlo Ratti of the MIT Senseable City Lab in Boston showed us some of his recent projects on how ubiquitous computing is entering our society more and more, and how sensors can help cities and citizens to be more aware of the environment and improve sustainability. One visualization by the Lab shows the real-time weather and rain situation in combination with taxi locations: are taxis in the places where they are needed the most?
After a break, the young and talented Alexander Lehmann took over. He showed us some of his animated infographics, and how he uses satire to tell a story. He elaborated on his most successful project, “du bist terrorist”, in which he uses satire in an attempt to make people more aware of the increasing danger of the government collecting all kinds of data about its citizens.
Brendan Dawes gave a very humorous and inspirational talk about how he uses his creativity to do very inspiring projects, both for customers and as personal research projects. He had many great examples, and one of them was a homebuilt digital/wooden weather indicator, made not because it really solves a problem, but just because it’s fun and cool.
Wesley Grubbs from Pitch Interactive showed us some of his great projects, in which he has visualized many rows of data, resulting in high-resolution images (with long rendering times). One of his most compelling examples was a visualization that gives some insight into how the US Defense is spending its money.
After the beautiful images from Wesley, Joshua Prince-Ramus from REX architects gave his presentation. Before the conference I didn’t really know how architecture would fit in a conference about information visualization, but it turned out that this was one of the talks that impressed me most. At REX they’ve defined a very strong process for doing work for clients, and Joshua talked us through it. He showed the beautiful images he uses in presentations to convey his ideas. At REX they’re brilliant at working within constraints, creating flexible buildings and, perhaps most importantly, really understanding their clients, so that they eventually build something clients really understand and agree with. Great talk!
The final talk was by the talented Justin Manor from Sosolimited. Justin inspired us by showing a few of his great installations. One that he gave special attention to was a real-time analysis of political debates, which showed a very large number of different approaches to interpreting, categorizing and analyzing the words and sentences politicians are saying. His visualization is built in Processing. This concluded the first day.
The second day was an extra day for data visualization die-hards, to have some good discussions about data visualization. The discussion was led by 3 prominent people from the data visualization community: Moritz Stefaner, Andrew Vande Moere and Benjamin Wiederkehr. The day was basically split into 2 halves: the first part was a discussion, and in the second part some of the presenters of the conference gave some insight into how they do their work.
The discussion took off immediately after the main topics had been presented:
Although all of these topics were used as a starting point for the discussion, the main topic of the discussion revolved around questions like:
The discussion was really interesting, and people had some very good points. But at the same time I found it somewhat hard to really take a stand in this, because I really think it depends. Anyway, after a short break, some of the presenters from day 1 gave some insight into their work process, things they run into, etc.
That wraps it up.
The conference was a blast! It was so inspiring, a great location, fantastic speakers, and very good talks! I am looking forward to SEE#7 conference.
If you want to see the full talks of SEE#6, go to http://www.see-conference.org/video-stream/. On the conference website you can also see video recordings of previous talks, which I highly recommend watching!
Yesterday I visited the Dutch InfoGraphics Conference in Zeist. It was a great day: I met some nice people and saw some good talks.
The keynote was presented by Gert K. Nielsen, the founder of VisualJournalism.com. Though he is a great speaker and had some valid points, I also disagreed with many of his views. His opinion was that most infographics and data visualizations actually tell you nothing, give no information and remove the truth and emotion from reality. (He illustrated this with ‘bad’ examples on visualcomplexity.com, including my World Economic Forum visualization.) He wanted to make his point by showing movies of people committing suicide, people being blown up by bombs etc., and then showing an infographic that only shows an explosion symbol on a map as a representation of these events. I agree that oversimplification may not be right, but it really depends on what you are trying to communicate, to what audience, and with what purpose. So, although interesting, I also thought his views were actually too simplistic.
A very animated talk was given by John Grimwade. He showed us how the tablet will be the platform infographic designers will eventually be dealing with. His talk was fun with lots of jokes and very animated, but he didn’t have a very strong message in my opinion.
Bas Broekhuizen gave an update on his scientific research on interactive infographics. Too bad the talk was rather short, because this may be quite interesting, especially since the definition of an interactive infographic may be somewhat fuzzy. Wouter Kroese showed some very nice animations he does for the NCRV. The animations are very clean, and Wouter Kroese won one of the Infographics Awards for 2010. The last talk, by Gert Kuiper of the VPRO, was very interesting: a preview of the upcoming Dutch documentary ‘Holland from above’ was shown. Quite impressive, and similar to the UK version: Britain from above.
There were more speakers, but the ones mentioned above moved me most in some way. I’m looking forward to next year’s conference!
The third day of the Strata Conference was again packed with great sessions. The day started off with numerous keynotes. The first one was by Simon Rogers of The Guardian. Simon is not just a fabulous presenter; the examples of his work at The Guardian also showed how to tell stories with data, and how The Guardian actually enhanced its news stories by sharing data with the public. Next up was an interesting panel discussion with Toby Segaran (Google), Amber Case (Geoloqi) and Bradford Cross (Flightcaster), moderated by Alistair Croll (Bitcurrent). The topic of discussion was Posthumans, Big Data and New Interfaces. After the discussion we had some good presentations by Ed Boyajian (EnterpriseDB) and after that Barry Devlin (9sight consulting). Next was a very lively talk by DJ Patil (LinkedIn), who showed very convincingly that success in working with big data at LinkedIn is only possible with a good team of talented people. Scott Yara (EMC) came next, and also had a lively talk full of humor on how Your Data Rules The World. The closing keynote was from Carol McCall (Tenzing Health), who addressed a serious problem with humor: how big data analytics can be used to improve US healthcare, and turn it ‘from sickcare into healthcare’.
As my first session I chose a talk on Data Journalism, Applied Interfaces. Marshall Kirkpatrick (ReadWriteWeb) showed some really useful tools, like NeedleBase, that he uses for discovering stories on the Internet. He was followed by Simon Rogers of The Guardian again, who more or less continued his keynote, showing very compelling examples of how The Guardian uses data to tell stories, and how they use for instance Google Fusion Tables to publish much of their data. The last speaker of this session was Jer Thorp, and he absolutely blew me away with a beautiful interface he has created in Processing as an R&D project together with the New York Times. It’s called Cascade, and shows a visual representation of how Twitter messages cascade across various followers and links.
My next session was on ‘RealTime Analytics’ at Twitter, where Kevin Weil mainly explained RainBird, a project they use for various counting applications so that realtime analytics can easily be applied. The project will be open-sourced in the near future.
After the break I saw a session on AnySurface: Bringing Agent-based Simulation and Data Visualization to All Surfaces by Stephen Guerin (Santa Fe Complex). He showed how a projector and a table of sand can be used to enhance a data visualization for simulation purposes. As an example he showed us how projecting agent-based models and emergent phenomena in complex system dynamics can help firefighters simulate bottlenecks in escape routes. It was also very cool to see that many of his simulations are built in Processing. Next up was a session by Creve Maples (Event Horizon), and I really liked the first part of his talk, because he had a very good story on how we should keep the capacity of the human brain for processing information in mind when designing products and tools. It was really good to hear such a strong emphasis on this. The last part of his talk was mainly about some of the 3D visualizations he has done in the past that were very successful for his company, but it didn’t strike me as much as the first half of his talk.
The session on Data as Art by J.J. Toothman (NASA Ames Research Center) was a good and fun talk with many examples of infographics and visualizations. I had already seen most of them myself, but some were new. It was a great talk with lots of eye-candy. The final talk of the conference I saw was about Predicting the Future: Anticipating the World with Analytics. Three speakers gave their vision on how they do that. Christopher Ahlberg (Recorded Future) showed how his company uses time-related hints (like the mention of the word ‘tomorrow’) in existing content on the Internet to more or less predict the future. Robert McGrew (Palantir Technologies) showed how analyzing many large datasets in combination with human analysis can be used to perform effective fraud and crime prediction. Finally, Rion Snow (Twitter) showed research indicating that analyzing tweets can be used effectively for stock market prediction (3 days ahead!), flu and virus spread prediction, and UK election result prediction (more accurate than exit polls). The predictive power of analyzing the Twitter crowd was really stunning.
This concluded the O’Reilly Strata Conference. The conference was fantastic, the sessions were great, and meeting all these people was probably the best part of all!
After a day of tutorials, the second day at Strata was the first of two conference days, packed with fascinating sessions. The day was kicked off with a plenary session with a long list of top speakers in the field of data science: Edd Dumbill of O’Reilly Media, Alistair Croll of Bitcurrent, Hilary Mason of bit.ly, James Powell of Thomson Reuters, Mark Madsen of Third Nature, Werner Vogels of Amazon.com, Zane Adam of Microsoft Corp, Abhishek Mehta of Tresata, Mike Olson of Cloudera, Rod Smith of IBM Emerging Internet Technologies and last but not least Anthony Goldbloom of Kaggle. Various topics were presented in presentations of 10 minutes each, like data without limits, data marketplace, and the mythology of big data. The shortest presentation struck me most: “the $3 Million Heritage Health Prize”, presented by Anthony Goldbloom. People are challenged to create a predictive application that uses healthcare data to predict which people are most likely to go to hospital, so that ‘US healthcare becomes healthcare instead of sickcare’. The prize is $3 Million for the one who solves this!
Next up were the individual sessions, and I was very much looking forward to the talk “Telling Great Data Stories Online” by Jock Mackinlay of Tableau. Though the talk itself was excellent, for me it was all known stuff, but the talk is highly recommended for those unfamiliar with Visual Analytics or Tableau. Being biased towards visualization-related sessions, my next session was “Designing for Infinity” by Dustin Kirk of Neustar. Dustin showed 8 Design Patterns of User Interface Design, like infinite scrolling, which were really good. It reminded me of an updated version of the material in Steve Krug’s book Don’t Make Me Think.
Next up was the best talk of the day: “Small is the New Big: Lessons in Visual Economy”. Kim Rees of Periscopic showed us very good examples of effective information visualizations. I was really blown away by this presentation, mostly because she really showed how creatively removing clutter and distractions can make a visualization very effective. The creative interactions that help the user work with the visualization were also compelling. Next was Philip Kromer of Infochimps on “Big Data, Lean Startup: Data Science on a Shoestring”. Though I expected Philip to explain the Lean Startup principles evangelized by Eric Ries, the talk was more about Infochimps’ approach to doing business. Some remarkable comments by Philip: “everything we do is for the purpose of programmer joy”, and “Java has many many virtues, but joy is not one of them”. Great presentation and inspiring insights!
My next session was “Visualizing Shared, Distributed Data” by Roman Stanek (GoodData), Pete Warden (OpenHeatMap) and Alon Halevy (Google). After a short presentation by each, these three guys had a panel discussion where the audience could ask questions. Their discussion revolved mostly around the fact that all three deal with data that is created and uploaded by users, and how you deal with that: do you clean it, what’s the balance between complex query functionality and ease of use, etc. My final session was “Wolfram Alpha: Answering Questions with the World’s Factual Data” by Joshua Martell. Half the talk was a demonstration of the features of WolframAlpha, and the other half was more or less a high-level talk about how WolframAlpha handles user input, how data is stored, how user analytics is performed, and more.
The day ended with a Science Fair where students, researchers and companies were showing new advancements in the field of data science. There were really interesting showcases, like a simulation tool for system dynamics. But, again biased towards visualization, the one that struck me most was Impure by Bestiario. Impure is a visual programming language that allows users to easily create their own visualizations, both simple and very advanced. It was also great to see the passion of Bestiario for their own product.
Finally, one of the best things about the conference so far has been meeting people, some of whom I have only known virtually for some time now. I especially enjoyed meeting all the visualization people today. It’s really great to meet many of the online visualization community in person.
So again, a fantastic day at Strata, and I am looking forward to tomorrow!
Today was my first day at O’Reilly Strata Conference: a full day of tutorial sessions. The session I picked was the Data Bootcamp by Joseph Adler (LinkedIn), Hilary Mason (bit.ly), Drew Conway (New York University) and Jake Hofman (Yahoo!). The purpose of this bootcamp tutorial was to turn everybody in the room into data scientists by getting our hands dirty with some real hands-on experience.
The tutorial was kicked off with an introduction of the speakers, and a general overview of the various aspects of working with data: getting data, cleaning data, data-intensive applications, and much more. Then Drew gave an interactive introduction to visualizing data using Python and R. The audience had to produce a normal distribution of random numbers in R. And although some people managed to get along with all the examples, there were also lots of people struggling because libraries were missing, or simply because everything was going pretty fast, at least for R and Python newbies like myself.
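For readers who want to try something like that exercise without setting up R, here is a minimal sketch in Python using only the standard library. It is not the tutorial's code: the function names, the seed, and the text-based histogram are assumptions made for illustration.

```python
import random
import statistics


def sample_normal(n, mean=0.0, stddev=1.0, seed=42):
    """Draw n pseudo-random numbers from a normal distribution."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps runs repeatable
    return [rng.gauss(mean, stddev) for _ in range(n)]


def text_histogram(values, bins=10, width=40):
    """Return one text line per bin, with a bar scaled to the fullest bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # avoid a zero step if all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / step), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    peak = max(counts)
    return [
        f"{lo + i * step:7.2f} | " + "#" * round(width * c / peak)
        for i, c in enumerate(counts)
    ]


if __name__ == "__main__":
    samples = sample_normal(1000)
    print("mean:", statistics.mean(samples), "stdev:", statistics.stdev(samples))
    print("\n".join(text_histogram(samples)))
```

Running the script prints the sample mean and standard deviation followed by a rough bell-shaped bar chart; a plotting library would give a nicer picture, but this keeps the example free of the missing-library problems mentioned above.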
Next, Jake gave a great introduction to image processing, and especially to how you can cluster images based on similar features, color in our case. We used a K-Means clustering algorithm to cluster similar images based on color, and after that we classified images as either landscapes or head-shots.
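The idea behind K-Means can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the bootcamp's code: each image is reduced to a single average RGB color (hypothetical data below), and the starting centroids are chosen with a simple deterministic farthest-point rule instead of the random initialization that is more common.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Start with the first point, then repeatedly add the point farthest
    from all centroids chosen so far (an assumed, deterministic choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=50):
    """Cluster points into k groups; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # nothing moved, so the clustering has converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical average colors: three bluish "sky" images, three greenish ones.
    colors = [(70, 130, 220), (80, 140, 230), (60, 120, 210),
              (40, 140, 60), (50, 150, 70), (30, 130, 50)]
    centroids, labels = kmeans(colors, k=2)
    print(labels)
    print(centroids)
```

With real photos, each image would usually be described by a richer feature such as a color histogram, but the assignment and update steps stay the same.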
After the break, Hilary took over with a great presentation on working with text data. She started with some basic examples of extracting data from webpages using command-line tools like curl and wget, and using Python and the BeautifulSoup Python library. After that we turned to the main example: ‘hacking’ a gmail account and trying to get some valuable information out of it. Hilary showed us how to classify email using probability statistics, and then Drew took over to show us how to visualize this data and turn it into network diagrams.
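One common way to classify text with probabilities is a naive Bayes classifier. Whether the tutorial used exactly this method is an assumption; the sketch below only illustrates the general idea, and the class name, labels and example messages are all made up.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of messages
        self.vocabulary = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the summed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NaiveBayes()
    model.train("meeting agenda for monday project review", "work")
    model.train("project deadline moved to friday", "work")
    model.train("dinner with family on saturday", "personal")
    model.train("birthday party this weekend with friends", "personal")
    print(model.classify("project meeting on friday"))
    print(model.classify("birthday dinner with friends"))
```

Working in log space avoids multiplying many tiny probabilities together, and the smoothing keeps a single unseen word from ruling out a label entirely.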
Last but not least, Joseph gave a talk about Big Data. This was not an interactive session. Joseph shared some of his knowledge and experience of working with big data at LinkedIn, and explained the basics of Map/Reduce, Hadoop, and why and when to start thinking about big data solutions like Hadoop.
Overall it was an interesting day, also because I’ve met really great people. It was especially great to meet Naomi (@nbrgraphs), Kim (@krees), Jerome (@jcukier) and Daniel (@danielgm). For me the Data Bootcamp was above all an inspirational tutorial with lots of ideas to try out on my own. For some people the tempo was a little too high, especially if you’ve never programmed in R or Python before. And becoming a Data Scientist in just 1 day may be an illusion anyway. At least the tutorial gave me a good head start, lots of inspiration, and great lessons in how the presenters approach working with data. So for me, this was a great and successful first day, and I’m looking forward to the next two days!
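The basic Map/Reduce idea can be shown with the classic word count, here as a single-process Python sketch. It does not use Hadoop or any of its APIs; the function names are hypothetical and only mirror the three conceptual phases: map, shuffle, and reduce.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big ideas"]))
```

In a real cluster the map and reduce calls run in parallel on many machines and the framework handles the shuffle. That is possible because each map call looks at only one document and each reduce call at only one key.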
The source code and slides of the Data Bootcamp are available online at: https://github.com/drewconway/strata_bootcamp