Saturday, July 31, 2010

Which zipscribble map is sexier: Sweden Stiletto or Italy Boot?

I happened upon a cool website called eagereyes.org. There I read about a visualization called zipscribble maps. Essentially, zipscribble connects postal/ZIP codes in ascending order. On that note, take a look at these visualizations and tell me which country you find sexier.

Sweden
http://eagereyes.org/media/attachments/ZIPScribbleMaps/ZIPScribbleMap-Sweden-color-borders.pdf

Italy
http://eagereyes.org/media/attachments/ZIPScribbleMaps/ZIPScribbleMap-Italy-color-names-borders.pdf

I'm partial to Sweden, but don't let my preference sway you.
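
For the curious, here is a minimal sketch of the idea behind a zipscribble map: sort the postal codes in ascending order, then draw a line from each one to the next. This is my own illustration, not the code eagereyes.org uses. The record layout, the function names, and the sample coordinates are all made up, and it assumes the codes are fixed-width strings so that sorting them as text matches numeric order.

```python
from typing import List, Tuple

# Hypothetical record layout: (postal_code, longitude, latitude).
ZipPoint = Tuple[str, float, float]
Coord = Tuple[float, float]


def zipscribble_path(points: List[ZipPoint]) -> List[Coord]:
    """Return the coordinates ordered by ascending postal code."""
    # Assumes fixed-width codes, so text order equals numeric order.
    ordered = sorted(points, key=lambda p: p[0])
    return [(lon, lat) for _, lon, lat in ordered]


def scribble_segments(path: List[Coord]) -> List[Tuple[Coord, Coord]]:
    """Pair each point with the next one; these are the lines to draw."""
    return list(zip(path, path[1:]))


if __name__ == "__main__":
    # Made-up sample data, deliberately out of order.
    sample = [
        ("30000", 3.0, 3.5),
        ("10000", 1.0, 1.5),
        ("20000", 2.0, 2.5),
    ]
    for start, end in scribble_segments(zipscribble_path(sample)):
        print(start, "->", end)
```

Feed the segments to whatever plotting tool you like. The interesting shapes come from how each country hands out its codes, not from the code above.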

Verifiable.com to shut down 8/1/2010

If you never heard of verifiable.com, it's too late to try it now. It was a much-hyped data viz site. Here is the exact story of the shutdown, copied from Mr. Roseman's blog. I figured it was better to copy the post than to link to it, since the link may disappear someday, too.

Out with the Old business in with the New
We will be turning http://Verifiable.com off on Aug 1st. Unless something dramatic changes.
We have moved on to a new project: http://SaneBox.com. With 2 clicks we will separate your email into 2 folders: email you have to look at right now and email that can wait. It is what we have all been waiting for all these years of struggling through overloaded Inboxes. And it just works. Go and sign up!
I have high hopes that http://SaneBox.com fits the following parameters of a successful business:
serve a need actual people have RIGHT NOW
implement in such a way so that users don’t have to do ANY work or expend ANY energy except for paying
avoid user interfaces like the plague.
be able to field something really quickly: not years, not months, but weeks.
get the product into users hands quickly and iterate on their daily needs often

Here is the story of http://Verifiable.com. It violated every one of those rules.

We worked on Verifiable.com from Feb 2007 to Feb 2010. Why did we do it? It was a crazy idea born from sitting in a Tufte conference and listening to his ideas on what data analysis should be and the problems that came from poor visualizations. When he showed two Challenger shuttle data charts:
the one they actually used to decide to launch that morning (showing only problem launches) Note that the connection between O-Ring events and temperature is not clear.
the one they should have used (showing all launch data). Note that ALL launches below about 64 degrees produced O-Ring damage. This chart makes the connection apparent by including all data not just positive data.
We were sold. It was so obvious and clear. We would make a tool that would ensure that this kind of mistake would never be made again. Tufte hammered this point home for us by saying that the tool he always wanted, one that enabled the user to make a presentation of the evidence regardless of the type of evidence, still did not exist. Ah Ha! We would make that tool. As my friend Dennis (the hardware engineer) used to say, it is just a bunch of do loops - how hard could that be?
Personally, this project cost me a lot of money and energy but worse than that it essentially wasted the time of the most talented engineering team I have ever worked with. I am going to lay out the chronology here as a cautionary tale to anyone else out there who is choosing a project for a new business.
We started the project for all the wrong reasons. We wanted to save the world. We wanted to provide a new technology that would allow people to have easy access to facts and data to back up their conclusions. We saw that this didn’t yet exist in an easy-to-use format. We saw others that had tried and we figured they had failed because they did a bad job and we could do better. The lesson to learn here is: if there are burning husks of cars along the roadway, you should probably assume that driving there is dangerous, regardless of how good a driver you are.
We misestimated how hard the project was. Specifically we misestimated how hard the following issues would be:
how many edge cases there are in any complex user interface
how little patience users have for complex user interfaces
how hard it is to find clean data and upload it
how little interest people have in finding clean data and uploading it
how much time and skill it requires to find clean data and upload it
how much time and skill it requires to make a decent, thoughtful chart from that data
how impossible it is to test complex user interfaces (we even convinced the funfx guy to help out and we still failed to develop a decent overall easy to run test framework)
how fragile a complex flex based user interface is. fix one thing break another.
A couple of interesting highlights: our google foo was mighty. The lesson there is that if you have the correct relevant answer to a google query - google will index you. All the manipulations of google and their index seem to miss this point. For years there were certain queries about cement production and unemployment where we were the first or second links. The corollary to this rule is good google foo won’t save you. You need that traffic to translate into a community. A visit is not an interesting statistic, especially in a business that requires the community to produce content.
But, in the end we could never as a company succeed if we were the only ones creating content. And with the exception of a couple of users, we were the only ones making compelling content with our tool. For three years we fought that battle. And to let our users off the hook a little, I admit that it is just hard. The clean data is all tied up with ridiculous trademark issues (yes people are still fighting to preserve their rights to claim a number is theirs). Write to your Congress people - why do we give The Pew Foundation a huge tax break (essentially funding them with federal dollars) and allow them to claim ownership and restrict access to their data?
Were our technology choices part of the problem? We chose ruby on rails. Did we have scaling problems with ruby and rails? Maybe some, but ruby is still a very fun language if you can afford the cycles. We chose flex. I didn’t realize exactly how fragile flex was until we were deep into the problem. But even looking back I don’t know that there was a better technical solution at the time. Were I to start today, I would use google charts, which is all magical javascript, or we’d use/improve some jquery charting packages.
But, using google charts today, I think we still would fail to solve the Verifiable problem. The evidence of that is that the only charting sites that are getting any traction are ones that provide users with the ability to chart their personal data with just a couple of clicks. The emphasis is on “personal” data and “a couple” clicks. The big global questions about climate, justice, happiness, truth, which super hero is stronger fall flat in the face of how your stocks are doing or how much weight you are losing.
We chose thinkingsphinx - I have no idea if that was a good idea. Our search never worked as well as it should. A content website requires perfect search. Ours was not perfect by any stretch of the imagination. Two things are required for good search. I define “good search” as a pretty good hit rate with relevant results. It is impossible to get that without sufficient content to answer most questions. And that content has to be tagged and described sufficiently so that the search has some sort of hook to figure out relevance. Additionally you need a system where it is easy to adjust what you mean by relevance since that will change as you understand the problem set. Ours was like moving a small hippo out of a parking space.
The guys were great. The users who I did meet were great. The project was a lot of fun for a lot of the time. It was just too hard for both us and the users. And frankly, I am becoming convinced that “truth” can not be boiled down into numbers. There is always context. And without the context, the numbers don’t help nearly as much as you would like. As a geek this is hard to admit.
For those of you that used the tool and helped out. Thanks! We’ll miss you.
If you use gmail and are tired of wading thru tons of crappy mail to get to the few golden nuggets come sign up for http://sanebox.com - two clicks and no work! I promise!
Best wishes to you all,
Stuart
Stuart Roseman
Soon to be Ex-President of Verifiable.com
current and future President of Sanebox.com

Wednesday, July 28, 2010

A picture is worth a thousand words.

Regardless of who uttered the title of this blog post, that statement is as true today as it was the day it was first said. As an avid user of Excel for financial modeling and analysis, I am fairly well skilled at developing spreadsheets. Yet the last thing I want to give my audience, or any audience, is a long spreadsheet filled with page upon page of numbers. Nothing is less exciting, or puts me to sleep faster, than boring spreadsheets. But graph the data and voilà, the data comes alive! When a chart is designed properly, all viewers can recognize what it is telling us. Thus, data visualizations, or data vizes as they are commonly called, are an equalizer. Certainly there are pretty sophisticated data vizes out there, some created by PhDs. PhD or not, once you understand what the picture is telling you, you are on your way to becoming better informed.



Take a look at these two websites. The topics are as interesting as the data vizes are imaginative. Can you imagine your information dashboard looking like this?

http://www.yanamitchell.com/#98c/tumblr

http://flowingdata.com/about/



Enjoy!