Archives for category: Data + Graphics

CAR 2011 was stuffed full of information, so much so that the only way to keep up with everything has been to keep a log of what people have been sharing.

Feb. 28 update: Thanks to everyone who’s forwarded additional links and presentations (I’m marking them with NEW as they’re added) and to all who’ve sent me nice notes about this list.

Philip Smith forwarded a JSON file of NICAR tweets with links in them. Want it? Download it.

This year’s conference looks to have been a tremendous success, bringing in the most registered attendees in nearly a decade. Congratulations to NICAR for a terrific, educational and inspiring event.

A more narrative look at what happened at the conference can be found on the conference blog. But if you’re anxious to dive in, this is your buffet: Prepare to have your mind blown.

Got links from sessions you attended? Post them in comments or ping me on Twitter @MacDiva and I’ll add them to this list.

Jump to Tutorials | Software & Tools | References | Work Samples

Presentations & Tutorials

NEW Using TwitInfo and TweeQL to find and tell stories (from Adam Marcus)

NEW A Gentle Introduction to SQL using SQLite: slides, full tutorial and steps only (from Troy Thibodeaux)

Valet Parking Your Django App (from Jeremy Bowers)

Similarity algorithms using Python (from Luke Rosiak)

The Quick and Dirty Varnish Setup for Django (from Andy Boyle)

Making HTML Tables Interactive (from Michelle Minkoff)

View more presentations from Michelle Minkoff

QuantumGIS 1 tutorial and files (16.3MB) (from Timothy Barmann)

Use JavaScript and jQuery to create interactive maps: tutorial and files (17.5 MB) (from Timothy Barmann)

How to break news online – and use LA Times app engine tools (from Ben Welsh)

NodeXL for Social Network Analysis (from Peter Aldhous)

Excel, CAR and mapping training tipsheets, slides and datasets (from Jennifer LaFleur)

My Favorite (Excel) Things (from MaryJo Webster)

Latest in Mapping tools and examples

(Ruby) Coding for Absolute Beginners (from Dan Nguyen) – following the tutorial will produce “My Very First Web Page”

Google Refine tutorial and datasets (that download to your hard drive on click) (from David Huynh)

APIs: Making the Web a Data Medium (from Anthony DeBarros and Derek Willis)

NEW R for Statistics: First Steps (from Peter Aldhous)

NEW R for Statistics: Automate Your Analysis (from Jacob Fenton)

Hands-on R, a step-by-step tutorial (from Jacob Fenton)

Ruby4Kids — mentioned in passing as a low-friction way to learn the basics of Ruby

How to make an intensity map with custom boundaries using Google Fusion Tables

Google Fusion Tables tutorial

Cracking Open Electric Records slides and case law (link launches a PDF bundle) (from DB Smallman)

Internet Reporting: What You Should Know (from Jack Gillum)

Free software: From Spreadsheets to GIS, Part 1 and Part 2 (from Jacob Fenton and Anthony DeBarros)

Beyond Mapping: Spatial Analysis on the Cheap (from Long Creative)

Beautiful Data (from Aron Pilhofer)

View more presentations from pilhofer

Intro to Python, Session 1 tutorial and Python tipsheet (from Jacqueline Kazil and Serdar Tumgoren)

Getting into a data-oriented mindset (from Mary Jo Webster and Wendell Cochran)

Dataviz for beginners (from Matt Stiles and Sanjay Bhatt)

MGRS Explained (from Jacob Harris)

Data Visualization with JavaScript and HTML5 (from Jeff Larson)

Tutorial: Census Data with Tableau Public

PostGIS is Your New Bicycle – be wowed by a free alternative to costly desktop GIS (from Mike Corey and Ben Welsh)


Software & Tools

3Scale – API management and monetization tool (free trial)
API Playground – try APIs, no coding skills necessary
Backbone.js adds a models-collections-views structure to JavaScript applications
BatchGeo interactive map maker – business search engine browser compatibility tables
Census Block Conversions API
ChangeTracker from ProPublica – track changes to any website
ChinaVitae – learn who’s who in power in China
CollegeInsight – compare universities by cost, financial aid, diversity, job placement rate
DataWrangler cleans and transforms data
Download manager DownThemAll is a Firefox extension that grabs webpage links and images.
Europe Media Monitor’s NewsBrief – an international alternative to Google News
EUROCONTROL – “find blocked private planes that might have flown to Europe, for example: see which executives are going to Cannes”
FCC Census Block Conversions API – boundary service API, excellent for mapping
The FireShot Firefox extension creates browser screenshots, adds annotation and more.
Foreign Labor Certification Data Center – find what visas a company has applied for (there may be wage information tied to the application)
Get Lat Lon – finds latitude and longitude for any location worldwide
Free Google Drawings wireframe templates
Google Fusion Tables for data analysis and visualization
Google Refine for data cleaning
Inmarsat Ships Directory – look up a ship’s phone number
JSFiddle online JavaScript editor
Jigsaw: “Visual analytics for exploring and understanding document collections”
Little Sis – visualizing the networks of social, financial and political power – track vessels in real time
Mayan open source, Django-based document manager
Mr. Data Converter converts Excel data into web-friendly formats
NETROnline – public records search, especially good for real property lookups
NodeXL uses Excel for network analysis
NodeXL Teaching lessons and tutorials – lookup phone numbers around the world
Outwit Hub – Firefox plugin for scraping websites
PDFonFly – converts web pages to PDFs
PhraseNet diagrams relationships between words in text
PostGIS – adds mapping ability to PostgreSQL
PrivacyChoice – rates website privacy policies
PySAL – an open source Python library for spatial analysis
R statistical analysis software
R libraries recommended by Amanda Cox, Jeff Larson and others: ggplot, RColorBrewer (color picker), rgdal (bindings for GDAL – the Geospatial Data Abstraction Library), survival (survival analysis)
Recorded Future – temporal analysis search engine uses predictive analytics to discover the likelihood of events in the future
RSRuby – use the R environment in your Ruby program
Rubular – test your regex on the fly
Simile Timeline
Scraper Wiki
Snitch.Name – people lookup
Tableau Public
TinEye finds information on uploaded images, including usage, higher resolutions, modified versions
TweeQL – access the Twitter API using SQL syntax (requires Python)
TwitInfo – chart Twitter keyword frequency and sentiment
USA Spending – see what the US government is spending money on


References

NEW Journalists learning Python Google group
NICAR ‘Net Tour – an index of links from IRE for watchdog research and learning computer-assisted reporting
The New Precision Journalism (from Philip Meyer)
The Logic Of Causal Order by James A. Davis (recommended by Philip Meyer)
US Government Health Data
Health Indicators Warehouse
Coordinate Systems Overview for mapping
Concepts of Probability (statistics!)
Advanced Probability and Statistics, 2nd Ed. by the CK-12 Foundation
Thomas Lumley: work page (statistics! and Amanda Cox’s professor)
Hadley Wickham (statistics! and the maker of ggplot for R)
Graphical Inference for Infovis by Hadley Wickham, Dianne Cook, Heike Hofmann and Andreas Buja (“How do we know if what we see (in a data visualization) is really there?”)
“Be Careful What You Do With That Cell Phone Recording; It Could Land You in Jail” (from DB Smallman)
Gary’s Social Media Count – see the volume of social media activity
Quantitative Discovery from Qualitative Information: A General-Purpose Document Clustering Methodology by Gary King
Producing Online News: Digital Skills, Stronger Stories by Ryan Thornburg
US State Department Foreign Affairs Manual, section on information security, a.k.a. 12 FAM 500
Five Databases in 50 Minutes: Government Session (from the CAR2011 conference blog)
News Apps: What Works and Why (from the CAR2011 conference blog)
Analysis-ready census data (from USA Today, available to NICAR members only)
A directory of statistics bureaus by country (from Statistics Sweden)
Data Visualization for Beginners (from the CAR2011 conference blog)
Tracking the Economy and Business (from the CAR2011 conference blog)
Benford’s Law (statistics!)


Work Samples

The Wall Street Journal investigative report, “Confidentiality Cloaks Medicare Abuse,” with database created by Mo Tamman
The Center for Public Integrity investigative report, “Unproven for Older Women, Digital Mammography Saps Medicare Dollars”
The Year in CAR
• Des Moines Register potholes map – find the most on-time flights between cities (uses US government raw data)
Employment Market Explorer – find out what the local employment market looks like. Compare local, regional and national rates and labor market dynamics. (uses US government raw data)
WildTrack – using data to monitor endangered species populations
Roundup of state-based 2010 census stories
The Killing Roads – interactive map of highway accidents in Norway
The entire King James Bible as a word tree
Who Runs HK – network graph of the people in power in Hong Kong
Research by Martin Wattenberg, including the highlighted works, Name Voyager, Map of the Markets, Shape of Song and Fleshmap

Over the weekend, I went to Jer Thorp’s Processing and data visualization workshop to dig deeper into the program.

While I don’t have new code to show yet, today I started looking for additional learning resources. Artist Marius Watz is publishing a free series of Processing primers on Modelab. The examples are fully commented, so even if you’re fairly new, it’s easy to follow along.

Daniel Shiffman, who wrote “Learning Processing: A Beginner’s Guide to Programming Images, Animation, and Interaction,” is planning a new book, due to be published this summer. It’s on Kickstarter:

Daniel’s got tutorials and excerpts from his current book online for those curious about his writing style and looking for additional examples to learn from.

Have some additional sites and sample files you’d like to share? Leave a note and help create a standing resource.

Here’s something fun and educational: Feather, an embeddable, lightweight HTML5 photo editor by Aviary. For user instructions, see the Google doc.

Want your own? Get the API key and auto-generated code from

polka dots

See those dots? They’re not drawn. I programmed them using a 2D and 3D development environment called Processing.

It may not look like much, but it’s a start, thanks to a workshop taught by artist and instructor Jer Thorp, who’s currently Data Artist in Residence at The New York Times.

Sounds like a very cool job to me.

Meanwhile, this week’s assignment is to build on some of the workshop exercises — and to figure out how to export the files to my server so you can interact with them.

growing boxes

For a few years now, I’ve been keeping an eye on the Gel Conference, where people gather to talk about experience, perception and customer service.

One of this year’s speakers was mapmaker Connie Brown of Redstone Studios. Her one-off painted maps show not just geography but perspective. In her 20-minute presentation, she shows us how maps are both descriptive and opinionated. It’s worth watching, whether your preference is for the science or the art of cartography.


Connie Brown, mapmaker, at Gel 2010 from Gel Conference on Vimeo.

Graphics director Steve Duenes and graphics editor Archie Tse talk about what goes into the visual storytelling elements that The New York Times has become so known for.

Who’s got the biggest social network per country? The BBC charted Nielsen’s figures from June 2010 and from a year ago. Facebook had the largest audience by far in both months, while MySpace has dropped quite a bit.

What’s more interesting is the change in the amount of time people are spending on Facebook every month. I’d really like to know the demographics of the surveyed population. Anyone have info?

BBC charts Nielsen social network audience numbers.

(via BBC News)

I once knew a business editor who griped a lot about the typical story that would cross his desk: “You’re dazzling people with big numbers instead of telling them anything meaningful!”

My takeaway: Always create context around data.

When most people think of data, they think of numbers. But most dictionaries define the term along the lines of “facts and statistics collected together for reference or analysis.” Remember that.

As the technical foundation of online journalism moves toward structured, semantic data examined by people with expertise (or at least curiosity), we will probably find ourselves wondering how many people we’re reaching and how it happens.

Site metrics are one way. Another is social network analysis.

Among the interesting tools out there is the Infochimps API, which is currently in beta. On their blog, you’ll see this:

Infochimps API in action

It shows one Twitter user’s network and the connections within it. While the example was produced by someone running a business, it could easily be applied to a journalist interested in understanding their own networks (sources, readers, colleagues, etc.).
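
If you want a feel for what that kind of analysis involves before touching any API, here is a minimal sketch in plain Python. It does not use the Infochimps API; the list of who-follows-whom pairs and all of the account names are made up for illustration. It simply sorts one user's connections into followers, accounts they follow, and mutual connections.

```python
# Hypothetical "who follows whom" pairs (follower, followed).
# Every account name here is invented for illustration.
FOLLOWS = [
    ("reporter", "editor"),
    ("editor", "reporter"),
    ("reporter", "source_a"),
    ("source_a", "reporter"),
    ("reader_1", "reporter"),
    ("reader_2", "reporter"),
    ("reader_2", "editor"),
]


def summarize_network(follows, user):
    """Return the followers, following and mutual connections for one user."""
    followers = set()
    following = set()
    for src, dst in follows:
        if dst == user:
            followers.add(src)
        if src == user:
            following.add(dst)
    return {
        "followers": followers,
        "following": following,
        # Mutual connections follow the user and are followed back.
        "mutual": followers & following,
    }


summary = summarize_network(FOLLOWS, "reporter")
for label in ("followers", "following", "mutual"):
    print(label, sorted(summary[label]))
```

For a journalist, the mutual group might be a rough stand-in for sources and colleagues, while the followers-only group looks more like readers. That interpretation is an assumption, not something the data proves.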

From the Infochimps blog post:

Coupling Influencer Metrics with Trstrank would enable a promoter to identify not only the users most likely to engage, but also the most influential of those users. Throw Wordbag into the mix and a promoter could also discover if users in the active, influential target population have a potential interest in their product.

What other examples can you come up with?

Just wanted to bring your attention to some news-related projects that launched this week:

ReportingOn logo
Reporters looking for advice from other reporters should take a look at ReportingOn.

Ryan Sholin’s revamped site is like a help forum for news developers and journalists, particularly beat and local journalists. Follow ReportingOn on Twitter. You’ll find me on ReportingOn too.

Everyblock logo
The hyperlocal news and data site Everyblock released its source code, much to the delight of Django developers everywhere.

Everyblock is the brainchild of Adrian Holovaty, one of the co-developers of the Django framework. Read more about the project, poke around and see what you find.

Personally, I’d also like to see the source code for the Everyblock iPhone app, but one thing at a time.

W3C Mobile Web Initiative logo
If you want to learn more about mobile site design, consider signing up for W3C’s first-ever live training session in Cambridge, UK.

The event takes place Oct. 13. Registration — at a hefty €399.00 (about US$558 at today’s exchange rate) — includes lectures and hands-on workshops, as well as access to the nine-part course. Read the full description, register online or read more about the W3C Mobile Web Initiative.

Feel free to browse around the blog. A few of the most popular posts on Ricochet include:

What ideas and tools would you like to know more about? Drop a comment, or ping me on Twitter @MacDivaONA.