Archives for posts with tag: NICAR

NICAR 2016 banner
The short link to this list is j.mp/nicar16 (case sensitive).

Jump to
Presentations & Tutorials | Software & Tools | References & Additional Resources | Lightning Talks | Work Samples

Presentations & Tutorials



Software & Tools



References & Other Resources from NICARians



Lightning Talks

  • I Improved My Math Fluency, And So Can You slides (Ryann Grochowski Jones)
  • Solve Every Statistics Problem with One Weird Trick slides (Jonathan Stray)
  • Let lookup save you from the boring, repetitive work you’ve forgotten you’re even doing slides (Chris Groskopf)
  • Automation in the newsroom slides (Ariana Giorgi)
  • Regular Regular Expression Exercises for Regular People slides (Dan Nguyen)
  • Map tiles are dead; Long live (vector) tiles! slides (Ken Schwencke)
  • How to read 52 books in 52 weeks slides (Nicole Zhu)
  • What I learned working on Failure Factories slides (Adam Playford)
  • Let’s Talk About the Future of Interactive News Content slides (Gregor Aisch)
  • Cats and Stats slides (Jennifer LaFleur)



Work Samples

For previous years’ tutorials, videos, presentations and tips see the lists from 2015, 2014, 2013, 2012 and 2011.


NICAR 2015 banner
The short link to this list is j.mp/nicar15 (case sensitive).

Consider donating to IRE
This is the fifth anniversary of my NICAR Links List! If you’ve found the lists helpful, consider donating some money to IRE to help them continue training people and bringing NICAR to you. Donate today. You know you want to.

And now, on to business…
It’s back! The annual collection of presentations, tutorials, and resources from IRE’s CAR conference. This year’s event comes to you from Atlanta, March 5 – 8. Keep up with the chatter on Twitter at #NICAR15.

For attendees, IRE has created a schedule in The Guidebook app (iOS, Android & Web). Very helpful for planning the tactical mission known as “managing your time.”

Jeremy Singer-Vine also created CSV & JSON outputs of the schedule, along with the Python scraper to DIY. And there’s a Google spreadsheet with all the sessions. Awesome.

If you’re presenting at NICAR and would like this list to include your resource (presentation, tutorial repo, etc.), please send it using this form, or ping me on Twitter @MacDiva.

If you’re looking for a job, IRE keeps a list of open positions as does Knight-Mozilla OpenNews at Source Jobs. If you’re specifically interested in data visualization jobs, look here.

And finally, NICAR in the Peach State will see its first California Code Rush. The Golden State’s campaign finance and lobbying database is online, and the California Code Rush aims to make the data easier to download, review and republish. It’s an open source project with lots of opportunities to help.

For previous years’ tutorials, videos, presentations and tips see the lists from 2014, 2013, 2012 and 2011.

Jump to
Presentations & Tutorials | Software & Tools | References & Additional Resources | Lightning Talks | Work Samples

Presentations & Tutorials



Software & Tools

  • Tarbell – Google spreadsheets-based website publishing tool
  • Landsat-util: A utility to search, download and process Landsat 8 satellite imagery
  • JPL’s SMAP Viewer (SMAP is “Soil Moisture Active Passive” satellite imaging)
  • Plotly – graph and share your data
  • Plug Tableau into Excel with Tableau’s Reshaper
  • Bokeh Python interactive visualization library
  • markdowneyjr turns Markdown into JSON for slightly easier copyediting of data files
  • Changedetection.com tracks website page changes and notifies you.
  • The New York Times graphics desk’s ai2html changes Adobe Illustrator files into HTML & CSS | example output
  • The New York Times graphics desk’s ArchieML – a structured text format optimized for human writability
  • Minezy email exploration tool (prototype by T. Christian Miller)
  • TimelineCurator works with TimelineJS to extract temporal references in freeform text to generate a visual timeline
  • The Upshot’s Bedfellows command-line tool for exploring the PAC donor-recipient relationship



References & Other Resources from NICARians



Lightning Talks



Work Samples


NICAR14
The short link to this list is j.mp/nicar14 (case sensitive).

Almost 1,000 people registered for the annual Computer Assisted Reporting conference this year, making it the biggest NICAR ever. Thanks again to Stephen Stirling and Frederick Kaimann of the New Jersey Star-Ledger for creating NICAR bingo with code lent by WNYC.

Make note: NICAR 2015 will be March 5–8 in Atlanta.

This is a collection of all the practical knowledge journalists specializing in investigative reporting shared in four days. It is a lot, and deep learning takes time, so consider this your archive.

If you went to the conference, Matt Waite has some good advice for how to make the most of the enthusiasm and frenzied exhaustion you’re feeling immediately after coming home. I strongly suggest you not only read it, but take it to heart. Especially the kicker.

Have session materials? Send me email or ping me on Twitter @MacDiva and I’ll add them to this list.

If you’re looking for a job, IRE keeps a list of open positions and OpenNews Source just launched their jobs list. If you’re specifically interested in data visualization jobs, look here.

For previous years’ tutorials, videos, presentations and tips see the lists from 2013, 2012 and 2011.

Jump to
Presentations & Tutorials | Software & Tools | References | Lightning Talks | Work Samples

Presentations & Tutorials


Make your first news app (from Ben Welsh)
Build maps with leaflet and mapbox.js (from Becca Aaronson)
Creating maps: principles, mistakes, and potential (from Noah Veltman & Tom MacWright)
• Excel Magic class handout and Excel data (from MaryJo Webster)
• 50 ideas 50 minutes handout (from MaryJo Webster)
Maps and Charts in R: Real Newsroom Examples (from Matt Waite)
Intro to MySQL tutorial materials (from Liz Lucas)
PostGIS + CartoDB (from Michael Keller & Andrew Hill)
Demystifying D3, an intro to the grammar of graphics (from Alastair Dant)
Introduction to D3.js (from Irene Ros)
Demystifying d3.js Workshop (from Irene Ros)
Everyday Scripting (from Agustin Armendariz)
Amazon Cloud Basics (from Scott Klein)
• Grabbing Data from Websites: tips & tricks (from Scott Klein)
Intro to Tableau (from Jewel Loree)
• SQLite from the Command Line slides & GitHub repo (from Matt Kiefer)
Working with NPR’s Apps Template (from Tyler Fisher)
Insight and Enlightenment and an expansion on data, patternicity and biases (from Alberto Cairo)
• Notes from The Data-Driven Story (from Stephen Suen)
• Data-Driven Story: Putting the Package Together slides (from Maud Beelman)
Love Your Life, Retire Your Servers (from Andy Boyle & Tasneem Raja)
Getting Started with Excel (from Helena Bengtsson)
NodeXL for Network Analysis (from Peter Aldhous)
• Investigating Racial Inequality in Your Region Presentation | Tipsheet (from Lawrence Lanahan)
Mapping 1: displaying geographical data with QGIS (from Peter Aldhous)
Mapping 2: Manipulating geographical data with QGIS (from Peter Aldhous)
Counting and Summing with SQL (from Andrea Fuller)
Digging online for global data (from Jonathan Stoneman)
Mining the Census for Every Beat (from Ronald Campbell)
• Census I: Must-have data for every beat slides & handout (from Paul Overberg)
• Census I: Crunching Census Commuting Data handout (from Mike Maciag)
• Census II: slides (from Paul Overberg)
• Data Deep Dive I handout (from Paul Overberg)
Free CAR Tools (from Matt Wynn & Martin Burch)
• Harnessing the Power of the Crowd presentation (from Robert Benincasa) | notes (from Stephen Suen)
What to Consider Before Scraping (from Isaac Wolf)
• Tools for cracking PDFs panelist notes (from Jeremy Merrill) | Notes (from Justin Myers)
• The customized Census: How to use microdata when you just can’t find the right table slides (from Robert Gebeloff) | notes (from Justin Myers)
• Justin Myers’s Dig into business with data investigations notes
• Justin Myers’s Enhance your stories with statistics notes
Mining Health Care Data (from Peter Eisler)
How to make a story map with photos, text and ArcGIS (from Sharon Machlis)
Intro to R & Beginners’ Guide to R (from Sharon Machlis)
A few of my favorite (health data) things (from Charles Ornstein)
How ProPublica’s Prescriber Checkup Came Together (from Charles Ornstein)
Intro to GitHub (from Jordan McCullough)
Collaborative Reporting with GitHub (from Ben Balter)
Mining Nonprofit Data (from Kendall Taggart)
Complaints: A road map for killer investigations & State Consumer Complaint Contacts (from Tisha Thompson & Jill Riepenhoff)
A Reporter’s Guide to Unleashing E-Docs (from Deborah Nelson)
• Learn how to use Census Microdata (from Katie Genadek)
Dataviz for Everyone slides (from Chris Amico, Lena Groeger & Ryan Pitts)
Keeping tabs on crime slides (from Laura Norton Amico)
How to Feel Like You’re Hacking Without Really Doing It (from Samantha Sunne)
Campaign Finance I: Mining FEC Data ZIP file of slides & tipsheet (from Chris Schnaars)
• Storytelling as Presentation Tool Slides (from Chrys Wu, Helene Sears, Aron Pilhofer & Alyson Hurt) | Notes (from Stephen Suen)
Cooking With Hardware (from Team Blinky)
Intro to Ruby (from Al Shaw)
When to Scrape (from Nils Mulvad)
Build a police scanner for $20 (from Ken Schwencke & Jon Keegan)
How Panda Works (from Christopher Groskopf)
• Weathering the Storm presentation & tipsheet (from Stephen Stirling & Ian Livingston)
Make Dirty Data Shine with OpenRefine (from Frederick Kaimann)
• Threat Modeling: Planning Digital Security for your Story video and slides (from Jonathan Stray)
• The Wall Street Journal Encrypted Chat installation instructions
PyCAR Python mini-bootcamp (from Tom Meagher)
Getting Started With Python (from Anthony DeBarros)
• Intermediate Python: Refactoring 101 Documentation | GitHub repo and a well-commented example (from Jeremy Bowers, Serdar Tumgoren & Katie Park)
What is a Data Desk (from Ben Welsh)
• Crossing the language boundaries across your newsroom: journo to dev and back notes (from Stephen Suen)
Intro to Google Earth Engine (from Vanessa Schneider)
• Deep Data Dives notes (Team Al Jazeera US & friends)
Learn Regex (from Amanda Hickman)
Rifling Through the Mapping Toolbox (from Michael Corey & Ryan McNeill)
Census III: mapping & presentation (from John Keefe & Chris Amico)
• How to remove water from census shape files (from John Keefe)
• PDF Scraping With Tabula, including an explanation of its algorithms (from Jeremy Merrill)
• Tracking Hazardous Waste (from Ben Poston)
• Social Media for Investigation tools handout (from Mandy Jenkins & Robert Hernandez)
• Build your Twitter bot army – Notes (from Stephen Suen)
• Connecting Charts to Live Data slides & spreadsheet (from Timothy Barmann)
• Tips for Covering Money in Politics stories (from Jack Gillum)

Pre-NICAR Events
• Reynolds Center Detecting Corporate Fraud workshop slides & handouts | Joanna S. Kao’s notes
Why Does Fraud Happen? (audio from Theo Francis)
Going through SEC’s 10-Ks, 10-Qs and more (audio from Theo Francis)
Don’t be intimidated (audio from Theo Francis & Roddy Boyd)
• TechRaking 5-ish (CIR) – Bootstrapping the News


Software & Tools


Campaign Finance Tools (from Aaron Bycoffe)
Computational Journalism on a Stick (from M. Edward Borasky)
FOIA Machine
• What Do They Know (UK FOI)
Wakari.io, Web-based Python data analysis
Oatmeal geocoded
Kartograph framework for building interactive map apps
OpenRefine for data cleaning
The Miso Project for interactive storytelling and data visualization
D3.chart from The Miso Project for building reusable charts with d3.js
TextQL – execute SQL against structured text like CSV or TSV
Rank and Filed – search SEC filings for free
CometDocs (free for IRE members)
Import.io transforms websites into structured data or an API
Investigative Dashboard – helps expose illicit ties that cross country borders
Captricity can extract handwriting from paper forms and PDFs
Tableau plug-in for Excel
Panopticlick shows how unique your browser is. You may not be as private or hidden as you think.
Spark.io – wifi hardware to DIY
• Use GPGTools to encrypt email and manage OpenPGP keys
Google 2-step Verification
• Make a calculator with Equation by Sisi Wei & Steven Melendez
Stacked Up – check that Philadelphia neighborhood schools have all of the required instructional materials before school resumes in fall
Shut That Down – see who’s funding hate in your state
Sunlight Foundation APIs
Census Reporter
IPUMS (Integrated Public Use Microdata Series) offers complete-count data from 1800s censuses of Canada, Great Britain, Norway, Sweden and the U.S.
• Brown University’s US2010 census project
Website Watcher tracks site changes
• Find phone numbers with AnyWho (U.S.) | Worldwide: Infobel & Numberway
Snap Bird searches your tweets & DMs and friends’ tweets
Foller.me Twitter analytics
• Twitter’s own analytics tools
Tweetbeep Twitter analytics
DownThemAll browser plugin
• NPR’s Apps Template
• Chicago Tribune’s Tarbell (Google Spreadsheets + AWS)
Vega visualization grammar
Lyra visualization design environment
Overview Project
Open Source Alternatives a.k.a. OSALT
Tabula
Tineye reverse image search
• Falcon Google Chrome extension for people search
• Cryptocat private chat for Web browsers and iPhone
• Tor Project prevents traffic analysis
Freze saves screenshot + website source code
• Twine is an open-source tool for telling interactive, nonlinear stories


References


• The IRE-NICAR Database Library
• Alberto Cairo’s blog, The Functional Art
• Mike Bostock’s Let’s Make a Map tutorial
• “How Designers Destroyed the World” by Mike Monteiro
• “The Grammar of Graphics (Statistics and Computing)” by Leland Wilkinson et al.
How to Read Histograms and Use Them in R
What statistical analysis should I use? (from UCLA — Go Bruins!)
Econometrics lectures by Mark Thomas, University of Oregon
Fracking tipsheet (from Mike Soraghan)
FollowTheMoney.org
• Make Tidy Data from start to finish by Hadley Wickham
Easing Functions Cheat Sheet by Andrey Sitnik
Mapmakers Cheat Sheet by Tom MacWright
• Information on the sustainability of digital formats from the Library of Congress
• Scott Murray’s D3.js tutorials
Data Resources for Dams, Impoundments and Levees from Society of Environmental Journalists
ArcGIS Gallery of maps, maps, maps
Causes of Death in the World (1990, 2005, 2010) from Health Intelligence
• The Pew Research Center Data Feed
New Directions in Cryptography (PDF) by Whitfield Diffie & Martin E. Hellman
Best practices for FOIA & government information requests (from Office of Government Information Services)
FERPA Fact fact-checks the use of the Family Educational Rights and Privacy Act when denying access to public records. A Student Press Law Center project.
• Edward Tufte’s Sparkline theory and practice
A Map That Wasn’t a Map – Mother Jones case study
VINELink – find out if someone is incarcerated
National Missing and Unidentified Persons System (NamUs)
• Federal Bureau of Prisons Inmate Locator
Federal Reserve Economic Data (FRED), St. Louis Federal Reserve
Algorithmic Accountability Reporting paper by Nick Diakopoulos
PythonJournos Google Group
National Historical Geographic Information System
Data.gov – the U.S. government’s open data repository
How to Mail Merge in Microsoft Word
• Easy maps with Ari Lamstein’s choroplethr
Six Provocations for Big Data by danah boyd & Kate Crawford
• Noah Veltman’s explanation of static vs. dynamic websites
• “Building Data Science Teams” by DJ Patil
• The ultimate in user testing (seriously): Test your mobile app on drunk users
• How to set up your laptop to develop news apps the NPR way
• “Multiliteracies for a Digital Age” by Stuart A. Selber (library lookup | Amazon | Southern Illinois University Press)
• Noah Veltman’s Learning Lunches – an effort to demystify technical topics that come up often in newsroom development
• “Reverse Engineering Chinese Censorship through Randomized Experimentation and Participant Observation” by Gary King, Jennifer Pan and Margaret E. Roberts
• Political Framing Blog uses machine learning to find trends in congressional rhetoric


Lightning Talks


• Refactoring; or Why Your Code Sucks and How to Fix It – Christopher Groskopf
• A Few of My Favorite Wee Things – Lena Groeger
• Natural Language Processing in the kitchen – Anthony Pesce
• Five (more) algorithms in five (more) minutes (GitHub repo | Video) – Chase Davis
• What we can learn from terrible data viz (slides | Video) – Katie Park
• Practical Calculus – Steven Rich
• Detecting What Isn’t There – Sisi Wei
• The whole internet in 5 minutes! (Slides | GitHub repo | Video) – Jeremy Bowers
• How to Raise an Army – Tyler Fisher
• You Must Learn (Slides | Video) – Ben Welsh


Work Samples


• Planet Money Makes a T-Shirt (NPR)
The GitHub repo for Planet Money’s T-Shirt Project (NPR)
• BBC News Interactives & Graphics
Visualizing Buffy (data visualization, made with d3.js)
Timeline: Shots fired at LAX Terminal 3 checkpoint (KPCC)
Timeline: The search for Christopher Dorner (KPCC)
Fire Tracker (KPCC)
Confira a evolução da população do mundo desde 1950 (Epoca)
50 Years of Change tracking LGBT civil rights (University of Wisconsin-Madison cartography, multiple representations of the same dataset for clear explanation, recommended by Alberto Cairo)
HealthCare.gov Explorer (WSJ)
Russia’s Dubious Vote (WSJ – histograms example)
Portraits of the Hundreds of Children Killed by Guns Since Newtown (Mother Jones)
Playgrounds for Everyone (NPR)
Behind the Bloodshed: The Untold Story of America’s Mass Killings (USA Today)
A Special Report on the Rise of Mass Shootings in America (Mother Jones)
Secrecy 101 (The Columbus Dispatch)
Washington: A World Apart (The Washington Post)
NHS Winter Accident & Emergency tracker (BBC News)
The Child Exchange: Inside America’s underground market for adopted children (Reuters Investigates)
Chicago Under the Gun (The Chicago Tribune)
Deadly Delays (The Milwaukee Journal-Sentinel)
Twisters: Road to Larissa (Adam Pearce)
News Nerd First Projects – “It’s okay. We all sucked once.”

CAR 2013 Conference logo
NICAR13 brings together some of the sharpest minds and most experienced hands in investigative journalism. Over four days, people share, discuss and teach techniques for hunting leads, gathering data, and presenting stories. Of all the conferences I go to, this one gets the highest marks from attendees for intensive, immediately applicable learning, networking and fun.

No one could possibly absorb and remember everything presented, so below is your memory card. If you’re looking for highlights from this list, read my NICAR13 roundup for Nieman Lab, “Data science, commoditized backends, and the need to know code.”

Have links from sessions you attended? Post them in comments or ping me on Twitter @MacDiva and I’ll add them to this list.

If you’re looking for a job, IRE keeps a list of open positions. Here’s who’s hiring.

NICAR 2014 will be in Baltimore from Feb. 27 to March 2. You should be there.

For additional tutorials, videos, presentations and tips see the lists from 2012 and 2011.

Jump to
Presentations & Tutorials | Software & Tools | References | Work Samples
 

Presentations & Tutorials


Dashboards for Reporting (from Aaron Bycoffe, Jacob Harris & Derek Willis)
Data Science for Nerdy Journalists (from Hadley Wickham)
  – Sisi Wei shares her class notes
Data Scraping with Google Docs (from Sean Sposito)
How to create an automatically updating Google spreadsheet (from Sharon Machlis)
Demystifying Web Scraping (from Sean Sposito & Acton Gorton)
Campaign Finance the Data Science Way (from Chase Davis)
Exploratory Data Analysis (from Chase Davis)
Hone your Google Fusion Tables training skills tutorial (from Sreeram Balakrishnan)
Data Mining Machine Learning (from Jeff Larson)
Practical Machine Learning (from Chase Davis & Jeff Larson)
Journalism, Branding & Social Media (from Mandy Jenkins) 
Social media search tips and tools (from Doug Haddix)
How the Los Angeles Times uses DocumentCloud (from Ben Welsh)
Using Excel for Data Analysis (from Krista Kjellman Schmidt)
Excel I: Sorting and filtering (from Linda Johnson)
Excel II: Rates and Ratios (from Denise Malan)
Excel Magic: Advanced functions for data cleaning and more | Excel data (from MaryJo Webster)
Make Your First News App with Django
Data on the Fly (from John Keefe & Mark Wert)
Digging Deep with Data Journalism (from Jill Riepenhoff)
Information Design & Crossing the Digital Divide (from Helene Sears)
Dataviz on a shoestring (from Sharon Machlis)
Introduction to Ruby (from Al Shaw)
The Data Driven Story: Conceiving & Launching (from Jennifer LaFleur & David Donald)
Dataviz, Responsive Web Design + Mobile: Friends or Frenemies? (from Miranda Mulligan & Pete Karl II)
• Quick steps to mastering SQL through SQLite (from Troy Thibodeaux)
  – Emma Carew Grovum shares her notes from the tutorial
Reporting without revealing: Tools for hiding your tracks (from Paula Lavigne)
Covert reporting using technology to cover your tracks (from Mike Tigas)
Learning Python for journalists (from Jeremy Bowers & Serdar Tumgoren)
  – Ask to join the Google group
Fun with data in sports journalism (from Jack Gillum)
After the game: Top data ideas for investigating $port through $pending (from Paula Lavigne)
Is 911 a Joke in Your Town? (from Ben Welsh)
• Sample code for Introduction to JavaScript the Right Way (from Jeff Larson)
Food waste investigations (from Erin Jordan)
Government waste investigations (from Tim Eberly)
Investigating government waste (from Josh Sweigart)
OpenRefine (formerly Google Refine) slides and cheat sheet (from Tom Meagher)
How can we get the widest impact out of software projects? (from Rich Gordon)
How to be ready for your social media Sandy (on discovery, validation and publication) (from Steve Myers)
Github repo and example code from Developing reusable visualization components using D3 and Backbone.js (from Alastair Dant)
Code for drought maps & Data & code .zip file (from Amanda Cox)
Web scraping with Node.js (from Al Shaw)
• Zip file for Python workshops 1 & 3 | Github repo (from Ron Campbell)
• Tip sheet for Python workshop 2, plus dataset for the workshop (from Christopher Schnaars)
• Mike Ball shares his notes from Tasneem Raja’s Smarter interactive Web projects with Google Spreadsheets and Tabletop.js talk
Data Roadmaps: Priming your desktop with certain data slices helps you spot trends, find people and understand your city (from T.L. Langford)
Making Health Data Sexy (from Charles Ornstein)
Infect the CMS (from Heather Billings, Jacob Harris and Al Shaw)
Making interactives fun | List of interactives shown during the talk (from Tasneem Raja and Sisi Wei)
Covering public pensions (from MaryJo Webster)
• Learn to use Git and Github and fork this cheat sheet (from Tom Meagher)
Making Timelines (from Krista Kjellman Schmidt and Lena Groeger)
Inside baseball: What data journalism can learn from sports (from Jeremy Bowers, Ryan Pitts and Matt Waite)
Disasters: Preparing for and digging in after the storm (from Ben Poston)
5 data journalism projects you might not have seen before and why they matter in Europe (from Sebastian Mondial)
The One-Query Story (from Kate Martin)
Mapping Best Practices (from Dave Cole, John Keefe and Matt Stiles)
Web Scraping (and more) with Google Apps Script (from Steven Melendez)
NodeXL for Network Analysis (from Peter Aldhous)
Data-driven Beats (from Chris Amico)
Bringing Excel to the Web with SkyDrive (from Cathy Harley)
Navigating U.S. Census Data (from Erran F. Persley)
How to Serve Mad Traffic, Part I (from Jeremy Bowers)
How to Serve Mad Traffic, Part II (from Jacqui Maher) 

Lightning Talks
5 Algorithms in 5 Minutes | Video (from Chase Davis)
Let’s make games for news | Video (from Sisi Wei)
Big datasets, small streams | Video (from Katie Park)
Z-Scores: How You Can Compare Apples With Oranges (downloads a PowerPoint file) | Video (from Robert Gebeloff)
Casino-Driven Design | Video (from Al Shaw)
Be your own Nate Silver | Video (from Jeff Larson)
ILENE, the polite coding language | Video (from Jennifer LaFleur and Jeff Larson)
Every State is Weird: A selection of election edge cases | Video (from Jacob Harris)
Dude Who Stole My Congressman? (Data in .xls | Visualization) (from Paul Parker)
• Code for the Arduino Baggage Handler | Video (from Matt Waite)
• “Django Retrained: 5 ways coding like a web developer can make you a better investigative reporter” | Slides (from Ben Welsh)


Software & Tools


BatchGeo
ChangeDetection.com – monitor website changes
Citizen Quotes – A project to demonstrate maximum entropy models for extracting quotes from news articles in Python.
CometDocs converts PDFs to Word and Excel docs
Tabula for pulling data out of PDFs
• Tried and true XPDF (PDFtoText)
DocHive PDF to XML converter
Python wrapper for the Document Cloud API
DownThemAll Firefox plug-in for downloading website assets (photos, video, etc.)
• Embed Excel Interactive View into your site
Fast Cluster, a command line tool for grouping documents by similarity (from Jeff Larson)
FOIA Machine (automate your Freedom of Information requests)
Geofeedia search and monitor social media by location
iWitness from Adaptive Path – search social media content by time and place
OpenRefine (the open source repo of the data cleaning tool formerly known as Google Refine)
Overview Project | Read the getting started guide
Scrape screen scraper Chrome extension. Journalist Jens Finnäs wrote a tutorial for it on Dataists.
Time Flow by Martin Wattenberg & Fernanda Viegas
Stately – a symbol font to create a map of the U.S. using HTML & CSS
Weka 3: Data mining software in Java
Cascading Tree Sheets
Dataset (part of the Miso Project) – grabs data from Google Spreadsheets and helps visualize the data
Datawrapper (open source)
Google Chart Tools
Infogram
ManyEyes
Tabletop.js
Tableau Public (Windows only)
Mapbox and Tilemill
Statwing
Adobe Edge Animate free tool for creating interactive content
Spoofcard caller ID spoofing
Trap Call unblocks private numbers
Burner iPhone app creates disposable phone numbers
• Tools for hiding an IP address:
  – Anonymizer ($80)
  – Privoxy
  – BeHidden
  – Anonymous
  – IxQuick
Orbot provides Tor proxying on Android phones
Silent Circle encrypted communication app for iPhone and Android
Whois (search for domain name owners)
SpiderOak private, secure data stored in the cloud
Foller.me who to follow on social platforms
Twazzup.com
Ban.jo (mobile app)
Hachi social platform search tool
R Project for Statistical Computing
R Studio
• Learn to unlock government data with Sunlight Academy offered by the Sunlight Foundation
JS Console for debugging JavaScript
Programming Ruby 1.9 & 2.0 (4th edition): The Pragmatic Programmers’ Guide
• Production code for Overview Server, which does visual document mining
mitmproxy (“man in the middle” proxy) inspects and edits traffic flows on the fly. SSL compatible.
Python Social Auth social authentication/registration mechanism
XCode iPhone simulator
jQuery Vertical Timeline by MinnPost
Rubular regular expression editor for Ruby
UltraEdit text editor (Windows only)
• Tom MacWright’s Mistakes interactive JS editor
Sphinx open source search engine
• NPR’s App Template project template for client-side apps
ILENE the polite coding language (from Jeff Larson)
Django Bakery helps bake your Django site out as flat files
Invar generates map tiles from a Mapnik configuration
Table Capture Chrome extension grabs table HTML and drops it into a Google doc
TableTools2 Firefox extension allows you to copy and manipulate table data from the Web
Haystax point-and-click data collection
• Sisi Wei’s presentation framework
Bank Tracker contains data on every FDIC bank
Shpescape converts shape files to TopoJSON
Numeric.js JavaScript library for numerical calculations
Pixel Ping pixel tracker
Helium Scraper extracts website data into structured formats such as CSV and XML
Choose Your Own Adventure plug-in from Mother Jones
Timeline JS
• The WNYC interactive Bingo card generator
Proof Finder search email and other unstructured data (designed for lawyers and investigators)
Paper of the Congressional Record (requires a key from Sunlight Labs)
YUI, an open source JavaScript and CSS library for developing interactive applications
Tarbell Google docs-driven CMS from the Chicago Tribune apps team (currently in alpha)
• Chase Davis’s FEC Standardizer code and explainer
• Al Shaw’s Dirtyword Ruby script cleans HTML from Word docs.


References


• Jeff Larson recommends “Eloquent JavaScript” as the best book for learning JS
Mike Bostock’s d3.js tutorials (from Sharon Machlis)
Scott Murray’s d3.js tutorials (from Sharon Machlis)
How to select, create & remove elements in d3.js (from Jerome Cukier and Scott Murray)
Computational Journalism syllabus from Journalism and Media Studies Center at the University of Hong Kong, Spring 2013 (from Jonathan Stray)
Connected China from Fathom & Reuters (background)
  – Notes on Connected China by Chris Amico
How to Bulletproof Your Data (from Jennifer LaFleur, ProPublica)
Federal Reserve Economic Data (includes international data and an API; from Federal Reserve Bank of St. Louis)
Little Sis, a database of relationships between people in business and government
OpenMissouri a collection of state and local government data from Missouri, some of which isn’t ordinarily made available online
Privacy Rights Clearinghouse
• ProPublica’s News Apps Style Guide
TheyRule shows the relationships between people in corporations
• Hadley Wickham’s academic paper on tidy data
• Hadley Wickham’s guide to using regular expressions in R
• ProPublica News Apps Desk Coding Manifesto
• ProPublica’s Principles of News App Design Structure
Pretty Good Privacy (PGP) data encryption
Tor Project
OpenElections Project, certified historical election results for everyone
Open Innovation and open APIs in Digital Journalism (academic paper by Tanja Aitamurto and Seth C. Lewis)
• Chart of the differences between PHP, Python and Ruby
How to build a stepper visualization
How to install MySQL on Mac OS or Windows
R for Journalists
A journalists’ guide to verifying images
Finding the Wisdom in the Crowd (on verifying images found on social platforms)
How to visualize your backlinks with Google Fusion Tables (network visualization tutorial)
Design Patterns: Elements of Reusable Object-Oriented Software
Hospital Compare from Medicare.gov
• Winners of Kaggle’s campaign finance interactive reporting contest
Working with Tabletop.js and Handlebars.js
Impact of Responsive Designs
• Drew Conway’s Data Science Venn Diagram (now in d3.js!)
How to Not Screw Up Your Data
• Did you watch Ben Welsh’s lightning talk? Here’s the presentation he credits for changing his life: Writing reusable code by James Bennett, now at Mozilla. Read the revamped slides


Jump to
Presentations & Tutorials | Software & Tools | References | Work Samples
 

Work Samples


The Year in CAR presentation by Mark Horvit and Megan Luther, IRE
  (7.1 MB PDF)
The Year in CAR wrap by Ryan Graff, Knight Lab
The Evolution of Sandy’s Path (Weather.com)
Parallax Scrolling: James Bond (BBC)
How the Chicago Tribune News Apps team made the Chicago Crime site
Chinese Chemicals Flow Unchecked Onto World Drug Market (The New York Times)
Income Inequality in America (Reuters)
Australians who don’t pay tax: what would Romney say? (Financial Review)
Mid-Year Economic and Fiscal Outlook (Financial Review)
Workout at Work (Washington Post)
Ad Libs (PBS Newshour)
Could you be an Olympic medalist (from The Guardian)
Fake medical providers slip through Medicare loophole (Atlanta Journal-Constitution)
Medicare fraudsters used UPS boxes to fleece millions from taxpayers (Dayton Daily News)
The Killing Roads – 10 years of traffic accidents in Norway (bt.no)

Jump to Tutorials | Software & Tools | References | Work Samples

Asking a question at IRE Las Vegas (2010), photo by Ben Welsh
Can you believe it? The annual Computer Assisted Reporting conference (also known as NICAR) is about three short weeks away.

Of all the events I’ve been to, this is the one I get the most out of. All of the sessions are meant to teach you skills you can apply immediately and reveal deep insights that will help you grow as a journalist.

As in years past, I’ll be collecting links to the tutorials, presentations, slide decks and video from NICAR13 and posting them here. In preparation — especially for new attendees — here’s some stuff you should know:

  • There will be 5-minute lightning talks. You could give one. In fact, IRE is taking talk proposals and votes right now. The most popular talks will be presented on Friday, March 1, at 4 p.m.
  • If you want one-on-one mentoring at the conference, sign up by Feb. 7. Organizers will then pair mentees up with mentors. Mentees: Bring work samples and story ideas. Mentorship slots fill up quickly, so apply today.
  • If you’re taking any hands-on training sessions or Hadley Wickham’s data science masterclass, you might receive emails insisting you install a bunch of software before you arrive. Take the instructions seriously. Do not wait until the last minute or you will be very sad and very, very lost during class.
  • Esri is offering a free ArcGIS for Desktop license (worth $1,500) if you attend all four of their 50-minute training and demo sessions. If you’re doing a lot of cartography and GIS work, you might want to consider it.
  • There’s Q&A after almost every session, and there’s always a pause before someone speaks up. So prepare a question (and please, not one of the “see how I’m smarter than you?” variety) and use your first-mover advantage.

NICAR is really friendly. If you’ve got a question or you have a reporting problem you’re trying to solve, just ask someone for help.

And if you want to be really prepared, Chris Fralic of First Round Capital has great advice on how to work a conference.

(Photo from IRE 2010 by Ben Welsh/Flickr)

Ben Welsh of the Los Angeles Times Data Desk spoke at the International Symposium on Online Journalism in Austin yesterday, around the same time that I was speaking on a panel about data journalism with Erik Hinton (@erikhinton), Al Shaw (@A_L) and Andrei Scheinkman (@acheink) at NYU Local Young Media Weekend.

Ben gave this talk at NICAR in St. Louis earlier this year. Lucky for us, ISOJ streamed it, and La Nacion’s data team captured it.

Watch, learn, and dig deeper in Ben’s Delicious stack. Ben also writes terrific material on his site, Palewire, and tweets at @palewire.

One of the most popular posts on Ricochet was the collection of dataviz tools, slides and links from last year’s NICAR conference.

It was so popular, in fact, that people have asked me to make a similar collection again. So from Feb. 23–26, I’ll be updating this post with all the great things NICARians have to share this year.

Follow #NICAR12 on Twitter for the buzz; come to this page for the goods. And if you’re attending the conference, be sure to buy a T-shirt to support IRE, the organization that puts this fantastic event together. Ben Welsh of The Los Angeles Times is taking candid photos and posting them on Flickr.

Have links from sessions you attended? Post them in comments or ping me on Twitter @MacDiva and I’ll add them to this list.

Jump to Presentations & Tutorials | Software & Tools | References | Work Samples
 

Presentations & Tutorials


Bringing Maps to Fruition (from Michelle Minkoff)
Free tools for scraping data without programming (from Chris Keller and Michelle Minkoff)
Instructions for Hands-on Web Scraping Without Programming (from Chris Keller and Michelle Minkoff)
Locating the Story: The Latest in Online Maps and mapping links (from Ben Welsh)
Mapping links & presentation (from David Herzog)
Social Media Sleuthing (from Doug Haddix)
freeDive Tips & Tricks (from the Knight Digital Media Center)
CAR on a Shoestring (from Kevin Crowe, Patrick Sweet and MaryJo Webster)
Regular Expressions: An Introduction (from Kevin Crowe, Patrick Sweet and MaryJo Webster)
Create a moderation form using Google Forms and Fusion Tables
Scraping with Django (from Kevin Schaul)
How to turn PDFs into a searchable, sortable table (from Kevin Schaul)
Get the Most Out of Fusion Tables (from Rebecca Shapley)
Data viz in 20 minutes: jQuery DataTables (from Christopher Schnaars)
How to set up Python in Windows 7 (from Anthony DeBarros)
Data visualization best practices (from Kat Downs)
NodeXL for Network analysis (from Peter Aldhous)
Network Analysis for News (from Peter Aldhous and Peggy Heinkel-Wolfe)
Network analysis for news (video of Peter Aldhous’s NICAR12 talk)
How to Use Google Refine for Investigative Journalism (from Dan Nguyen)
Mapping is for Everyone – How to make all kinds of maps (from Sharon Machlis)
Advanced Excel techniques tipsheet (from MaryJo Webster)
How do you edit a story made of software? (from Alexander Howard)
Election Night Results & Maps (from John Keefe)
Covering Elections presentation (from Al Shaw)
Making friends with map projections (from Ben Welsh and Michael Corey)
Database validation (from JT Johnson)
Web scraping with Node.js (from Al Shaw)
Who is John Doe — and where to get the paper on him
Practical TastyPie for the Modern Djangonaut (from Jeremy Bowers)
Weathering the Storm: Using data to bolster the traditional weather story (from Stephen Stirling)
Build your first Django news app (from the IRE NICAR12 Django workshop)
GeoCommons walkthrough (from Paul Monies)
QGIS 1 workshop tutorial (from Michael Corey)
Tell Me a Story! – storytelling and data journalism (from Anthony DeBarros)
Human-assisted reporting: How to create robot reporters in your own image (from Ben Welsh)
How I learned to stop worrying and love flat files (from Ben Welsh)
Infect the CMS (from Jacob Harris)
Inspect the Web With Your Browser’s Web Inspector (from Dan Nguyen)
An Intro to R (from Jacob Fenton)
Slides from “Mapping is Hard” (from Brian Boyer)
TileMill hands-on tutorial (from Chris Amico, Brian Boyer and Matt Stiles)
Own Your Map Stack (from Chris Amico, Brian Boyer and Matt Stiles)
Natural Language Toolkit (NLTK) basics (from Jacob Perkins)
Connecting to state data using OpenMissouri.org (from David Herzog)
How to convert PDFs to Excel in Windows (from IRE)
Quantum GIS (QGIS) 2 workshop (from Michael Corey)
How to turn PDFs into text (from Dan Nguyen)
Web scraping in Python workshop tutorial (from Mark Ng)
Infiltrate the Ad Department (from Ryan Pitts)
Map Graphics for Video (from Michael Corey)
What We Can Find Out from Elections (from Aaron Bycoffe)
The Latest in Mapping with Javascript and jQuery (from Timothy Barmann)
How to Make a PANDA (from Brian Boyer)
The Farenthold Surprise (election panel presentation from Derek Willis)
Displaying data geographically: Creating a one-layer map in ArcMap (from Tom Meagher)
An intro to csvKit (from Christopher Groskopf and Anthony DeBarros)
Integrating CAR into a daily Beat (from Kate Martin)
How to use the SIMILE Exhibit timeline framework (from David Karger)
Tableau training handouts (from Tableau)
CAR Training 2012 including mapping data sets, practice data sets and tip sheets (from Jennifer LaFleur)


Jump to Presentations & Tutorials | Software & Tools | References | Work Samples
 

Software & Tools


Twazzup – find breaking news, popular hashtags, influential users
Reporters’ Lab Reviews – a link list of tools, techniques and research for public affairs reporting
Twellow – a yellow pages for Twitter
Twiangulate – find sources and groups of people on Twitter
Crowdbooster – monitor and analyze buzz on social media sites
KnowEm Username Search – finds the social networks a person or organization/brand is using
Muckrack Pro – add yourself to the list of journalists or find journalists covering a particular topic
The Archivist – save tweets and export to Excel to analyze later
PowerPivot for Excel – “Load massive amounts of data from virtually any source, process in seconds and model with powerful analytical capabilities”
Pandoc – a universal document converter
HTML-to-PDF – converts HTML to PDF docs for free
Mr. Data Converter – converts Excel data into one of several Web-friendly formats, including HTML, JSON and XML.
Natural Language Toolkit – for natural language text analysis
Voyant Tools – Web-based document analysis
ClearForest Gnosis – Firefox plugin that uses OpenCalais for data extraction
Exhibit – a publishing framework for data-rich interactive web pages
DocumentCloud – store, analyze and annotate PDFs
DataTables – jQuery plugin to create sortable datasets
Ben Welsh’s triumvirate of tools that allow you to copy Google Maps’ functionality:
   – a data source, like OpenStreetMap
   – a tile set, like what you can make with TileMill
   – a JavaScript interface, like Leaflet
OpenOffice – open source office suite software (word processor, spreadsheet, presentation/slide deck, database)
QGIS – Open source geographic information system
Shape to Fusion (a.k.a. Shpescape) – Import shapefiles to Fusion Tables
MySQL – Database software
Google Refine – data cleaner
Junar – Discover and track data
The Overview Project
Vischeck – ensures your graphics are visible to the colorblind
Colorbrewer – in case you need help with color schemes for your design
Color Oracle – colorblindness simulator for Mac OS, Windows and Linux
0 to 255 – find variations of any color
Beautiful Soup – useful for many things, including parsing HTML
Weave – Web-based analysis and visualization environment. Made by a partnership between the University of Massachusetts Lowell and Open Indicators Consortium
Highcharts – create interactive JavaScript charts (free for non-commercial use)
Indiemapper – Upload shapefiles and convert them to create static, thematic maps
CSV-to-JSON converter
Sinatra – a lightweight Ruby framework for creating apps
• Use Google Docs, XPath and the =importxml() function to put data in a spreadsheet
PANDA Project
Timemap syncs a SIMILE timeline to a web-based map
Tabletop – allows you to use Google spreadsheets as your app backend
Js2Coffee – converts JavaScript to CoffeeScript and back
CoffeeScript sandbox
iPL2 – ask a librarian, search through the Internet Public Library (IPL) and the Librarians’ Internet Index (LII) websites.
• “Lesson of the night: Want to put census geos in fusion tables? Keep it stupid simple: convert US Census data from TIGER into shape files with shpescape” — tip from Matt Kiefer
Rubular – a Ruby regular expression editor
Timeline Setter – makes timelines from spreadsheets
Spoofcard changes your voice and gives you a temporary phone number
Tablechart turns HTML tables into charts
Spam Mimic – hide a message in spam
FEC scraper/FEC parser – Chris Schnaars’ script on Github

Jump to Presentations & Tutorials | Software & Tools | References | Work Samples
 

References


• The American Library Association’s wiki of government databases (from Dan Nguyen)
Penn Treebank Project reference – Use it in conjunction with the Natural Language Toolkit (NLTK)
Geomedia Google Group
NICAR-L mailing list
Google Public Data Explorer
InfoVis Wiki – a catchall list of papers, conferences, patterns and jobs in information visualization
Spatial Reference – an IMDB-like catalog of spatial reference systems
22 free visualization tools collected by ComputerWorld
Free Data Visualization tools – a collection from Sharon Machlis
8 cool tools for data analysis, visualization and presentation (from Sharon Machlis)
Chart and image gallery: 30 free tools for data visualization and analysis (from Sharon Machlis)
LocalHealthData.org – find health data from more than 70 sources and 300+ datasets
Analytic Journalism “It’s not ‘all about story’ if you don’t have anything to say.”
How to install MySQL and Navicat on Windows
Freebase – an entity graph/Wikipedia-like collection of data
Save the Post Office – records U.S. post office consolidations and closures
• Los Angeles Times datadesk Github repository with code for you to use
USASpending.gov – Official record of Federal Funding Accountability and Transparency Act (Transparency Act)
• Data for the Public Good by Alexander Howard (free eBook)
CongressionalPrimaries.org shows what Illinois congressional candidates are tweeting about
Civic Commons Marketplace collects open government efforts in the U.S.
OpenCorporates is in the process of collecting information on every corporate entity in the world
• USA Today’s Developer Network

Jump to Presentations & Tutorials | Software & Tools | References | Work Samples
 

Work Samples


Bailed out banks profit from tax liens (Arizona Star heat maps showed property locations, making the story very clear)
Race gap found in traffic stops (Milwaukee Journal-Sentinel showed the racial disparity in pullovers and on further examination, municipal maintenance requests)
Texas redistricting map and slider code (Texas Tribune)
The Poverty Gap shows a clear correlation between poverty and access to education (ProPublica)
2012 Election Results big board, one approach to visual presentation of election info that tells you the story of the election immediately (The New York Times)
Little Loving County grabs a bit of Texas’ growth – a census story unlike the usual census stories (The Dallas Morning News)
Riot rumours: how misinformation spread on Twitter during a time of crisis – uses data analysis to watch the spread and suppression of rumors about the London riots (The Guardian)
Discover Boston Public Schools (Code for America)
SchoolBook makes teacher data reports for New York City schools
Redistricting: New lines leave some voters without a senator (The [Riverside, Calif.] Press-Enterprise)

Jump to Tutorials | Software & Tools | References | Work Samples

And finally, no journalism nerdfest would be complete without a demonstration of the latest hotness: Drone journalism by Matt Waite.

Drone Journalism Demo – Matt Waite from John Keefe on Vimeo.

CAR 2011 was stuffed full of information, so much so that the only way to keep up with everything has been to keep a log of what people have been sharing.

Feb. 28 update: Thanks to everyone who’s forwarded additional links and presentations (I’m marking them with NEW as they’re added) and to all who’ve sent me nice notes about this list.

Philip Smith forwarded a JSON file of NICAR tweets with links in them. Want it? Download it.

This year’s conference looks to have been a tremendous success, bringing in the most registered attendees in nearly a decade. Congratulations to NICAR for a terrific, educational and inspiring event.

A more narrative look at what happened at the conference can be found on the conference blog. But if you’re anxious to dive in, this is your buffet: Prepare to have your mind blown.

Got links from sessions you attended? Post them in comments or ping me on Twitter @MacDiva and I’ll add them to this list.

Jump to Tutorials | Software & Tools | References | Work Samples
 

Presentations & Tutorials

NEW Using TwitInfo and TweeQL to find and tell stories (from Adam Marcus)

NEW A Gentle Introduction to SQL using SQLite: slides, full tutorial and steps only (from Troy Thibodeaux)

Valet Parking Your Django App (from Jeremy Bowers)

Similarity algorithms using Python (from Luke Rosiak)

The Quick and Dirty Varnish Setup for Django (from Andy Boyle)

Making HTML Tables Interactive (from Michelle Minkoff)

QuantumGIS 1 tutorial and files (16.3MB) (from Timothy Barmann)
• Use JavaScript and jQuery to create interactive maps: tutorial and files (17.5 MB) (from Timothy Barmann)
• How to break news online – and use LA Times app engine tools (from Ben Welsh)

NodeXL for Social Network Analysis (from Peter Aldhous)

• Excel, CAR and mapping training tipsheets, slides and datasets (from Jennifer LaFleur)

My Favorite (Excel) Things (from MaryJo Webster)

Latest in Mapping tools and examples

(Ruby) Coding for Absolute Beginners (from Dan Nguyen) – following the tutorial will produce “My Very First Web Page”

Google Refine tutorial and datasets (that download to your hard drive on click) (from David Huynh)

APIs: Making the Web a Data Medium (from Anthony DeBarros and Derek Willis)

NEW R for Statistics: First Steps (from Peter Aldhous)

NEW R for Statistics: Automate Your Analysis (from Jacob Fenton)

Hands-on R, a step-by-step tutorial (from Jacob Fenton)

Ruby4Kids — mentioned in passing as a low-friction way to learn the basics of Ruby

How to make an intensity map with custom boundaries using Google Fusion Tables

Google Fusion Tables tutorial

Cracking Open Electric Records slides and case law (link launches a PDF bundle) (from DB Smallman)

• Internet Reporting: What You Should Know (from Jack Gillum)

• Free software: From Spreadsheets to GIS, Part 1 and Part 2 (from Jacob Fenton and Anthony DeBarros)

Beyond Mapping: Spatial Analysis on the Cheap (from Long Creative)

Beautiful Data (from Aron Pilhofer)

• Intro to Python, Session 1 tutorial and Python tipsheet (from Jacqueline Kazil and Serdar Tumgoren)

Getting into a data-oriented mindset (from MaryJo Webster and Wendell Cochran)

Dataviz for beginners (from Matt Stiles and Sanjay Bhatt)

MGRS Explained (from Jacob Harris)

Data Visualization with JavaScript and HTML5 (from Jeff Larson)

Tutorial: Census Data with Tableau Public

PostGIS is Your New Bicycle – be wowed by a free alternative to costly desktop GIS (from Mike Corey and Ben Welsh)


 

Software & Tools

Jump to Tutorials | Software & Tools | References | Work Samples

3Scale – API management and monetization tool (free trial)
API Playground – try APIs, no coding skills necessary
Backbone.js adds a models-collections-views structure to JavaScript applications
BatchGeo interactive map maker
Biznar.com – business search engine
CanIUse.com browser compatibility tables
Census Block Conversions API
ChangeTracker from ProPublica – track changes to any website
ChinaVitae – learn who’s who in power in China
CollegeInsight – compare universities by cost, financial aid, diversity, job placement rate
DataWrangler cleans and transforms data
• Download manager DownThemAll is a Firefox extension that grabs webpage links and images.
Europe Media Monitor’s NewsBrief – an international alternative to Google News
EUROCONTROL – “find blocked private planes that might have flown to Europe, for example: see which executives are going to Cannes”
FCC Census Block Conversions API – boundary service API, excellent for mapping
• The FireShot Firefox extension creates browser screenshots, adds annotation and more.
Foreign Labor Certification Data Center – find what visas a company has applied for (there may be wage information tied to the application)
Get Lat Lon – finds latitude and longitude for any location worldwide
• Free Google Drawings wireframe templates
Google Fusion Tables for data analysis and visualization
Google Refine for data cleaning
Inmarsat Ships Directory – look up a ship’s phone number
JSFiddle online JavaScript editor
Jigsaw: “Visual analytics for exploring and understanding document collections”
Little Sis – visualizing the networks of social, financial and political power
MarineTraffic.com – track vessels in real time
Mayan open source, Django-based document manager
Mr. Data Converter converts Excel data into web-friendly formats
Needlebase
NETROnline – public records search, especially good for real property lookups
NodeXL uses Excel for network analysis
NodeXL Teaching lessons and tutorials
Numberway.com – look up phone numbers around the world
Outwit Hub – Firefox plugin for scraping websites
PDFonFly – converts web pages to PDFs
PhraseNet diagrams relationships between words in text
PostGIS – adds mapping ability to PostgreSQL
PrivacyChoice – rates website privacy policies
Protovis
PySAL an open source Python library for spatial analysis functions
R statistical analysis software
R libraries recommended by Amanda Cox, Jeff Larson and others: ggplot, RColorBrewer (color picker), rgdal (bindings for GDAL – the Geospatial Data Abstraction Library), survival (survival analysis)
Recorded Future – temporal analysis search engine uses predictive analytics to discover the likelihood of events in the future
RSRuby – use the R environment in your Ruby program
Rubular – test your regex on the fly
Simile Timeline
Scraper Wiki
Snitch.Name – people lookup
Tableau Public
TimeFlow
TinEye finds information on uploaded images, including usage, higher resolutions, modified versions
TweeQL – access the Twitter API by using SQL syntax (requires Python)
TwitInfo – chart Twitter keyword frequency and sentiment
USA Spending – see what the US government is spending money on
 

References

Jump to Tutorials | Software & Tools | References | Work Samples

NEW Journalists learning Python Google group
NICAR ‘Net Tour – an index of links from IRE for watchdog research and learning computer-assisted reporting
The New Precision Journalism (from Philip Meyer)
The Logic Of Causal Order by James A. Davis (recommended by Philip Meyer)
• US Government Health Data
Health Indicators Warehouse
Coordinate Systems Overview for mapping
Concepts of Probability (statistics!)
Advanced Probability and Statistics, 2nd Ed. by the CK-12 Foundation
Thomas Lumley: work page (statistics! and Amanda Cox’s professor)
Hadley Wickham (statistics! and the maker of ggplot for R)
Graphical Inference for Infovis by Hadley Wickham, Dianne Cook, Heike Hofmann and Andreas Buja (“How do we know if what we see (in a data visualization) is really there?”)
“Be Careful What You Do With That Cell Phone Recording; It Could Land You in Jail” (from DB Smallman)
Gary’s Social Media Count – see the volume of social media activity
Quantitative Discovery from Qualitative Information: A General-Purpose Document Clustering Methodology by Gary King
Producing Online News: Digital Skills, Stronger Stories by Ryan Thornburg
US State Department Foreign Affairs Manual, section on information security, a.k.a. 12 FAM 500
Five Databases in 50 Minutes: Government Session (from the CAR2011 conference blog)
News Apps: What Works and Why (from the CAR2011 conference blog)
Analysis-ready census data (from USA Today, available to NICAR members only)
A directory of statistics bureaus by country (from Statistics Sweden)
Data Visualization for Beginners (from the CAR2011 conference blog)
Tracking the Economy and Business (from the CAR2011 conference blog)
Benford’s Law (statistics!)

 

Work Samples

Jump to Tutorials | Software & Tools | References | Work Samples

• The Wall Street Journal investigative report, “Confidentiality Cloaks Medicare Abuse” with database created by Mo Tamman
• The Center for Public Integrity investigative report, “Unproven for Older Women, Digital Mammography Saps Medicare Dollars”
The Year in CAR
• Des Moines Register potholes map
FlyOnTime.us – find the most on-time flights between cities (uses US government raw data)
Employment Market Explorer – find out what the local employment market looks like. Compare local, regional and national rates and labor market dynamics. (uses US government raw data)
WildTrack – using data to monitor endangered species populations
• Roundup of state-based 2010 census stories
The Killing Roads – interactive map of highway accidents in Norway
• The entire King James Bible as a word tree
Who Runs HK – network graph of the people in power in Hong Kong
Research by Martin Wattenberg, including the highlighted works, Name Voyager, Map of the Markets, Shape of Song and Fleshmap

Jump to Tutorials | Software & Tools | References | Work Samples

It’s true what they say: You might graduate from school, but you never stop learning.

Tomorrow is the start of the annual CAR Conference, where “computer-assisted reporters” (affectionately referred to as data nerds, journocoders, and “those spreadsheet geeks over there”) get together for deep education. As one attendee puts it, it’s where journalists learn and demonstrate how to do things. And that’s pretty great.

I’ll be in town to attend the NewsCamp data visualization workshop, where luminaries like Amanda Cox, Daniel Lathrop and Martin Wattenberg will teach a gamut of dataviz skills. The unofficial attendee list looks pretty spectacular too.

If you’re attending and we haven’t met (or seen each other in a while), say hi. If you can’t make it, Computerworld’s online managing editor Sharon Machlis will be collecting notable info in the window below. You can also follow along via Twitter by searching on “NICAR.”