Using public records data for reporting isn’t new. Neither is using computers to pore through information to find patterns. But as news organizations look for more ways to offer public records to readers, new takes on old standbys keep popping up.

The pet names database, a favorite of newspaper and TV websites, got a fresh twist from the Los Angeles Times this week by offering more than the standard search-and-list of names, breeds and locations. On it, you’ll find collections of interesting dog names as chosen by the staff, a common names tag cloud, maps of the breeds and names by ZIP code, a tie-in with the reader photos page, and space for user comments.

LA Times dog names database

Times database developer Ben Welsh says the project was a way for him to learn how to navigate through Los Angeles’ complex bureaucracy.

Welsh moved to L.A. from Washington, D.C., several months ago. “When I got here, I knew that learning how many cities make up L.A. County and how the different services get managed was going to be something I needed to get skilled at, so I thought: I need kind of a test case,” he says. The dog names database became his experiment.

The first step was to figure out which offices held the records, then to request the information in accordance with the California Public Records Act. To avoid being turned down for privacy concerns, “I made sure in my earliest communications with people, kind of the first round, to say I don’t want the address of the owner, but I do want their ZIP code,” Welsh says.

Data from each agency was merged into a single file, then the development stage began. Welsh built the database on Django, an open-source development tool created at the Lawrence Journal-World and based on the Python programming language.
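As a rough illustration of that merge step, here is a minimal Python sketch. It is not the Times' code: the common field names, the agency labels and the column mappings are all hypothetical, and it assumes each agency delivered a CSV file with its own header names.

```python
import csv
import io

# Hypothetical common schema. The article only says that names, breeds
# and ZIP codes (not street addresses) were collected.
FIELDS = ["name", "breed", "zip_code"]


def merge_agency_files(files, column_maps):
    """Combine several CSV exports into one list of rows with shared field names.

    files:       dict of agency label -> open text file (any iterable of CSV lines)
    column_maps: dict of agency label -> {agency column name: common field name}
    """
    merged = []
    for agency, handle in files.items():
        mapping = column_maps[agency]
        for row in csv.DictReader(handle):
            record = {field: "" for field in FIELDS}
            for source_column, field in mapping.items():
                record[field] = (row.get(source_column) or "").strip()
            # Normalize capitalization so "REX" and "rex" count as the same name.
            record["name"] = record["name"].title()
            merged.append(record)
    return merged


def write_merged(rows, handle):
    """Write the merged rows out as a single CSV file."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Each agency gets its own column mapping, so a new source can be added without touching the merge logic.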

Though Welsh says he tries not to advocate one framework or language over another, he personally prefers Django for two reasons: he knows Python, and Django instantly produces a form that allows anyone, not just people with programming skills, to enter data.

That said, the pets database is only the second Times project to be developed on Django. Other programmers at the news company have used Ruby on Rails for site sections including the photo-driven Hollywood Backlot and the L.A. listings and review section, The Guide.

“It’s clear that the people who made (Django) worked in a newsroom,” Welsh says. In tight-deadline situations, having many people working on different aspects of a project at the same time is imperative. On the same day the backend database was created, reporters and researchers began entering data.

“And then simultaneously as they’re working on entry, the developer can also be working on building the public-facing site, which is where you want to invest your most resources, because that’s going to decide whether you sink or swim,” Welsh says.

Though Welsh couldn’t estimate how long the project took from the first public information request to official launch, he says he spent most of two to three weeks on development once the database became a top priority.

The project came together so quickly, in part, because it was based on a prior effort: a database of California soldiers killed in Iraq and Afghanistan that launched on Memorial Day weekend.

“We were able to save effort by borrowing a lot of the layout and stuff, but not everything, from the ‘War Dead’ design and they have a lot of similarities if you’ve used the two,” Welsh says. “And that was sort of an investment that paid off the second time around.”

A new feature on the dog names database is the list of similar names that appears on each name page. The list is created using the Soundex function in MySQL.

Soundex is a phonetic algorithm, first patented in 1918, that converts a word into a short code (a letter followed by three digits in the standard version) that can then be used to find similar-sounding words.

Welsh says he applied the Soundex function in “what’s called a custom manager in the Django code that I wrote that just has a SQL statement that passes in whatever that current name is in the URL into the database and finds names that have a similar Soundex score.”
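The lookup Welsh describes can be sketched in Python. This is an illustration under stated assumptions, not his code: the table name `dog` and the function names are hypothetical, an in-memory SQLite database stands in for MySQL, and the `soundex` function below is the standard four-character American Soundex, registered under the SQL name SOUNDEX so the query reads the way it might against MySQL. MySQL's built-in SOUNDEX() is a slightly different variant and can return longer codes.

```python
import sqlite3

# Standard Soundex digit groups; vowels, H, W and Y carry no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters that share a digit; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def build_demo_db(names):
    """Create a throwaway database with a hypothetical one-column dog table."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for MySQL's built-in SOUNDEX() function.
    conn.create_function("SOUNDEX", 1, soundex)
    conn.execute("CREATE TABLE dog (name TEXT)")
    conn.executemany("INSERT INTO dog (name) VALUES (?)", [(n,) for n in names])
    return conn


def similar_names(conn, current_name):
    """Find other names whose Soundex code matches the current name's code."""
    rows = conn.execute(
        "SELECT DISTINCT name FROM dog "
        "WHERE SOUNDEX(name) = SOUNDEX(?) AND name <> ? "
        "ORDER BY name",
        (current_name, current_name),
    )
    return [row[0] for row in rows]


conn = build_demo_db(["Bailey", "Baylee", "Bailee", "Max", "Rocky"])
print(similar_names(conn, "Bailey"))  # ['Bailee', 'Baylee']
```

In a Django project, a query like this would sit in a method on a custom manager, with the current name taken from the URL and passed in as the query parameter rather than pasted into the SQL string.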

Among the list of interesting dog names is “Pick of the Litter (Editor’s Choice).” In it, you’ll find Welsh’s selections for “the weirdest, funniest, best names in Los Angeles.”

They include Otis and Chandler (together, the name of a longtime L.A. Times publisher), Dr. Zaius, and several names that may be familiar to Django fans.

“It was also an opportunity to give a tongue-in-cheek shoutout to the Lawrence, Kansas, guys,” Welsh says.

The interesting dog names categories started as “just fiddling through the data and seeing the fun ones and wanting to share that with other people,” Welsh says.

“I think for us, also, there was a desire to find ways to package the information so that it would be useful or be topical for other bloggers on our site, where if we have a list of the presidential names, maybe Andy Malcolm would like to write about it at ‘Top of the Ticket,’ or if we have a list of superhero names, it might fit on our superhero blog — just kind of thinking what are the things that the paper covers and that people come to us for and can we find names in there that sort of line up with that.”

What began as an exercise in learning the L.A. County records system has become a way for Welsh to connect with readers. And he says reader comments, especially those left on the “California’s War Dead” database, have been the most rewarding and touching aspect of his work so far.

“The people, to whatever degree, trust the site, or they think it’s worthy of depositing information like that, which is very sensitive and very personal.

“Just the fact that someone felt comfortable enough to do that makes me feel like we must’ve done something right. I’m not exactly sure what, but something.”

“The pet name database is a staple of computer-assisted reporting.”

Derek Willis, Web developer, IRE member

Examples of online pet names databases abound.

Here are a few, listed Woody Allen-style. Feel free to add others in comments.