The Reporter's Lab tool review site will fill a hole in the journalism ecosystem by testing open source and commercial reporting and Web tools against real-world tasks and documents. For example:
  • Test scraping tools against both simple sites and sites that rely heavily on JavaScript or require logins.
  • Test document analysis software against documents held in many different forms, such as single PDFs of e-mail collections and harvested web pages.
  • Evaluate speech and video recognition tools.
  • Quickly show reporters what small, free tools for working with PDFs and other documents can do for them.

Separately, we'll create a task library that includes each iteration of the sample documents and a detailed description of the tasks we want to accomplish. The task library would ideally be linked into the database of reviews, so that others can reuse the tasks and documents in their own tests and reporters can see exactly what each review was run against.
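To make that linkage concrete, here is a minimal sketch in Python of what a task-library entry might look like. The class and field names are illustrative assumptions, not a settled design; the only point is that a review record can point back to a task by its ID.

    from dataclasses import dataclass, field

    @dataclass
    class TaskEntry:
        """One entry in the task library (field names are illustrative assumptions)."""
        task_id: str        # stable ID that review records can point back to
        description: str    # detailed description of the task we want to accomplish
        document_versions: list[str] = field(default_factory=list)  # each iteration of the sample documents (paths or URLs)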

This is the lab's first project for several reasons, but the most important is that it will show us where the holes are in the world of free and open source tools, helping drive our work in the future.

Some of the elements that would be useful to browse and search (see the sketch after this list) are:
  • Cost / licensing
  • Tested on which tasks / documents / data? (See related project, Task library)
  • Installation required
  • Programming required (none -> substantial)
  • Results for each test
  • Narrative for the blog post
  • Advice to people who want to try it
  • Type of task (e.g., text analysis, audio recordings, scraping, visualization)
  • Rating?
  • Who did the testing
  • Date(s) reviewed
  • What version tested
  • Operating system and other details of the test environment
  • Stars or ratings for usability, accuracy, difficulty?
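As a minimal sketch of how those elements might hang together in a single review record, here is one possible Python shape; every field name below is an assumption made for illustration, not a committed schema.

    from dataclasses import dataclass, field
    from datetime import date

    @dataclass
    class ToolReview:
        """One review record built from the elements listed above (illustrative only)."""
        tool_name: str
        cost_licensing: str           # cost / licensing
        task_ids: list[str]           # which task-library entries it was tested on
        installation_required: bool
        programming_required: str     # "none" through "substantial"
        results: dict[str, str]       # task_id -> result summary for each test
        narrative: str                # narrative for the blog post
        advice: str                   # advice for people who want to try it
        task_type: str                # text analysis, audio recordings, scraping, visualization, ...
        reviewer: str                 # who did the testing
        dates_reviewed: list[date]    # date(s) reviewed
        version_tested: str
        environment: str              # operating system and other environment details
        ratings: dict[str, int] = field(default_factory=dict)  # usability / accuracy / difficulty stars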

It would be really useful to have a section of the database to track the reviews and reviewers. I already know of probably two dozen tools that I'd like to test, and it would be helpful if I could enter them and assign them out to reviewers as I'm able. Parts of this section might not be published, but it could include (see the sketch after this list):
  • Tool name
  • Source / how to get it
  • What it claims to do
  • Who is reviewing it
  • When assigned
  • When due
  • When completed
  • What it's good for (what tasks we'll assign it)
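A matching sketch for this internal tracking side, again in Python with assumed field names; a simple published flag is just one possible way to keep parts of the record out of the public site.

    from dataclasses import dataclass, field
    from datetime import date
    from typing import Optional

    @dataclass
    class ReviewAssignment:
        """Internal record for assigning a tool to a reviewer (illustrative only)."""
        tool_name: str
        source: str                       # source / how to get it
        claims: str                       # what it claims to do
        reviewer: str                     # who is reviewing it
        assigned: date                    # when assigned
        due: date                         # when due
        completed: Optional[date] = None  # filled in when the review is finished
        intended_tasks: list[str] = field(default_factory=list)  # what tasks we'll assign it
        published: bool = False           # parts of this section might not be published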