Created by Julian Todd, ScraperWiki is an online tool that allows programmers to scrape data from multiple sources and copy it into a database. It lets users link data that may be spread over various different formats such as spreadsheets, tables, and PDFs. This data can then be easily accessed by journalists and researchers.
Many interesting ideas emerged during yesterday's event, and the five groups tackled a wide range of subjects, from environmental protection to eTenders. At the end of the day, each group gave a three-minute presentation outlining where they had sourced their data and the project they envisaged.
The winners on the day were the aptly named 'MonuMENTAL'. They looked at the issues of planning and preserving local heritage. They observed that the information available on heritage sites, planning applications, local monuments, etc. isn't co-ordinated. In an effort to resolve this, they scraped data from multiple sources and put it together on a mock site.
They said the main function of a site like this would be to enable people to find out about planning applications early on in the process, perhaps through email alerts, and also to build a database of places people consider valuable with the aim of preserving local heritage.
Congratulating the winners, Michael Stubbs of Dublin City Council highlighted the versatility of the project, saying, "You could use [a site like that] in the morning". The judges also praised the group's idea, saying it had the potential to create a new community around environmental protection and heritage sites.
The first group to present came up with the idea of a Twitter mood index aimed at gauging public mood on certain issues by searching for keywords. They also outlined how it might be possible to gauge mood in different regions of the country by linking in with the location listed in a user's profile.
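The group's implementation was not described in detail, but the general idea of a keyword-based mood index can be sketched as follows. This is a minimal illustration only: the word lists, the `tweet_score` and `mood_index` helpers, and the sample tweets are all hypothetical, and it assumes the tweets have already been collected as dictionaries with `text` and `location` fields rather than calling any Twitter API.

```python
from collections import defaultdict

# Hypothetical word lists; a real index would need a much larger, validated lexicon.
POSITIVE = {"good", "great", "happy", "hope"}
NEGATIVE = {"bad", "angry", "sad", "worried"}


def tweet_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def mood_index(tweets, keyword):
    """Average tweet score per profile location, for tweets mentioning keyword."""
    scores = defaultdict(list)
    for tweet in tweets:
        if keyword.lower() in tweet["text"].lower():
            scores[tweet["location"]].append(tweet_score(tweet["text"]))
    return {loc: sum(vals) / len(vals) for loc, vals in scores.items()}


# Made-up sample tweets, for illustration only.
sample = [
    {"text": "Great news on the budget!", "location": "Dublin"},
    {"text": "Worried and angry about the budget", "location": "Cork"},
    {"text": "The budget is bad, but there is hope", "location": "Dublin"},
    {"text": "Lovely weather today", "location": "Galway"},
]

print(mood_index(sample, "budget"))
```

Profile locations are free text, so a real version would probably need to normalise them (for example, mapping "Dublin 8" and "dublin" to the same region) before averaging.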
Road safety was also tackled at the event. Using data from the Road Safety Authority website, along with data from the website of An Garda Síochána, one group analysed the number of road deaths per county in conjunction with the number of speed cameras and penalty points issued.
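As a rough sketch of what combining such per-county tables might look like, the snippet below joins several datasets on county name. The `combine_by_county` helper is hypothetical, and the county names and figures are placeholders, not real statistics from either website.

```python
def combine_by_county(**tables):
    """Join several {county: value} tables, keeping counties present in all of them."""
    common = set.intersection(*(set(table) for table in tables.values()))
    return {
        county: {name: table[county] for name, table in tables.items()}
        for county in sorted(common)
    }


# Placeholder values for illustration only; these are not real figures.
road_deaths = {"County A": 12, "County B": 5, "County C": 7}
speed_cameras = {"County A": 4, "County B": 10}
penalty_points = {"County A": 3000, "County B": 1500}

combined = combine_by_county(
    deaths=road_deaths, cameras=speed_cameras, points=penalty_points
)
for county, row in combined.items():
    print(county, row)
```

Counties missing from any one source are dropped here, which keeps the comparison like-for-like; a fuller analysis might instead keep them and flag the gaps.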
A fourth group looked at information available on the Environmental Protection Agency (EPA) website. They focused on Integrated Pollution Prevention and Control (IPPC) licences, bringing together data from different sections of the site which should be linked but is presented in different formats.
The final group presented a project entitled 'E-tenders: follow the money'. They highlighted how complicated the eTenders website is, saying that information about contract authorities and those who are awarded the contracts is not available in one place. The team created a network which brought all this information together and also showed the links between different industries.
The coveted ScraperWiki mug was awarded to Richard Cyganiak of the Digital Enterprise Research Institute (DERI) for mixing two datastores.
Speaking at the event, Sales and Marketing Director Aine McGuire said ScraperWiki could provide different parts of government with a very cheap way of connecting their datasets. She also spoke about the Smart Economy and the need to get people with different skill sets working together and to make public data more accessible.
Dominic Byrne of Fingal County Council also spoke at the event. Fingal County Council has recently launched Fingal Open Data, the first project of its kind undertaken by a local authority. The site enables citizens to easily access data about their area.
The main objectives of the initiative are to create greater transparency with regard to public data and to encourage participation and collaboration from citizens and businesses, both through analysis of this data and through suggestions about additional data which could be published.
Tools like these, which bring meaning and transparency to public records online, are welcome. However, as reported in Politico recently, barriers persist for public records that are not available online. In particular, the 2003 amendment to the 1997 Freedom of Information (FOI) Act has been regressive in terms of transparency. Furthermore, Minister for Finance Brian Lenihan deliberately exempted Anglo Irish Bank and Nama from FOI's remit. Read more on Politico here.