Post: 23 May 2014
Broken links in the Data.gov.uk catalogue have been a growing problem for some time. Recently the DGU team has added reporting functionality, and individual broken links are now flagged with an error message:
The DGU report lists the number of broken dataset links per publisher. The table below shows the ten publishers with the highest number of datasets with broken links:
Of course, some publishers have many more datasets in the DGU catalogue than others. This table shows all publishers with more than 100 published datasets in the catalogue, along with the percentage of their datasets with broken links:
This table shows ministerial departments and the percentage of their published datasets with broken links:
Here’s the full list in a Google Spreadsheet:
Data.gov.uk catalogue broken links (22/05/2014)
(This is basically the DGU broken links report combined with additional information from a DGU catalogue data dump.)
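As a rough illustration of that combination, the per-publisher percentages can be derived by joining the two sources on dataset id. This is a minimal sketch, not the method actually used to build the spreadsheet. The input shapes (a list of publisher/dataset pairs standing in for the catalogue dump, and a set of dataset ids standing in for the broken links report), the function name `broken_link_summary` and all sample values are assumptions made for the example.

```python
from collections import defaultdict


def broken_link_summary(catalogue_rows, broken_dataset_ids, more_than=0):
    """Summarise broken links per publisher.

    catalogue_rows: iterable of (publisher, dataset_id) pairs
        (hypothetical shape for a catalogue data dump).
    broken_dataset_ids: set of dataset ids with at least one broken link
        (hypothetical shape for the broken links report).
    more_than: only include publishers with more than this many datasets.

    Returns a list of (publisher, total, broken, percent_broken) tuples,
    sorted by percent_broken, highest first.
    """
    totals = defaultdict(int)
    broken = defaultdict(int)
    for publisher, dataset_id in catalogue_rows:
        totals[publisher] += 1
        if dataset_id in broken_dataset_ids:
            broken[publisher] += 1

    rows = []
    for publisher, total in totals.items():
        if total > more_than:
            count = broken[publisher]
            rows.append((publisher, total, count, round(100.0 * count / total, 1)))

    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Made-up sample data, for illustration only.
sample_catalogue = [
    ("Publisher A", "a1"), ("Publisher A", "a2"),
    ("Publisher A", "a3"), ("Publisher A", "a4"),
    ("Publisher B", "b1"), ("Publisher B", "b2"),
    ("Publisher C", "c1"),
]
sample_broken = {"a1", "b1", "b2"}

for row in broken_link_summary(sample_catalogue, sample_broken, more_than=1):
    print(row)
```

Calling the function with `more_than=100` would apply the same cut-off as the table of larger publishers. Sorting on the count column instead of the percentage would give a ranking by number of datasets with broken links.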
Comments
Any large collection of links will suffer link rot over time if it is not actively maintained. However, there seem to be some additional drivers behind DGU’s broken link problem:
Some publishers have high numbers of broken links due to specific issues. For example:
In my view, the scale of the broken links problem is also a function of the underlying approach to managing the DGU catalogue. Publishers are encouraged to provide direct links to datasets, rather than to a landing page on their own sites. Superficially this makes sense, because users can simply grab the data without navigating to another site. However, it means there are a lot of different links to maintain, and I think most data publishers probably have a low sense of “ownership” of the user’s experience on DGU.
Some users may save time by downloading data directly from DGU, but this is not necessarily a worthwhile trade-off. Even if the links do work, the contextual information provided by the DGU catalogue alone may be inadequate or out of date. Serious data users still need to research and confirm the latest position before making use of any dataset available via Data.gov.uk.