Addresses and building locations are often described in different ways, even by the people who live and work there. Maintaining data quality in property databases is a constant challenge, particularly when we want to combine or match property data from different sources.

At the level of national data strategy, the best approach to reducing this challenge is to promote the use of unique identifiers for each address and property.

A unique identifier is a code or string of characters in an agreed format that is specific to the object to which it is assigned. If an identifier is appended and shared consistently in a property record, we can be confident we are talking about the same property even when other descriptors change or are expressed differently.
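To make that concrete, here is a minimal sketch of matching two record sets on a shared identifier rather than on address text. Everything in it is an assumption for illustration: the field names, the identifier values and the addresses are invented, and it is not drawn from any real data product.

```python
# Two hypothetical sources describing the same properties in different ways.
# All identifiers, field names and addresses are invented for illustration.
council_records = [
    {"property_id": "100000000001", "address": "Flat 2, 10 High Street", "band": "C"},
    {"property_id": "100000000002", "address": "The Old Bakery, Mill Lane", "band": "E"},
]
survey_records = [
    {"property_id": "100000000001", "address": "10B High St", "rating": "D"},
    {"property_id": "100000000002", "address": "12 Mill Ln (Old Bakery)", "rating": "B"},
]


def match_on_identifier(left, right, key="property_id"):
    """Pair up records from two sources that carry the same identifier."""
    right_by_id = {record[key]: record for record in right}
    matches = []
    for record in left:
        other = right_by_id.get(record[key])
        if other is not None:
            matches.append((record, other))
    return matches


def match_on_address(left, right):
    """Naive exact comparison of address strings, shown for contrast."""
    right_by_address = {record["address"].lower(): record for record in right}
    return [
        (record, right_by_address[record["address"].lower()])
        for record in left
        if record["address"].lower() in right_by_address
    ]


id_matches = match_on_identifier(council_records, survey_records)
address_matches = match_on_address(council_records, survey_records)
print(len(id_matches), len(address_matches))
```

In this toy data the identifier join pairs both properties, while the exact address comparison pairs none of them, because each source writes the address differently.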

This post looks at various property identifiers used in datasets maintained by the UK public sector.

(For more general information on UK address data see my recent primer.)


The UPRN

The Unique Property Reference Number (UPRN) is a unique and persistent identifier for every location with an address in the UK.

GeoPlace LLP allocates UPRNs in blocks to local authorities, who assign them to addressable locations when construction begins or during the street numbering phase of the planning process. Ordnance Survey also assigns UPRNs to additional objects such as churches and agricultural properties that might not have a postal address.

UPRNs are a key field in all three versions of Ordnance Survey's AddressBase product, and in OSNI's Pointer product.

GeoPlace promotes the ubiquity of UPRNs, and there are plenty of case studies of UPRN use for address matching and other applications within the public sector. However UPRNs are not open data at source, and this imposes hard limits on their wider adoption.

Under some limited conditions Ordnance Survey licensees can sub-license UPRNs as open data. For example ONS has published all UPRNs for Great Britain as open data in their National Statistics UPRN Lookup (NSUL) and ONS UPRN Directory (ONSUD) data products, which match individual UPRNs to a wide range of administrative geographies. These datasets are invaluable for statistical purposes but of limited use at address level because the associated address fields and geocodes remain closed data.
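As a rough illustration of the statistical use case, the sketch below counts records per area using a lookup from UPRN to geography. The lookup structure, field names, area codes and UPRN values are all hypothetical; they do not reflect the actual layout of NSUL or ONSUD.

```python
from collections import Counter

# Hypothetical extract of a UPRN-to-geography lookup (invented values).
geography_lookup = {
    "100000000001": {"local_authority": "LA-A", "ward": "W-1"},
    "100000000002": {"local_authority": "LA-A", "ward": "W-2"},
    "100000000003": {"local_authority": "LA-B", "ward": "W-3"},
}

# Hypothetical dataset that carries UPRNs but no address fields.
record_uprns = ["100000000001", "100000000002", "100000000003", "999999999999"]


def count_by_geography(uprns, lookup, level):
    """Count records per area at the given level; collect UPRNs with no match."""
    counts = Counter()
    unmatched = []
    for uprn in uprns:
        geography = lookup.get(uprn)
        if geography is None:
            unmatched.append(uprn)
        else:
            counts[geography[level]] += 1
    return counts, unmatched


counts, unmatched = count_by_geography(record_uprns, geography_lookup, "local_authority")
print(dict(counts), unmatched)
```

Note that nothing here recovers an address or a coordinate: the lookup tells us which area a UPRN falls in, and no more than that.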

In June Cabinet Office announced plans to "investigate opening up" UPRNs:

Furthermore, over the next 12 months the Geospatial Commission will work with GeoPlace, the LGA, the Improvement Service (on behalf of Scottish Local Government), and OS to investigate opening up the key identifiers UPRN and USRN, together with their respective geometries, for the whole of Great Britain under OGL terms. This work must protect the integrity and authority of these identifiers, so as to provide both businesses and public sector organisations with the confidence to continue to rely on these within their own products and services, without restricting their ability to use and benefit from them.


The TOID ®

The TOpographic IDentifier or TOID (pronounced "toyed") is a unique identifier that Ordnance Survey assigns to features in MasterMap and its other large scale mapping products.

A feature may be any type of topographic object or other concept represented in the mapping, whether as a point, line or polygon, so TOIDs are not limited to buildings or properties. TOIDs were assigned to AddressPoint features in earlier OS address products that were part of MasterMap: Address Layer (AL) and Address Layer 2 (AL2). However AL and AL2 have now been deprecated. AddressBase is not a MasterMap layer.

The basic AddressBase product includes "address TOIDs" as a legacy from AL2, but TOIDs necessary to cross-reference addresses to building and transport features are provided only in the AddressBase Plus and Premium products. A building TOID may be linked to more than one UPRN.
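The one-to-many relationship between a building and its addresses can be sketched as a simple grouping. The cross-reference pairs below are invented placeholders, not real UPRNs or TOIDs, and the structure is an assumption rather than the AddressBase schema.

```python
from collections import defaultdict

# Hypothetical (UPRN, building TOID) cross-reference pairs, invented for illustration.
cross_references = [
    ("100000000001", "toid-building-A"),
    ("100000000002", "toid-building-A"),
    ("100000000003", "toid-building-A"),
    ("100000000004", "toid-building-B"),
]


def group_uprns_by_building(pairs):
    """Build a mapping from each building TOID to the UPRNs linked to it."""
    grouped = defaultdict(list)
    for uprn, building_toid in pairs:
        grouped[building_toid].append(uprn)
    return dict(grouped)


uprns_by_building = group_uprns_by_building(cross_references)

# Buildings linked to more than one UPRN, e.g. a block of flats.
multi_address_buildings = [
    toid for toid, uprns in uprns_by_building.items() if len(uprns) > 1
]
print(multi_address_buildings)
```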

As with UPRNs, there are limited conditions under which OS licensees may publish TOIDs in open datasets.


The Unique_Building_ID (Northern Ireland)

I don't have a technical specification for OSNI's Large Scale vector boundary data, but there does not seem to be a direct equivalent to the TOID in use in Northern Ireland.

Pointer includes a Unique_Building_ID, which is a "unique sequential number allocated to a primary addressable object" and distinct from the UPRN. More than one UPRN may be linked to a Unique_Building_ID; typically the Unique_Building_ID will represent a building and the UPRN will represent an address within the building.


The OSAPR

The Ordnance Survey ADDRESS-POINT Reference (OSAPR) was a unique identifier used in Ordnance Survey's ADDRESS-POINT product, and retained as a secondary identifier in AL and AL2.

The OSAPR was conceptually similar to the UPRN. ADDRESS-POINT, AL and AL2 were withdrawn in 2014 but OSAPRs may still appear in older address data derived from those products.


The UDPRN and the UMRRN

The Unique Delivery Point Reference Number (UDPRN) is a unique identifier supplied as an optional add-on to Royal Mail's main Postcode Address File (PAF) product. A UDPRN is assigned to every physical delivery point, i.e. each premise-level address.

UDPRNs are a more stable alternative to PAF address keys, which are sometimes re-used or become invalid when the postcode of an address changes.

The Unique Multiple Residence Reference Number (UMRRN) is a unique identifier that Royal Mail assigns to premises where multiple households share a single letter box. UMRRNs are linked to UDPRNs in Royal Mail's Multiple Residence product.

Where available the UDPRN is included in all AddressBase products. The UMRRN is not included in AddressBase products, though AddressBase Premium and AddressBase Plus contain multi-residency information from other sources.


The UARN

The Unique Address Reference Number (UARN) is a unique identifier assigned by the Valuation Office Agency (VOA) to premises or hereditaments that are subject to council tax or business rates. The UARN appears in compiled domestic and non-domestic rating lists.

Rating lists also usually include billing authority reference numbers, but these are not necessarily unique or persistent.

Where available UARNs are included in AddressBase Plus and AddressBase Premium.


The Title Number and Index Polygon ID (Land Registry)

The title number is a unique identifier assigned by Land Registry to each property registered in England and Wales.

Title numbers have a complex history and are not used consistently by land registries in Scotland and Northern Ireland.

Title numbers are used in Land Registry's main registry and in chargeable National Polygon Service data products. Title numbers do not appear in AddressBase or any other Ordnance Survey product. However Land Registry licenses a Title Number and UPRN Look Up dataset.

The Index Polygon ID is a unique identifier for the polygon that represents the indicative location of a registered title. Polygon IDs are matched to title numbers in the National Polygon dataset.


The Land Registry-INSPIRE ID

The Land Registry-INSPIRE ID is a unique identifier used in Land Registry's free INSPIRE Index Polygons dataset. This dataset contains the locations of freehold registered property in England and Wales and is a subset of Land Registry's Index Polygons dataset.

Land Registry-INSPIRE IDs relate to registered titles but Land Registry does not provide any lookup to enable bulk matching with title numbers or Index Polygon IDs. Land Registry offers a chargeable service for finding registered title information based on individual Land Registry-INSPIRE IDs.

The Land Registry-INSPIRE IDs are technically re-usable as open data. However the polygons themselves are not.

Registers of Scotland's Cadastral Parcels Download Service provides polygon data equivalent to the Index Polygons dataset, with an INSPIREID reference that is similarly estranged from the underlying registry information.


The Other UPRN

MHCLG publishes Energy Performance of Buildings Data for England and Wales at address level. This dataset provides bulk data from Energy Performance Certificates (EPCs) for domestic and commercial buildings and Display Energy Certificates (DECs) for public buildings. The dataset is free rather than open, but the non-address fields are re-usable under the Open Government Licence.

The data includes an "individual lodgement" identifier (the LMK KEY), which is not unique but may be in future, and a unique identifier for the building (the BUILDING REFERENCE NUMBER).

There is another identifier associated with EPCs and DECs, which does not appear in the bulk data: the Unique Property Reference Number (UPRN).

This is entirely separate from Ordnance Survey's UPRN.

When DCLG (now MHCLG) commissioned Landmark to set up the Energy Performance of Buildings registration system in 2007, a unique property reference number was one of the requirements. However neither DCLG nor Landmark wanted to stump up for the OS licensing fee. So Landmark made up their own UPRNs, which form the basis of the 24-character report numbers that appear on the individual certificates.

Julian Todd wrote a couple of blog posts about this back in 2011. The Local Public Data Panel also produced a short paper on the problem in 2012, concluding that it would be impractical for Landmark to reconcile the two UPRN systems.


The FRid

Flood Re is a reinsurance scheme run by the UK insurance industry with support from Government. The objective of the scheme is to ensure the availability of household flood insurance even in areas of high risk.

Although Ordnance Survey seemed keen for Flood Re to adopt the UPRN, the Flood Re initiative instead decided to create its own Flood Re Unique Identifier (FRid).

The FRid is a universally unique identifier (UUID) assigned to each of the 30 million UK residential properties recorded in Flood Re's Property Data Hub.
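For readers unfamiliar with UUIDs, here is a minimal sketch using Python's standard uuid module. It assumes randomly generated (version 4) UUIDs purely for illustration; I don't know how Flood Re actually generates or formats FRids, and the function names are my own.

```python
import uuid


def new_property_id():
    """Return a random (version 4) UUID as a string, as a stand-in identifier."""
    return str(uuid.uuid4())


def is_valid_uuid(value):
    """Check whether a string parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


example_id = new_property_id()
print(example_id, is_valid_uuid(example_id))

# A 12-digit number in the style of a UPRN does not parse as a UUID.
print(is_valid_uuid("100000000001"))
```

Unlike a UPRN, a UUID of this kind can be generated without any central allocation, which is convenient for the issuer but means the value carries no link to other address datasets unless a lookup is maintained.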

Coincidentally, Flood Re's IT and database infrastructure was built by Landmark.


The MPAN and the MPRN

The Meter Point Administration Number (MPAN) is a unique reference used in Great Britain to identify electricity supply points at individual domestic residences and other locations. The Meter Point Reference Number (MPRN) is an equivalent reference for gas supplies.

Although these are not property identifiers as such, they are an example of identifiers that can sometimes be used as proxies for property identifiers. This is a concept we should keep in mind, as common household objects are increasingly registered and networked through the internet of things.


Property identifiers have untapped potential

There are a number of significant public datasets that provide records at address level but don't make use of unique identifiers for those addresses. Examples include Companies House's Free Company Data Product, the Planning Inspectorate's Appeals Data, FSA's food hygiene ratings data, and Land Registry's Price Paid Data.

In at least some of those cases I think there is potential, within the OS licensing conditions, for PSMA members to append UPRNs without making the terms of use more restrictive.

Conversely there are occasional examples of public datasets that contain property identifiers without the associated information, such as the use of UPRNs in place of address fields in GDS registers. This obviously limits the potential for re-use by users who do not have the wherewithal to match the identifiers to their source.

UPRNs and TOIDs may not fit every use case where there's a need for property identifiers. However data publishers will be more likely to adopt these identifiers as standard attributes in property records if they are openly available and can be shared without reducing the utility of their data.