In my previous article “Putting your data on the map – an introduction to geocoding” I explained the basic concepts of geocoding and geocoder software.
Geocoding is the skill you need to turn the addresses or place names in your data into latitude/longitude coordinates that can be displayed on a web map. Seeing your data on a map can be an instant game changer, as neatly illustrated by my colleague Ben Klarich in a recent video.
In this article, I will delve deeper into geocoding, introduce the concept of positional accuracy, and explain why it might matter for your use case.
Setting a geocoder to work on a batch of addresses is satisfying – the time saved, the mundane avoided – and it is tempting to regard the resulting pins on the map as the finished product from the process. This may be the case, but I would urge those doing geocoding to pause a moment and consider two questions:
- Has the geocoder made the correct address match?
- Is the location of the pin accurate enough for my needs?
Has the Geocoder Made the Correct Address Match?
This question is a matter of quality-checking the address matches made by the geocoder. If you geocode with the Mapcite Excel Add-In, the output is 10 new columns added to your spreadsheet. The first two hold the latitude/longitude coordinates of the matched location. The next eight are populated with the address found by the geocoder. It is worth spending some time checking the address found by the geocoder against the address you were looking for in your data. If you have fewer than 250 or so addresses, it can be efficient to scan the list by eye.
As the datasets get larger, you will need other tactics. Useful ones include:
- Looking at the data on a map – useful for picking out obvious issues if you know your data should fall within a boundary such as a city, state or country. For example, if your data contains only Florida addresses, any out-of-state pins will be easy to spot on the map.
- Sampling – pick a sample of the addresses and check each one carefully. Distributing your sample across different geographic areas is advised.
- Excel Functions – you can write a quick function in Excel to compare elements of the found address against the sought address. Postcode or zip code is a good field to use for this.
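For larger datasets, the map check and the postcode comparison above can also be scripted outside Excel. The following is a minimal Python sketch, not part of the Mapcite Add-In: the row field names (`id`, `lat`, `lon`, `sought_postcode`, `found_postcode`) are hypothetical, and the Florida bounding box uses rough values chosen for illustration only.

```python
import re

# Rough bounding box for Florida (approximate values, for illustration only).
FLORIDA_BBOX = {"min_lat": 24.3, "max_lat": 31.1, "min_lon": -87.7, "max_lon": -79.9}


def in_bbox(lat, lon, bbox):
    """Return True if the coordinate falls inside the rectangular bounding box."""
    return (bbox["min_lat"] <= lat <= bbox["max_lat"]
            and bbox["min_lon"] <= lon <= bbox["max_lon"])


def normalise_postcode(code):
    """Upper-case and drop spaces/punctuation so 'so16 0as' equals 'SO16 0AS'."""
    if code is None:
        return ""
    return re.sub(r"[^A-Z0-9]", "", str(code).upper())


def postcode_matches(sought, found):
    """Compare the sought and found postcode or zip code after normalising."""
    a, b = normalise_postcode(sought), normalise_postcode(found)
    return bool(a) and a == b


def flag_rows(rows, bbox):
    """Return (id, reasons) for every row that deserves a manual check."""
    flagged = []
    for row in rows:
        reasons = []
        if not in_bbox(row["lat"], row["lon"], bbox):
            reasons.append("outside expected area")
        if not postcode_matches(row["sought_postcode"], row["found_postcode"]):
            reasons.append("postcode mismatch")
        if reasons:
            flagged.append((row["id"], reasons))
    return flagged
```

A bounding box is only a coarse filter: a pin can sit inside the rectangle and still be outside the state, so flagged rows are a starting point for manual checking rather than a final verdict.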
Also consider what counts as an acceptable match. If the postcode or zip code is matched but not the street number, is that good enough? This neatly brings me to the second question:
Is the Location of the Pin Accurate Enough for My Needs?
Consider this: not only does the geocoder have to make an address match, it must also assign latitude/longitude coordinates to the address. How does it do that? The simple answer is that it gets the coordinates from its own database. The real question is who compiled the data in that database, and how. That could be a whole topic for another day, but in summary it depends on the geocoder you are using. The data could be crowd-sourced open data, commercially gathered, or gathered by trusted government bodies.
For some data, the address coordinates will be on the rooftop of the building – the highest level of positional accuracy. At other times the coordinates may be somewhere within the property boundary, on the same street, at the nearest intersection, or at the centre of the postcode/zip code. Address data gathered commercially or by government bodies will usually come with a cost and licence restrictions on its use.
For this reason, most commercially available geocoders have a charging structure and T&Cs that mean the more addresses you geocode, the more you must pay, and you are limited in what you can do with the data. To build a free product without such licence restrictions, Mapcite chose to use open data from www.OpenStreetMap.org for its Excel geocoder.
Getting back to the question, you need to think about your use case and decide whether an approximate location such as a postcode is good enough. In the UK, the average postcode includes addresses over an area of 0.6 hectares (about one football pitch). However, as you move from urban to rural areas, postcode sizes increase. Ten percent of postcodes cover 19 hectares or more, and the largest postcode extent in the UK is in the Scottish Highlands, covering 40,700 hectares.
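To get a feel for what these postcode sizes mean for a pin placed at the centre of a postcode, the areas can be converted into a distance. The Python sketch below makes a simplifying assumption that is not in the figures themselves: it treats each postcode as a circle of the same area, so the radius is only a rough indication of how far an address might be from the pin.

```python
import math

SQ_METRES_PER_HECTARE = 10_000


def equivalent_radius_m(area_hectares):
    """Radius in metres of a circle with the same area as the postcode."""
    area_m2 = area_hectares * SQ_METRES_PER_HECTARE
    return math.sqrt(area_m2 / math.pi)


# Postcode areas quoted in the article, in hectares.
examples = [("average", 0.6), ("90th percentile", 19), ("largest", 40_700)]

for label, hectares in examples:
    print(f"{label}: about {equivalent_radius_m(hectares):,.0f} m")
```

Under that assumption, the radii come out at roughly 44 m, 250 m and 11 km respectively. Real postcodes are irregular shapes, so the true worst-case distance will often be larger.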
So if your use case is doorstep deliveries, a postcode might be good enough in urban areas (if you accept the delivery driver will need to locate the exact address themselves once they arrive in the postcode).
If your use case is something more specific, such as assessing the lending or insurance risk at an individual address, you should invest in premium data and geocoders that can reliably give you rooftop coordinates. Mapcite, through its partnership with respected geospatial organisations like Pitney Bowes, Ordnance Survey and PSMA Australia, has access to some of the world’s best addressing data and geocoders for rooftop accuracy.
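The decision described in this section can be written down as a simple lookup. This Python sketch is illustrative only: the level names, the use-case keys and the exact ranking of the coarser levels are assumptions based on the order in which the levels are described above, not output from any particular geocoder.

```python
from enum import IntEnum


class PositionalAccuracy(IntEnum):
    """Higher value = more precise. Ranking of the coarser levels is assumed."""
    POSTCODE_CENTROID = 1
    NEAREST_INTERSECTION = 2
    STREET = 3
    PROPERTY_BOUNDARY = 4
    ROOFTOP = 5


# Hypothetical minimum accuracy per use case, read off the examples above.
REQUIRED_ACCURACY = {
    "doorstep_delivery_urban": PositionalAccuracy.POSTCODE_CENTROID,
    "lending_or_insurance_risk": PositionalAccuracy.ROOFTOP,
}


def accurate_enough(found, use_case):
    """Return True if the accuracy level found meets the use case's minimum."""
    return found >= REQUIRED_ACCURACY[use_case]
```

For example, `accurate_enough(PositionalAccuracy.STREET, "lending_or_insurance_risk")` returns `False`, while the same street-level result passes for `"doorstep_delivery_urban"`.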
Using the Mapcite Add-In and following the advice given in this and previous articles, you should be able to geocode a significant proportion of your data and start using it for map-based analysis. If, after reading this article, you wish to explore higher positional accuracy, get in touch and we can advise you on the data and services best suited to your needs.
About the Author
Richard Crump is Head of Consulting at Mapcite, a location data analytics company, and previously held a similar role at Ordnance Survey, the national mapping agency for Great Britain.