Geohash

· 472 words · 3 minute read

Have you ever thought about how to search for locations near specific coordinates, like in Google Maps? My first idea was to save the longitude and latitude in a database like MySQL or PostgreSQL. Adding an index to the longitude and latitude columns can help speed up queries, but it’s still inefficient. In 2008, geohashing was introduced by Gustavo Niemeyer as a solution.

Geohash is a public domain geocode system invented in 2008 by Gustavo Niemeyer and (similar work in 1966) G.M. Morton, which encodes a geographic location into a short string of letters and digits.

grid image

source: https://en.wikipedia.org/wiki/Geohash

Geohashing assigns the same prefix to geohashes of nearby locations, meaning they are spatially close to each other.

Example:

Location Lat, Lng Geohash
Monas -6.175648, 106.827256 qqguygv
Pasar Senen Station -6.172421, 106.843745 qqguzk1
Gambir Station -6.176995, 106.830968 qqguygx

jakarta map

From this table, we can conclude that Monas is closer to Gambir Station because they share a longer prefix (qqguyg) than with Pasar Senen Station. With this geohash behavior, we can use an SQL query only with one column to find all locations near a specific point. This query more efficient instead using lat and long column.

SELECT detail FROM locations WHERE geohash LIKE 'qqguy%`

Geohash split the location in map into some boxes. Each dimension have different precision. To generate geohash we can use this algorithm:

  1. Choose geohash level between 1-12
  2. Loop for each level
  3. Is our coordinate or point in left half map? if yes, write ‘0’, otherwise ‘1’.
  4. Is our coordinate or point in bottom half map? if yes, write ‘0’, otherwise ‘1’.
  5. Rescale the map.
  6. Back to point 2
  7. If loop finished, we can encode the result (binary number) using base32 string

alt text

source: https://researchgate.net/figure/Geohash-binary-code_fig2_332061286

Below geohash precision level

Precision / geohash length Latitude bits Longitude bits Latitude error Longitude error Cell height Cell width
1 2 3 ±23 ±23 4992.6 km 5009.4 km
2 5 5 ±2.8 ±5.6 624.1 km 1252.3 km
3 7 8 ±0.70 ±0.70 156 km 156.5 km
4 10 10 ±0.087 ±0.18 19.5 km 39.1 km
5 12 13 ±0.022 ±0.022 4.9 km 4.9 km
6 15 15 ±0.0027 ±0.0055 609.4 m 1.2 km
7 17 18 ±0.00068 ±0.00068 152.5 m 152.9 m
8 20 20 ±0.00086 ±0.000172 19 m 38.2 m
9 22 23 ±0.000021 ±0.000021 4.8 m 4.8 m
10 25 25 ±0.00000268 ±0.00000536 59.5 cm 1.2 m
11 27 28 ±0.00000067 ±0.00000067 14.9 cm 14.9 cm
12 30 30 ±0.00000008 ±0.00000017 1.9 cm 3.7 cm

Geohashing has some limitations. When two points are close but fall on opposite sides of an edge, they may have different geohash prefixes. As a result, the search may not detect these points as being close. One solution is to reduce the search precision to include those points, but this can lead to results that contain too many unrelated points.