Have you ever thought about how to search for locations near specific coordinates, like in Google Maps? My first idea was to save the longitude and latitude in a database like MySQL or PostgreSQL. Adding an index to the longitude and latitude columns can help speed up queries, but it’s still inefficient. In 2008, geohashing was introduced by Gustavo Niemeyer as a solution.
Geohash is a public domain geocode system invented in 2008 by Gustavo Niemeyer and (similar work in 1966) G.M. Morton, which encodes a geographic location into a short string of letters and digits.
source: https://en.wikipedia.org/wiki/Geohash
Geohashing assigns the same prefix to geohashes of nearby locations, meaning they are spatially close to each other.
Example:
Location | Lat, Lng | Geohash |
---|---|---|
Monas | -6.175648, 106.827256 | qqguygv |
Pasar Senen Station | -6.172421, 106.843745 | qqguzk1 |
Gambir Station | -6.176995, 106.830968 | qqguygx |
From this table, we can conclude that Monas is closer to Gambir Station because they share a longer prefix (qqguyg) than with Pasar Senen Station. With this geohash behavior, we can use an SQL query only with one column to find all locations near a specific point. This query more efficient instead using lat and long column.
SELECT detail FROM locations WHERE geohash LIKE 'qqguy%`
Geohash split the location in map into some boxes. Each dimension have different precision. To generate geohash we can use this algorithm:
- Choose geohash level between 1-12
- Loop for each level
- Is our coordinate or point in left half map? if yes, write ‘0’, otherwise ‘1’.
- Is our coordinate or point in bottom half map? if yes, write ‘0’, otherwise ‘1’.
- Rescale the map.
- Back to point 2
- If loop finished, we can encode the result (binary number) using base32 string
source: https://researchgate.net/figure/Geohash-binary-code_fig2_332061286
Below geohash precision level
Precision / geohash length | Latitude bits | Longitude bits | Latitude error | Longitude error | Cell height | Cell width |
---|---|---|---|---|---|---|
1 | 2 | 3 | ±23 | ±23 | 4992.6 km | 5009.4 km |
2 | 5 | 5 | ±2.8 | ±5.6 | 624.1 km | 1252.3 km |
3 | 7 | 8 | ±0.70 | ±0.70 | 156 km | 156.5 km |
4 | 10 | 10 | ±0.087 | ±0.18 | 19.5 km | 39.1 km |
5 | 12 | 13 | ±0.022 | ±0.022 | 4.9 km | 4.9 km |
6 | 15 | 15 | ±0.0027 | ±0.0055 | 609.4 m | 1.2 km |
7 | 17 | 18 | ±0.00068 | ±0.00068 | 152.5 m | 152.9 m |
8 | 20 | 20 | ±0.00086 | ±0.000172 | 19 m | 38.2 m |
9 | 22 | 23 | ±0.000021 | ±0.000021 | 4.8 m | 4.8 m |
10 | 25 | 25 | ±0.00000268 | ±0.00000536 | 59.5 cm | 1.2 m |
11 | 27 | 28 | ±0.00000067 | ±0.00000067 | 14.9 cm | 14.9 cm |
12 | 30 | 30 | ±0.00000008 | ±0.00000017 | 1.9 cm | 3.7 cm |
Geohashing has some limitations. When two points are close but fall on opposite sides of an edge, they may have different geohash prefixes. As a result, the search may not detect these points as being close. One solution is to reduce the search precision to include those points, but this can lead to results that contain too many unrelated points.