Oxford Programme for Sustainable Infrastructure Systems (OPSIS)
The Overture website recommends several workflows for downloading the data. Among them, the one that lets us work in a local, self-sufficient manner is the Python-based overturemaps CLI, installable with pip. It requires few arguments: four numeric values for the bounding box (bbox), the type of layer to extract, and the file format to write.
More information on the values allowed in --type is available via the shell command overturemaps download --help. More methods to download Overture data are shown in the documentation.
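As a rough illustration of the arguments described above, the sketch below assembles an `overturemaps download` command in Python and prints it instead of running it. The bounding box values, the `segment` type and the output file name are placeholders chosen for this example, not values taken from this page; `overturemaps download --help` lists what your installed version accepts.

```python
import shlex


def build_download_command(bbox, layer_type, file_format, output_path):
    """Assemble the argument list for `overturemaps download`.

    bbox is (west, south, east, north) in decimal degrees.
    """
    west, south, east, north = bbox
    return [
        "overturemaps", "download",
        f"--bbox={west},{south},{east},{north}",
        "-f", file_format,
        f"--type={layer_type}",
        "-o", output_path,
    ]


# Placeholder values for illustration only.
command = build_download_command(
    bbox=(30.8, -17.1, 30.9, -17.0),
    layer_type="segment",
    file_format="geoparquet",
    output_path="roads.geoparquet",
)
print(shlex.join(command))
```

Once the CLI is installed, the same list can be passed to `subprocess.run(command, check=True)` to perform the download.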
Once the data is stored locally as .geoparquet, we can work with it in Python with duckdb.
duckdb reads the data set as traditional parquet, so the geometry column arrives as a blob.
┌──────────────────────┬──────────────┬────────────────────────────────────────────────────────────────────────────────┐
│ id │ class │ geometry │
│ varchar │ varchar │ blob │
├──────────────────────┼──────────────┼────────────────────────────────────────────────────────────────────────────────┤
│ 089962508d97ffff04… │ path │ \x00\x00\x00\x00\x02\x00\x00\x00\x02@<\xD1\xA5\x99\x82\x1C\xA5\xC01.1\x0D\xB… │
│ 089962508d97ffff04… │ path │ \x00\x00\x00\x00\x02\x00\x00\x00\x08@<\xD1\xD2\x08u\xF0G\xC01.\x0A+\xD2\xEC\… │
│ 088962508d9fffff04… │ path │ \x00\x00\x00\x00\x02\x00\x00\x00\x13@<\xD1\xA5\x99\x82\x1C\xA5\xC01.1\x0D\xB… │
│ 089962508d83ffff04… │ path │ \x00\x00\x00\x00\x02\x00\x00\x00\x0B@<\xD2\x06\xEF\x07\x8A8\xC01-\xE34\x17^\… │
│ 08496251ffffffff04… │ secondary │ \x00\x00\x00\x00\x02\x00\x00\x00c@<\xC7\xF8\xAA\xA1(\xDA\xC01&_\xB8\xCD;\x88… │
│ 086962508fffffff04… │ unclassified │ \x00\x00\x00\x00\x02\x00\x00\x00S@<\xD2\x16/\x16n\x01\xC01!\xD4\x96\xC5\xA9\… │
│ 08496251ffffffff04… │ unclassified │ \x00\x00\x00\x00\x02\x00\x00\x01\x05@<\xD2\x16/\x16n\x01\xC01!\xD4\x96\xC5\x… │
│ 087962508bffffff04… │ track │ \x00\x00\x00\x00\x02\x00\x00\x00!@<\xD3\x9A\x93\x94\x9C\xDB\xC01!\xB0\xFA\x0… │
│ 086962508fffffff04… │ track │ \x00\x00\x00\x00\x02\x00\x00\x00:@<\xD4\xB9\x9F\xC9^\x83\xC01\x1En\xE9\xC2\x… │
│ 087962508bffffff04… │ track │ \x00\x00\x00\x00\x02\x00\x00\x003@<\xD1\xBA\x0B\xFF\xE8\x83\xC01\x1Ew\xA7\xD… │
├──────────────────────┴──────────────┴────────────────────────────────────────────────────────────────────────────────┤
│ 10 rows 3 columns │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
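The blob values above are geometries in well-known binary (WKB), which is how GeoParquet commonly encodes its geometry column. To show what those bytes hold, the sketch below builds a two-point LineString in big-endian WKB with the standard library and decodes it again. The coordinates are made up for the example, and the decoder handles only this simple 2D LineString case.

```python
import struct


def encode_linestring_wkb(points):
    """Encode 2D points as a big-endian WKB LineString."""
    # Byte order flag (0 = big-endian), geometry type (2 = LineString), point count.
    header = struct.pack(">BII", 0, 2, len(points))
    body = b"".join(struct.pack(">dd", x, y) for x, y in points)
    return header + body


def decode_linestring_wkb(blob):
    """Decode a 2D WKB LineString into a list of (x, y) tuples."""
    endian = ">" if blob[0] == 0 else "<"
    geom_type, n_points = struct.unpack_from(endian + "II", blob, 1)
    if geom_type != 2:
        raise ValueError("only 2D LineString (type 2) is handled here")
    coords = struct.unpack_from(f"{endian}{2 * n_points}d", blob, 9)
    return list(zip(coords[0::2], coords[1::2]))


# Made-up coordinates (longitude, latitude).
points = [(30.875, -17.0625), (30.8125, -17.125)]
blob = encode_linestring_wkb(points)
print(blob[:9])
print(decode_linestring_wkb(blob))
```

The first nine bytes have the same layout as the start of the first row in the table: one byte-order flag, a four-byte geometry type and a four-byte point count, followed by the coordinates as doubles.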
duckdb cannot read geoparquet natively at the moment, but this feature is expected in an upcoming version. We nevertheless stick to this format because it stores large extracts efficiently.
To work further with the geometry, we install the duckdb spatial extension.
This will allow us to work with the geometry column from within the database, bypassing the limitation of the parquet reader.
We can then query the data, still with the duckdb package and its SQL-like syntax, for example to count the segments in each class.
┌────────────┬───────────────┐
│ N_segments │ class │
│ int64 │ varchar │
├────────────┼───────────────┤
│ 83290 │ unknown │
│ 3974 │ driveway │
│ 502882 │ track │
│ 69192 │ secondary │
│ 1128458 │ unclassified │
│ 148719 │ tertiary │
│ 319 │ sidewalk │
│ 2473 │ living_street │
│ 27548 │ primary │
│ 44763 │ trunk │
│ 93 │ NULL │
│ 373 │ steps │
│ 457 │ pedestrian │
│ 63 │ motorway │
│ 1249548 │ path │
│ 1481 │ parking_aisle │
│ 263 │ crosswalk │
│ 144 │ cycleway │
│ 1792660 │ residential │
│ 57044 │ footway │
│ 459 │ alley │
│ 29 │ bridleway │
├────────────┴───────────────┤
│ 22 rows 2 columns │
└────────────────────────────┘
The advantage of working with duckdb is that intensive computations are performed outside the python environment, and all we need to do is collect the results.
# keep only the primary roads and convert the WKB blob into a geometry
ways = db.sql("Select id,ST_GeomFromWKB(geometry) as geometry,subtype,class from roads where class='primary';")
# intermediate step: transform the geometry into WKT and read the subset of data as a pandas DataFrame
ways_wkt = db.sql("select id, ST_AsText(geometry) as geometry, subtype, class from ways;").df()
# Finally, convert the geometry and create a geopandas GeoDataFrame.
ways_df = gpd.GeoDataFrame(
    ways_wkt,
    geometry=gpd.GeoSeries.from_wkt(ways_wkt["geometry"]),
    crs=4326,
)
ways_df.head()

| | id | geometry | subtype | class |
|---|---|---|---|---|
| 0 | 088971928a1fffff047fb94fc9b6da63 | LINESTRING (30.87546 -17.07291, 30.87616 -17.0... | road | primary |
| 1 | 08497193ffffffff047fafebb6014c10 | LINESTRING (30.86457 -17.04400, 30.86431 -17.0... | road | primary |
| 2 | 08896269203fffff047daff58b6ae4c4 | LINESTRING (30.84229 -17.03138, 30.84201 -17.0... | road | primary |
| 3 | 089962692037ffff047fee1fd2e8c575 | LINESTRING (30.84418 -17.03071, 30.84229 -17.0... | road | primary |
| 4 | 08a962692022ffff047ffd55846af8dc | LINESTRING (30.84504 -17.03038, 30.84418 -17.0... | road | primary |
The resulting types:
id object
geometry geometry
subtype object
class object
dtype: object
[Figure: "Example segment class"; x-axis: Longitude [deg]; y-axis: Latitude [deg]]

Once the data is extracted, other options are available to work with it. GeoPandas can read the file directly and converts the geometry column for us, so no extra steps are required.
Reading with this method is however less efficient, so it is only recommended for relatively small data sets.
pyarrow is the reader that geopandas uses under the hood for parquet files.
The vast python package ecosystem provides a wide range of tools that work with (geo)parquet and (geo)arrow file formats and specifications, among them: