Description
This mini-package provides a very fast, yet user-friendly way of
exporting large road network data in a graph-like format. It uses the libosmium
library in the background and proposes a simple R function to extract
the road graph from a .osm
, .osm.pbf
extract
file with OSM data.
Background
While there are already great packages for working with OSM data, I found myself struggling when the networks were getting big (region, country scale…). Usually, one would have to use command line tools and export data to a database before actually being able to use it. Additionally, for certain use cases, the geometry and most of the tags associated to an OSM feature are not essential, and therefore including them into files ultimately comes at a great cost. So this package solves one specific use case of building a graph out of OSM road network data with almost no size limit.
While the graph files produced can be used in any workflow, the
intended use case is with the cppRouting
package, which is outstanding for working with large scale road networks
right from your R environment. Additionally, pre-processing of raw
extract files can done with the rosmium
package. For
example to select a more specific bounding box in your data. One
condition here is required, is that you export features in a complete
way. Missing nodes can cause problems in the current version of the
package.
cppRosm
The function extract_graph
accepts the path to a raw OSM
extract file, and writes into the working directory of the project 2
data sets: nodes.csv, road_segments.csv.
Output
nodes.csv - contains 3 columns: id, lon, lat with respectively the OSM id of a node, its longitude and latitude.
id | lon | lat |
---|---|---|
character | numeric | numeric |
road_segments.csv - contains at least 3 columns, but can contain more. The minimum expected columns are from, to, length. The first 2 columns contain node ids.
from | to | length | highway |
---|---|---|---|
character | character | numeric | character (factor) |
In some cases, node ids can be cast to long integers, but this is not recommended.
Example workflow
Once the package and dependencies are installed, the workflow is the following:
library(cppRosm)
## basic example code
file <- system.file(package = 'cppRosm','extdata',"map.osm") # path to a local file
cppRosm::extract_graph(file)
## Files will be written to: /Users/cenv1069/Documents/packages/cppRosm/vignettes
## [1] 1
|
|
At the end of the execution of this code, you should find the two
files in your working directory (if you don’t know which one is it, run
getwd()
).
Exceptions
This package does not qualify for CRAN due to the fact that the low level code writes into the files mentioned above. The CRAN specs do not recommend this kind of actions because failure at the low level execution crashes the R session resulting in potential loss of unsaved data. However, due to the performance gain that executing at the low level provides, it seemed worth the risk.
Source of failure
The main source of failure of the low level code is the presence of
incomplete ways in the data. To avoid that, make sure to include
complete ways whenevver possible, and avoid using sources that do not
mention this specification about their data. The usual exporting tools
such as the OSM website, GEOFABRICK use the full data with complete
ways. When using command line tools, such as libosmium or osmosis, this
parameter is generally available. The recommended tool to use, is
filtering a wider area with the rosmium
package in the
following way:
rosmium::extract(
input_path = ".../map.osm"
,extent = sf::st_bbox() # a bbox
,output_path = ".../smaller_map.osm"
,strategy = "complete_ways" # !!! important
)
The parameter strategy="complete_ways"
ensures that ways
are exported without missing bits.