Skip to main content

Importing OSM (PBF file) to PostGIS (PGSNAPSHOT schema)

OSM Data can be used for variety of usecases like rendering, searching, data analytics etc. Each usecase yield quickest result in a particular data format. Due to the large volume of data, this can cause significant difference in the response times. 

Some of these formats are,
  1. PBF -- A highly compressed format used almost exclusively for storage and downloads owing to the small sizes. These format are not human readable and gzip compressed.
  2. PostgreSQL with PostGIS extension -- A data format backed up by PostgreSQL database. There are various schemas supported for multiple use cases which can greatly impact the response time as indexes and normalization are used to speed up queries. Some popular schemas along with their usecase and certain features are as below. The table is taken directly from openstreetmap wiki.

Schema nameCreated withUsed byPrimary use caseUpdatableGeometries (PostGIS)Losslesshstore columnsDatabase
osm2pgsqlosm2pgsqlMapnikKothic JSRenderingyesyesnooptionalPostgreSQL
apidbosmosisAPIMirroringyesnoyesnoPostgreSQL, MySQL
pgsnapshotosmosisjXAPIAnalysisyesoptionalyesyesPostgreSQL
imposmImposmRenderingnoyesnoImposm2: no, Imposm3: yesPostgreSQL
nominatimosm2pgsqlNominatimSearch, Geocodingyesyesyes?PostgreSQL
ogr2ogrogr2ogrAnalysisnoyesnooptionalvarious
osmsharpOsmSharpRoutingyesno??Oracle
overpassOverpass APIAnalysisyes?yes?custom
mongosmMongOSMAnalysismaybe???MongoDB
node-mongosmMongoosejsAnalysisyesyesyesNAMongoDB
goosmgoosmAnalysisnoyesyesNAMongoDB
pbf2mongopbf2mongoAnalysisnoyesyesNAMongoDB
waychangeSQLstreetkeysmvData cache and Analysisonly a schemaoptional?noPostgreSQL
In this blog, we'll download a PBF file containing our region of interest and insert the data into a PostgreSQL database as PGSNAPSHOT schema.

Get PBF Data

Visit https://download.geofabrik.de/, there you'll find OSM data for continent and countries (and cities in some cases). 

Install osmosis

Visit https://github.com/openstreetmap/osmosis/releases/latest, and download the gzipped tarball containing osmosis binaries.

# mkdir osmosis
# mv osmosis-latest.tgz osmosis
# cd osmosis
# tar xvfz osmosis-latest.tgz
# rm osmosis-latest.tgz

Create Dataqbase and import PGSNAPSHOT schema

# createdb gis
# psql -d gis -c "CREATE extension IF NOT EXISTS postgis; CREATE extension IF NOT EXISTS hstore"
# psql -d gis -f osmosis/script/pgsnapshot_schema_0.6.sql

At a later stage, these scripts can be run for specific queries,

# psql -d pgsnapshot -f osmosis_dir/script/pgsnapshot_schema_0.6_action.sql 

# psql -d pgsnapshot -f osmosis_dir/script/pgsnapshot_schema_0.6_bbox.sql 

# psql -d pgsnapshot -f osmosis_dir/script/pgsnapshot_schema_0.6_linestring.sql

Import PBF data

osmosis/bin/osmosis \
--read-pbf file="asia.osm.pbf" \
--write-pgsql host="localhost" database="gis" user="" password=""

Try out GeoQuery

select st_distance(geom, 'SRID=4326;POINT(-74.1235 35.3521)'::geometry) as dist, \
tags->'name' as name from nodes where tags?'place' and tags->'place' \
in ('city','muncipality','town') order by dist;

Comments

Popular posts from this blog

Multimaster replication with Symmetric DS

Symmetric DS is an awesome tool for trigger based replication whcih works for all major database vendors, including but not limited to PostgreSQL, MySQL, MSSQL, Oracle and many others. Symmetric-DS is a java application and can execute on any platform on whcih JRE is available including Windows and Linux. Trigger based replication, in constrast to disk based (eg. DRBD ) or transaction log file shipping based or statement based , works by registering triggers on DMLs and sending the data thus generated to remote machines. Another very popular trigger based DB replication tool is Slony . Symmetric-DS in addition to being database agnostic also supports multi-master replication (MMR). MMR usecase involves multiple database nodes, connected in a pool with DML updates coming from any of them. This is different from the normal master-slave replication, where slaves are not expected to generate any data events, and the sole authority of database is the master. MMR requirement causes d...

Reset root password RHEL/Rocky/CentOS 9

Unlike the earlier versions of Rethat variants, version 9 doesn't allow single user mode to change password, as maintanance mode in 9 requires root password . Single user mode (runlevel 1) can easily be obtained by appending the word ' single ' at the end of the line starting with 'linux' by editing the entry in boot menu by pressing ' e ' at boot menu. To reset the root password on the other hand, one requires to follow a specific set of commands, At the boot menu, edit rescue mode to append 'rd.break ' at the end of the line starting with kernel. Boot with the edited line by pressing Ctrl+X or F10. At the new prompt starting with switch_root, type the following commands, mount -o remount, rw /sysroot chroot /sysroot touch /.autorelabel passwd <new root password> exit reboot       

Devstack installation with cells in multiple machines

  Devstack is a testing and development package for OpenStack . The official devstack website has excellent but terse installation instructions, which could be daunting to a newbie. Normally, a traditional install is pretty straightforward but problems creep in when you have to go for some particular configuration or installation like cells or a simple code-test-debug setup, the information on the main site is insufficient. In this blog, I will enumerate the steps for setting up a cell-ular devstack installation with one parent and one child. Parent will not be instantiating any VMs s it's only a 'command centre'. 1. On both machines, install git and vim, clone the devstack repo using git. yum install git vim git clone https://git.openstack.org/openstack-dev/devstack 2. On both machines, open up the file 'local.conf' inside the cloned directory 'devstack' cd devstack; vim local.conf 3. Copy the parameters below to the file...