Skip to main content

Opensteetmap Data Models

Opensteetmap is the crowdsourced mapping of the planet and can be accessed at https://opensteetmap.org

The map data is freely downloadable and can be hosted locally for any purpose whatsoever under GPL license. Map data is downloadable in multiple archive formats like pbf (Protocolbuffer Binary Format) and others.


OSM's internal data model is represented as shown in the ER diagram above, but when it's imported from the PBF, it has to be efficient, fast and easy to use. Primarily, OSM databases can either be,

  1. Snapshot or 
  2. Historical
Snapshot databases don't contain historical infromation (for instance, how a river changed it's course of the years or how landuse of a forested land changed over the year) whereas, Historical database contains this information. Naturally, historical databases are much more complex and requires more effort.

Various parameters on how to judge various data stores are,
  1. Updatability
  2. Transactibility
  3. Concurrent Read/Writability
  4. Snapshot/Historical ?
Few of the various data store usercases used by OSM community and developers are,
  1. API DB - This is the original PostgreSQL (not PostGIS) database running at the main OSM server.
    1. Updatable
    2. Full transaction support
    3. Full concurrent reads/writes
    4. Full historical data
  2. osm2pgsql - This is the primary datastore used by renderers like Mapnik.
    1. Updatable
    2. No transaction support
    3. Single writer/multiple reader
    4. Snapshot
  3. pgsnapshot - This is a popular schema generated by osmosis tool. It's mainly used for analysis.
    1. Updatable
    2. Snapshot
  4. Normatim -  This is a schema used primarily for geocoding and searching. Generated by osm2pgsql
    1. Updatable
    2. Snapshot
It's important to choose datastore that is appropriate to the usecase, because GIS queries can be damn expensive especially when the records can be in multiple gigabytes.

Comments

Popular posts from this blog

Multimaster replication with Symmetric DS

Symmetric DS is an awesome tool for trigger based replication whcih works for all major database vendors, including but not limited to PostgreSQL, MySQL, MSSQL, Oracle and many others. Symmetric-DS is a java application and can execute on any platform on whcih JRE is available including Windows and Linux. Trigger based replication, in constrast to disk based (eg. DRBD ) or transaction log file shipping based or statement based , works by registering triggers on DMLs and sending the data thus generated to remote machines. Another very popular trigger based DB replication tool is Slony . Symmetric-DS in addition to being database agnostic also supports multi-master replication (MMR). MMR usecase involves multiple database nodes, connected in a pool with DML updates coming from any of them. This is different from the normal master-slave replication, where slaves are not expected to generate any data events, and the sole authority of database is the master. MMR requirement causes d...

PC Power supply and hacks

For posterity and myself, I'm leaving some tips and tricks of PC Power Supply Unit (PSU) whcih is an SMPS (Switched Mode Power Supply). There are a variety of uses of a +12V, +5V and +3V DC power supply like lighting up an LED strip or powering a raspberry pi. There are various colored cables in a typical ATX 12V SMPS. I'll list out the various color lines and what they mean, Sr. No Cable color Number of cables in a PSU Use 1 Green exactly one (1) Wake up signal from motherboard. Pressing PC power button makes this signal carry wake up signal to PSU to start. Green needs to be touched with the any ground to make the SMPS start. For self-starting PSUs, green needs to be connected with one black all the time. 2 Blue exactly one (1) -12V 3 Purple exactly one (1) +5V standby. When power supply is on standby mode (not on by signalling green), this line can give 1-2 A current. 4 Gray exactly one (1) Power good signal. When PSU levels has reached specificati...

Cryptographic Primitive III: RSA Asymmetric Keys

RSA cryptosystems involves, a private key (which is kept private) and a public key, which is kept public i.e. known to everyone. The security of RSA hinges on the mathematically difficult problem of finding prime factorization of a very large number. Let's quickly disuss how a public, private key pair can be generated, Let, p and q be two large primes, then $n = q \times q$ $\phi(n) = (p-1) \times (q-1)$ Here, $\phi(n)$ is called euler's totient function Choose a random number $e$ such that, $e \in \left\{0,1,2...\phi(n)-1\right\}$ and $gcd(e,\phi(n)) = 1$ The gcd condition will ensure that we have an inverse of $e$ in $\mathbb{Z}_{26}$. Now, using extended euclidian algorithm one can get the inverse of e as d such that, $d \equiv e \pmod{\phi(n)}$ So, there we have it, the private key is $e$ and the public key is $(n,d)$. Few points to note here are, $p$ and $q$ are both $\geq 2^{512}$, although the recommened size is $2^{1024}$ $n$ is $\geq 2^{1024}$, although the recommended...