It took more than two weeks to load osmosis’s dump into postgres. We need a faster development cycle than that, so we’re going to start with just the BC data to whet our appetites, and to try out some other software that may be better. Hailey heard that there’s a new tool called imposm that’s multithreaded. Maybe we can try that.

First I had to remember how to create the database. Pretty easy:

# CREATE DATABASE osm_bc TEMPLATE template_postgis TABLESPACE osmspace;

imposm doesn’t seem to support subsetting to a bounding box, so I turned back to osmosis for that. There’s probably an easier way, but to get the bounding box I used qgis: I set the map projection to WGS84, enabled reprojection on the fly, brought in a layer with the BC political boundaries, and read the extents off the map. Then I subsetted the planet.osm like this:

hiebert@windy:/home/data/gis/osm/pgimport_bc$ osmosis --read-xml enableDateParsing=no file=../planet-latest.osm --bounding-box bottom=48.15 top=60 left=-139.25 right=-114 --used-node idTrackerType=BitSet --write-xml bc-latest.osm
16-Sep-2011 11:56:54 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.34
log4j:WARN No appenders could be found for logger (org.java.plugin.ObjectFactory).
log4j:WARN Please initialize the log4j system properly.
16-Sep-2011 11:56:54 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
16-Sep-2011 11:56:54 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
16-Sep-2011 11:56:54 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
16-Sep-2011 1:50:39 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
16-Sep-2011 1:50:39 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 6824958 milliseconds.
hiebert@windy:/home/data/gis/osm/pgimport_bc$ ls -lh
total 4.8G
-rw-rw-r-- 1 hiebert staff 4.5G Sep 16 13:50 bc-latest.osm
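
If you'd rather not read the bounding box off the screen in qgis, here's a minimal Python sketch that builds the same --bounding-box arguments from a list of (lon, lat) points in WGS84. The function names and the two corner points are made up for illustration (the corners are picked only so the result matches the box I used above); only the bottom/top/left/right option names come from the osmosis command itself.

```python
def bounding_box(points):
    """Return the extents of an iterable of (lon, lat) pairs in degrees."""
    lons = [lon for lon, _lat in points]
    lats = [lat for _lon, lat in points]
    return {
        "left": min(lons),
        "right": max(lons),
        "bottom": min(lats),
        "top": max(lats),
    }


def osmosis_bbox_args(box):
    """Format a bounding box as arguments for osmosis --bounding-box."""
    return "--bounding-box bottom={bottom} top={top} left={left} right={right}".format(**box)


# Hypothetical corner points, not real boundary data.
corners = [(-139.25, 60.0), (-114.0, 48.15)]
print(osmosis_bbox_args(bounding_box(corners)))
```

With real boundary data you would pass in every vertex of the BC polygon instead of two corners; the min/max logic stays the same.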

FWIW: someone’s wiki page about benchmarking osmosis says that the best thing you can do to speed it up is enableDateParsing=no. Everything else seems to be pretty minor. There doesn’t seem to be any way to use multiple cores unless you’re decompressing with one in the pipeline. After subsetting, I tried to use imposm to load into the database, but it doesn’t work at all. And the error messages are completely indecipherable to me (and I speak python!).

postgres@windy:/home/data4/gis/osm/pgimport_bc$ imposm --read --concurrency 2 --write --database osm_bc --user postgres --optimize bc-latest.osm
password for postgres at localhost:
[13:58:20] ## reading bc-latest.osm
Process CacheWriterProcess-2:
Traceback (most recent call last):
[13:58:20] coords: 24068k nodes: 481k ways: 3438k relations: 24k (estimated)
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/imposm/reader.py", line 117, in run
    cache = self.cache(mode='w', estimated_records=self.estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 37, in coords_cache
    return self._x_cache(self.coords_fname, DeltaCoordsDB, mode, estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 62, in _x_cache
    cache = x_class(x, mode, estimated_records=estimated_records)
  File "tc.pyx", line 393, in imposm.cache.tc.DeltaCoordsDB.__init__ (imposm/cache/tc.c:5291)
Process CacheWriterProcess-3:
Traceback (most recent call last):
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
  File "tc.pyx", line 104, in imposm.cache.tc.BDB.__init__ (imposm/cache/tc.c:1263)
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/imposm/reader.py", line 117, in run
    cache = self.cache(mode='w', estimated_records=self.estimated_records)
Process CacheWriterProcess-4:
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 40, in nodes_cache
Traceback (most recent call last):
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    return self._x_cache(self.nodes_fname, NodeDB, mode, estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 62, in _x_cache
    cache = x_class(x, mode, estimated_records=estimated_records)
  File "tc.pyx", line 104, in imposm.cache.tc.BDB.__init__ (imposm/cache/tc.c:1263)
IOError: 4
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/imposm/reader.py", line 117, in run
    cache = self.cache(mode='w', estimated_records=self.estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 43, in ways_cache
IOError: 4
    return self._x_cache(self.ways_fname, WayDB, mode, estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 62, in _x_cache
    cache = x_class(x, mode, estimated_records=estimated_records)
  File "tc.pyx", line 104, in imposm.cache.tc.BDB.__init__ (imposm/cache/tc.c:1263)
Process CacheWriterProcess-5:
Traceback (most recent call last):
IOError: 4
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/imposm/reader.py", line 117, in run
    cache = self.cache(mode='w', estimated_records=self.estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 53, in relations_cache
    return self._x_cache(self.relations_fname, RelationDB, mode, estimated_records)
  File "/usr/local/lib/python2.6/dist-packages/imposm/cache/osm.py", line 62, in _x_cache
    cache = x_class(x, mode, estimated_records=estimated_records)
  File "tc.pyx", line 104, in imposm.cache.tc.BDB.__init__ (imposm/cache/tc.c:1263)
IOError: 4
^CTraceback (most recent call last):
  File "/usr/local/bin/imposm", line 9, in <module>
    load_entry_point('imposm==2.3.2', 'console_scripts', 'imposm')()
  File "/usr/local/lib/python2.6/dist-packages/imposm/app.py", line 217, in main
Process ParserProgress-1:
    reader.read(arg)
  File "/usr/local/lib/python2.6/dist-packages/imposm/reader.py", line 88, in read
Traceback (most recent call last):
    parser.parse(filename)
  File "/usr/local/lib/python2.6/dist-packages/imposm/parser/simple.py", line 64, in parse
    return self.parse_xml_file(filename)
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
  File "/usr/local/lib/python2.6/dist-packages/imposm/parser/simple.py", line 82, in parse_xml_file
    return self._parse(input, XMLMultiProcParser)
  File "/usr/local/lib/python2.6/dist-packages/imposm/parser/simple.py", line 132, in _parse
    callback(items)
  File "/usr/lib/python2.6/multiprocessing/queues.py", line 287, in put
    if not self._sem.acquire(block, timeout):
KeyboardInterrupt
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/imposm/util.py", line 51, in run
    log_statement = self.queue.get()
  File "/usr/lib/python2.6/multiprocessing/queues.py", line 91, in get
    res = self._recv()
KeyboardInterrupt
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
    p.join()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 119, in join
    res = self._popen.wait(timeout)
  File "/usr/lib/python2.6/multiprocessing/forking.py", line 117, in wait
    return self.poll(0)
  File "/usr/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
    p.join()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 119, in join
    res = self._popen.wait(timeout)
  File "/usr/lib/python2.6/multiprocessing/forking.py", line 117, in wait
    return self.poll(0)
  File "/usr/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
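
One guess, not confirmed anywhere in this post: the tc.pyx frames suggest the cache sits on Tokyo Cabinet, and as far as I recall code 4 in Tokyo Cabinet's error list means "no permission". That would fit running imposm as postgres in a directory owned by hiebert, assuming imposm writes its cache files into the directory it is run from. Under those assumptions, a quick check like this before the import would catch the problem early. The function name is hypothetical and the temporary directory is only there to make the example runnable.

```python
import os
import tempfile


def can_write_cache_files(directory="."):
    """True if the current user may create files in `directory`."""
    # W_OK alone is not enough: creating a file also needs search (X_OK)
    # permission on the directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


# Demo on a throwaway directory; for the real thing, call it on the
# directory you intend to run imposm from.
scratch = tempfile.mkdtemp()
print(can_write_cache_files(scratch))
print(can_write_cache_files(os.path.join(scratch, "missing")))
```

If the check fails for the working directory, running the import from somewhere the postgres user can write (or as a user who owns the directory) would be the first thing to try.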


Back to osmosis. Working from just the subset, it generates the dump a lot faster (about 6 minutes).

hiebert@windy:/home/data/gis/osm/pgimport_bc$ JAVACMD_OPTIONS="-Xmx10g" osmosis --read-xml file="bc-latest.osm" --used-node idTrackerType=BitSet --write-pgsql-dump directory="./pgdump" enableBboxBuilder="no" enableLinestringBuilder="no" nodeLocationStoreType="InMemory"
16-Sep-2011 2:15:30 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.34
log4j:WARN No appenders could be found for logger (org.java.plugin.ObjectFactory).
log4j:WARN Please initialize the log4j system properly.
16-Sep-2011 2:15:30 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
16-Sep-2011 2:15:30 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
16-Sep-2011 2:15:30 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
16-Sep-2011 2:21:18 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
16-Sep-2011 2:21:18 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 347656 milliseconds.
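
Osmosis reports its run time in milliseconds, which is awkward to compare by eye. Here's a small sketch that converts the two totals from the logs in this post into hours, minutes and seconds. The helper name is mine; the two numbers are copied from the "Total execution time" lines.

```python
def format_millis(ms):
    """Format a millisecond count as H:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%02d.%03d" % (hours, minutes, seconds, millis)


subset_ms = 6824958  # planet -> BC subset (first osmosis run)
dump_ms = 347656     # BC subset -> pgsql dump (second osmosis run)

print(format_millis(subset_ms))  # 1:53:44.958
print(format_millis(dump_ms))    # 0:05:47.656
```

So the subsetting pass took just under two hours, and the dump from the subset took just under six minutes.
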

Then the resulting dump loads in less than an hour (praise Jesus).

osm_bc=# \i /usr/share/doc/osmosis/examples/pgsql_simple_schema_0.6_linestring.sql
                   addgeometrycolumn                    
--------------------------------------------------------
 public.ways.linestring SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

CREATE INDEX
osm_bc=# \i /usr/share/doc/osmosis/examples/pgsql_simple_schema_0.6_bbox.sql
                addgeometrycolumn                 
--------------------------------------------------
 public.ways.bbox SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

CREATE INDEX
osm_bc=# \i /usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:6: ERROR:  index "idx_nodes_action" does not exist
DROP INDEX
DROP INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:9: ERROR:  index "idx_ways_action" does not exist
DROP INDEX
DROP INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:12: ERROR:  index "idx_relations_action" does not exist
DROP INDEX
DROP INDEX
DROP INDEX
          dropgeometrycolumn           
---------------------------------------
 public.ways.bbox effectively removed.
(1 row)

             dropgeometrycolumn              
---------------------------------------------
 public.ways.linestring effectively removed.
(1 row)

psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430381: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pk_nodes" for table "nodes"
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430382: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pk_ways" for table "ways"
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430383: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pk_way_nodes" for table "way_nodes"
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430384: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pk_relations" for table "relations"
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430385: ERROR:  column "action" does not exist
CREATE INDEX
CREATE INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430388: ERROR:  column "action" does not exist
CREATE INDEX
CREATE INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430391: ERROR:  column "action" does not exist
CREATE INDEX
                addgeometrycolumn                 
--------------------------------------------------
 public.ways.bbox SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

                   addgeometrycolumn                    
--------------------------------------------------------
 public.ways.linestring SRID:4326 TYPE:GEOMETRY DIMS:2 
(1 row)

UPDATE 1198178
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430415: ERROR:  syntax error at or near "CREATE"
LINE 10: CREATE INDEX idx_ways_bbox ON ways USING gist (bbox);
         ^
CREATE INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430419: NOTICE:   no notnull values, invalid stats
VACUUM
osm_bc=#
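
The load finished, but a handful of statements errored along the way and they are easy to miss in that much output. Here's a minimal sketch for pulling just the ERROR lines out of a saved psql session. It assumes the "psql:<script>:<line>: ERROR:" prefix seen above; the function name and the idea of saving the session to a text file are mine.

```python
import re

# Matches lines such as:
#   psql:/path/to/script.sql:6: ERROR:  index "idx_nodes_action" does not exist
ERROR_RE = re.compile(r"^psql:(?P<script>.+):(?P<line>\d+): ERROR:\s+(?P<message>.*)$")


def psql_errors(log_text):
    """Return (script line number, message) for every psql ERROR line."""
    found = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            found.append((int(match.group("line")), match.group("message")))
    return found


# A few lines copied from the session above.
sample = """\
ALTER TABLE
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:6: ERROR:  index "idx_nodes_action" does not exist
DROP INDEX
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430381: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pk_nodes" for table "nodes"
psql:/usr/share/doc/osmosis/examples/pgsql_simple_load_0.6.sql:57430385: ERROR:  column "action" does not exist
"""

for line_no, message in psql_errors(sample):
    print(line_no, message)
```

NOTICE lines are skipped on purpose, since they are informational rather than failures.
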
-----


Published

19 September 2011

Category

work

Tags