Denormalizing a datasource

38% complete: 3 of 8 datasources switched

to switch: CVS, FIA, Madidi, (NVS), SALVIAS, TEAM, VegBank, (newWorld)

  1. rename the staging table columns
  2. prevent column collisions by prepending the table name to each column name
    1. in each map.csv:
      1. replace text ,*
        with text ,*table--
    2. inputs/$dest/run postprocess # runtime: 1-2 min
    3. commit:
      svn st inputs/$dest/*/test.xml.ref
      # make sure there are no changes
      svn di
      svn ci -m "inputs/$dest/: prepended the table name to each column name to prevent column collisions, using the steps at http://vegbiendev.nceas.ucsb.edu/wiki/Left-joining_a_datasource"
      
    4. on vegbiendev:
      svn up
      inputs/$dest/run postprocess # runtime: 4-6 min
      
  3. add flattened view
    1. add subdir:
      make inputs/$dest/'taxon_observation.**'/add
      "cp" -f inputs/FIA/'taxon_observation.**'/{run,postprocess.sql} inputs/$dest/'taxon_observation.**'/
      echo 'taxon_observation.**' >>inputs/$dest/import_order.txt
      
    2. edit postprocess.sql:
      1. change the table names to those of the datasource
      2. set the first column in the view to be the row_num (if one exists) or sort_col (which should be a joined table's pkey)
      3. if using a sort_col, remove the 'row_num' argument to mk_subset_by_row_num_func()
    3. inputs/$dest/'taxon_observation.**'/run
      1. fix bugs and repeat until it has a successful exit status
    4. make inputs/$dest/add
    5. commit:
      svn di
      svn ci -m "inputs/$dest/: added taxon_observation.** left-join of the tables, using the steps at http://vegbiendev.nceas.ucsb.edu/wiki/Left-joining_a_datasource"
      
    6. on vegbiendev:
      svn up
      inputs/$dest/'taxon_observation.**'/run
      
  4. prevent joined tables from also being imported (after the left-join above is successful)
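    1. create a blank file named _no_import in each joined table's subdir (every import_order.txt entry except Source and the taxon_observation.** view itself):
      for table in $(grep -vF -e Source -e 'taxon_observation.**' inputs/$dest/import_order.txt); do "cp" -f inputs/FIA/COND/_no_import inputs/$dest/$table/; done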
      make inputs/$dest/add
      
    2. commit:
      svn st
      svn ci -m "inputs/$dest/: don't import joined tables, because they are now imported in the taxon_observation.** left-join instead"
      
    3. on vegbiendev:
      svn up
      inputs/$dest/run # runtime: 4-6 min
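
The text replacement in step 2.1 can be scripted rather than done by hand in an editor. The following is a minimal sketch, not part of the project's tooling: prepend_table is a hypothetical helper, and it assumes that "table" in step 2.1 stands for the name of the table that the map.csv belongs to, and that the table name is safe to interpolate into a sed expression.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project's tooling): prepend a table
# name to every ",*" column reference in one map.csv, as in step 2.1.
# Assumes the table name contains no "/", "&", or backslash characters,
# since it is interpolated into a sed replacement.
prepend_table() {
    table=$1
    map=$2
    # Skip files that were already converted, so a rerun is a no-op.
    if grep -qF ",*${table}--" "$map"; then
        return 0
    fi
    # -i.bak edits in place with both GNU and BSD sed and keeps a backup copy.
    sed -i.bak "s/,\*/,*${table}--/g" "$map"
}

# Example, assuming each map.csv sits in a subdir named after its table:
# for map in inputs/"$dest"/*/map.csv; do
#     prepend_table "$(basename "$(dirname "$map")")" "$map"
# done
```

The helper leaves a map.csv.bak next to each edited file. Check the result with svn di as in step 2.3, then delete the backups before committing.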