Task #286

open

CSV-XML-database mapping script

Added by Aaron Marcuse-Kubitza about 13 years ago. Updated almost 13 years ago.

Status: New
Priority: Normal
Start date: 11/17/2011
Due date:
% Done: 90%
Estimated time:
Activity type:

Description

Python script to map CSV, XML, and database datasources to each other, using a map spreadsheet when needed

Actions #1

Updated by Aaron Marcuse-Kubitza about 13 years ago

  • Description updated (diff)
Actions #2

Updated by Aaron Marcuse-Kubitza about 13 years ago

  • Subject changed from CSV to XML conversion script to CSV/XML to XML conversion script
  • Description updated (diff)
  • % Done changed from 80 to 50
Actions #3

Updated by Aaron Marcuse-Kubitza about 13 years ago

  • % Done changed from 50 to 60

I updated the data2xml script (demo at nimoy:/home/bien_shared/svn/scripts/data2xml/test) to support the pointer format in the updated mappings.

Actions #4

Updated by Aaron Marcuse-Kubitza almost 13 years ago

  • Subject changed from CSV/XML to XML conversion script to CSV-XML-database mapping script
  • Description updated (diff)

Merged in XML to database conversion script

Actions #5

Updated by Aaron Marcuse-Kubitza almost 13 years ago

I've merged data2xml and xml2db into one script called map, which can be run on nimoy at /home/bien_shared/svn/scripts/map . The tester can be run at /home/bien_shared/svn/scripts/util/test_map and outputs VegX to /home/bien_shared/svn/scripts/util/NYSpecimenDataAmericas.test.xml . There is also a wrapper, map2vegbank, which accepts VegBank XML and outputs to an (empty) VegBank database on nimoy.

Usage of map is similar to that of data2xml and xml2db; run map with no arguments to print the usage message. In general, you set database access info in environment variables and pass any mapping spreadsheet as a single command-line argument. The CSV/XML data is read from STDIN and written to STDOUT or a database.
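To illustrate the calling convention described above (settings in environment variables, a mapping spreadsheet as the only argument, data on STDIN and STDOUT), here is a minimal sketch. It is not the actual map script: the two-column mapping layout with no header row, the OUT_DB_ environment variable prefix, and all function names are assumptions made for the example, and only the CSV-to-CSV case is shown.

```python
import csv
import io
import os
import sys


def load_mapping(map_file):
    """Read an assumed two-column mapping: input column name, output column name."""
    return {row[0]: row[1] for row in csv.reader(map_file) if len(row) >= 2}


def map_csv(mapping, in_stream, out_stream):
    """Copy CSV rows, keeping only mapped columns and renaming them."""
    reader = csv.DictReader(in_stream)
    # Preserve the input column order; unmapped columns are dropped.
    fields = [col for col in (reader.fieldnames or []) if col in mapping]
    writer = csv.writer(out_stream, lineterminator="\n")
    writer.writerow([mapping[col] for col in fields])
    for row in reader:
        writer.writerow([row[col] for col in fields])


def db_settings(environ=os.environ):
    """Collect database settings from variables with a hypothetical OUT_DB_ prefix."""
    prefix = "OUT_DB_"
    return {key[len(prefix):].lower(): value
            for key, value in environ.items() if key.startswith(prefix)}


def main(argv):
    """Entry point: argv[1] is the mapping spreadsheet; data flows STDIN to STDOUT."""
    if len(argv) != 2:
        sys.stderr.write("usage: sketch.py MAP_FILE < input.csv > output.csv\n")
        return 2
    with open(argv[1], newline="") as map_file:
        mapping = load_mapping(map_file)
    map_csv(mapping, sys.stdin, sys.stdout)
    return 0
```

Calling main(sys.argv) from a small wrapper would give the command-line behaviour; a real database output path would use db_settings() to open a connection instead of writing to STDOUT.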

Actions #6

Updated by Aaron Marcuse-Kubitza almost 13 years ago

  • % Done changed from 60 to 70

We are now able to import NYBG data directly into VegBank, using the map2vegbank script on nimoy at /home/bien_shared/svn/scripts/map2vegbank . To demo it, run /home/bien_shared/svn/scripts/test/map . I ran it in commit mode, so you can browse the data it loaded by running /home/bien_shared/svn/scripts/util/psql_vegbank .

(Note that test/map currently produces errors about timestamp input syntax, because SALVIAS uses unconventional data formats that I am in the process of writing a conversion function for. Also note that it will say Inserted 0 rows because the data has already been loaded.)

Actions #7

Updated by Aaron Marcuse-Kubitza almost 13 years ago

  • % Done changed from 70 to 80

Added support for database and XML inputs

Actions #8

Updated by Aaron Marcuse-Kubitza almost 13 years ago

  • % Done changed from 80 to 90

I added support for mapping XML to XML, which will enable us to process NVS's VegX data, and eventually also their internal XML data. I also added tests to verify that mapping data straight through to VegBank produces the same results as outputting to VegX first and then mapping the VegX. After some bug fixes, the two paths now produce the same results.

You can run the tester on nimoy with the command: make test --directory=/home/bien_shared/svn/scripts/
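The equivalence check described in this update (direct mapping versus mapping through an intermediate format) can be sketched as follows. This is a simplified illustration using flat field renames, not the project's test suite; the function names and the field names in the usage note are hypothetical.

```python
def apply_mapping(mapping, record):
    """Rename the keys of record according to mapping, dropping unmapped keys."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def compose(first, second):
    """Build the direct mapping equivalent to applying first, then second."""
    return {src: second[mid] for src, mid in first.items() if mid in second}


def paths_agree(direct, to_intermediate, from_intermediate, records):
    """Return True if the direct and two-step paths give identical output for every record."""
    for record in records:
        one_step = apply_mapping(direct, record)
        two_step = apply_mapping(from_intermediate,
                                 apply_mapping(to_intermediate, record))
        if one_step != two_step:
            return False
    return True
```

For example, with to_intermediate = {"species": "TaxonName"} and from_intermediate = {"TaxonName": "plantname"}, the direct mapping compose(to_intermediate, from_intermediate) should satisfy paths_agree for any input records, while a direct mapping that targets a different output field would fail the check.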
