Further refinements to mappings to support database constraints
xpath.py: Added support for negative attribute assertions with !
Changed mappings to use keys vs. attrs
xpath.py: Fixed creation of attrs so it happens even when node already exists
xpath.py: Added concept of keys vs attrs in XPath elem
Started filling in required values for VegBank fields in mappings. Will need to refactor to move these to metadata for the datasources.
Now allow empty rows. Added support for select statement limit.
Added support for quoted values in XPaths
Fixed name XML function. Fixed accept_test_output.
Added support for name XML function. Added error handling for empty rows.
Made it easier to accept test output
Added NYBG stemCount metadata
Added xml_func.py to process mappings whose output needs postprocessing
Changed VegBank mappings to use XML functions (not implemented yet) to calculate averages and ranges
Added support for mapping datasource metadata
Changed for loops to use enumerate() where the index is also needed
Moved XPath prep code (setting ID, value) to xpath.py
xpath.py: Added deepcopy() before setting value of other branches to traverse
NYSpecimenDataAmericas.test.xml: Updated for new NYBG-VegX.organisms.csv
NYBG-VegX.organisms.csv: Changed voucher (primary key) column to be UniqueNYInternalRecordNumber because CatalogNumber contained an empty value
xpath.py: Added basic support for split paths
Merged xml_xpath.py into xpath.py in preparation for changing the XPath parse tree to be the XML DOM tree itself
Refactored xpath.parse() to use a nested function instead of a class extending Parser
map: Fixed mislocated import for Parser.SyntaxException
Removed SALVIAS voucher_string mapping per conference call discussion
map: Fixed bugs to enable mapping straight from CSV to a database. Still need a way to set plot.authorPlotCode for specimens data.
Fixed ch_map_root to support subpaths that are joined to the root by -> rather than /. Changed spreadsheet syntax to have : between label and root.
Updated extract_plot_map to use new name for VegX-VegBank mapping and re-ran it and join_all_vegbank
Finished VegX-VegBank mapping and created VegBank joins of mappings to VegX
Finished ch_map_root (renamed from submap)
Added submap and extract_plot_map to extract plot subpaths from VegX-VegBank.csv
Moved env usage string creation to opts.py. Changed db config var names to use in/out instead of from/to.
Keep *.test.xml out of version control
Moved options-processing code to opts.py: Added opts.py
Moved options-processing code to opts.py
test_map: Compares generated XML to correct version
Fixed xml_xpath.get() last_only optimization to handle attrs correctly. Turned off stack traces for errors intended for the user to see.
Changed mappings to place prefix common to all XPaths in the column header
simplify_xpath: Made it case-insensitive
map: Added support for custom fkeys to parent in db XML trees. Removed extraneous csv reader/writer config because Excel format is default. Improved documentation.
map: Added stub for database input
map: Added more stubs for XML-XML mapping
Started adding XML-XML mapping support to map
Split off xpath.py XML functionality into xml_xpath.py
map: Using SystemExit for usage errors to avoid stack trace
Merged data2xml and xml2db into map
Removed trailing whitespace from VegX-VegBank.csv map
Created join_maps to join two 2-column map spreadsheets
Renamed mappings to be compatible with Redmine allowed characters in attachment filenames
Added refactored mappings and changed data2xml to use the new 2-column format
Refactored db_xml.py's db insertion function to avoid extra nested functions
Added README.TXT
Renamed modules to remove _util
Added svn:ignore for *.pyc
Renamed xml2db_ and data2xml_ to remove _
Moved scripts to main directory and associated files to util
Moved Python modules to shared lib folder
xml2db: Started refactoring xml2db() to support getting as well as inserting data
xml2db: Changed to return ID (pkey) of inserted record and use this returned value as parent_id instead of getting the parent_id from the parent XML node
data2xml: Added syntax for split paths, which map to multiple leaves
xml2db: Improved empty_db to use TRUNCATE instead of DROP DATABASE. Added xml2vegbank to automatically set db env vars.
data2xml: Improved syntax for XPath lookahead assertions. Changed XML printing to print multiple text nodes on separate lines.
Moved vegbank_example_ver1.0.2.xml to xml2db, where it should have been
data2xml: Small correction to NYBG mapping
data2xml: Created simplify_xpath script to remove duplication from XPath expressions
data2xml: Added support for * abbrs for backward (child-to-parent) pointers
In data2xml, fixed determination of which nesting level to put IDs on
Simplified expansion of * abbrs
Removed no-longer-necessary strip() from node value getter
Added patch for xml.dom.minidom.Element.writexml to avoid adding extra whitespace around text nodes
Added pointer field name abbreviations to data2xml and NYBG mappings
In data2xml, fixed pointer handling to deal with pointer targets that are themselves pointers
In data2xml, added shortcut for lookahead assertion using ! symbol
In data2xml, fixed backward (child-to-parent) pointer handling to get and set attribute values properly
In data2xml, fixed xpath.get() to do last_only optimization properly for pointer targets
In data2xml, added support for XPath pointers
Merged data2xml XPath functionality into xpath.py. Merged data2xml xml_dom.py and xml2db xml_util.py into identical xml_util.py for each script.
Added empty_db script to reset the vegbank database after running xml2db/test in commit mode
Changed xml2db and vegbank db to be owned by new user vegbank
Changed xml2db and data2xml to help standardize mapping to different XML formats
Added DROP DATABASE and CREATE DATABASE to vegbank.sql
Changed xml2db to use primarily node contents to determine whether a node is a field or a child table
Changed xml2db to use the first column in a table as its primary key
Changed xml2db to avoid inserting duplicate rows
Initial version of xml2db. Doesn't yet handle all duplicate rows correctly.
Removed .pyc files
Added BIEN 3 scripts
Added ability to change the vegx node names to be different from the postgres table names. This was the easiest way to change the postgres table names when the vegx names are not usable for some reason. This requires that the node names be altered in the xsd...
Adding scripts to transfer data from BIEN2 to VegX schema. Bug fixes to models.py
Some bug fixes. Slight hack in eml-coverage and veg.xsd to get around Django case-insensitivity & VegX entities that only differ by case. Changed default db relationship to one2many from many2many
Committing first working copy of models.py. Other py/pyc files are added/modified just to get them into the repository.
Adding vegx definition. Current models.py building script.
Initial import of Django version of BIEN
Removing needless files.
Initial import for script that converts vegx files into django objects.
No longer need concept hash since values are now pulled directly from tapir service
First Import