Activity

From 11/23/2011 to 12/22/2011

12/22/2011

08:30 PM Revision 281: map: Print xml_func.SyntaxExceptions without stack traces by using SystemExit
Aaron Marcuse-Kubitza
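The technique named in Revision 281 can be illustrated with a minimal sketch. It is not the project's actual code: `SyntaxException`, `process()` and `main()` below are hypothetical stand-ins, and the sketch is written for Python 3. The only fact relied on is that raising `SystemExit` with a string argument makes the interpreter print that string to stderr and exit with status 1, without a traceback.

```python
class SyntaxException(Exception):
    """Hypothetical stand-in for xml_func.SyntaxException."""


def process(value):
    # Hypothetical mapping step: convert a string to a number,
    # reporting bad input as a SyntaxException.
    try:
        return float(value)
    except ValueError:
        raise SyntaxException('Invalid number: %r' % value)


def main(argv):
    # Hypothetical entry point: argv[1] is the value to convert.
    try:
        print(process(argv[1]))
    except SyntaxException as e:
        # A string passed to SystemExit is printed to stderr on exit,
        # so the user sees only the message, not a stack trace.
        raise SystemExit('Syntax error: ' + str(e))
```

Calling `main(['map', 'abc'])` at the top level of a script would end the program with only the one-line message, while unexpected exceptions would still produce a full traceback.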
08:22 PM Revision 280: xml_func.py: Add function name to SyntaxException message
Aaron Marcuse-Kubitza
08:22 PM Revision 279: ex.py: Added repl_msg() to format a message with the % operator
Aaron Marcuse-Kubitza
07:48 PM Revision 278: xml_func.py: Return string->number conversion errors as xml_func.SyntaxExceptions
Aaron Marcuse-Kubitza
07:29 PM Revision 277: psql_vegbien: Fixed comment to vegbien instead of vegbank
Aaron Marcuse-Kubitza
07:11 PM Revision 276: psql_vegbien: Use new location of bien_password
Aaron Marcuse-Kubitza
07:05 PM Revision 275: Makefile: Fixed paths to mappings dir for new scripts dir location
Aaron Marcuse-Kubitza
07:02 PM Revision 274: Renamed util to bin
Aaron Marcuse-Kubitza
06:59 PM Revision 273: Moved inputs_Makefile to inputs/input.Makefile
Aaron Marcuse-Kubitza
06:55 PM Revision 272: Moved bien_password to new config dir
Aaron Marcuse-Kubitza
06:52 PM Revision 271: Moved sample inputs to test dir
Aaron Marcuse-Kubitza
06:42 PM Revision 270: Added symlink from scripts to new scripts destination
Aaron Marcuse-Kubitza
06:40 PM Revision 269: Removed now-empty scripts dir
Aaron Marcuse-Kubitza
06:39 PM Revision 268: Moved everything in scripts to root. inputs_Makefile: Don't run "all" when installing.
Aaron Marcuse-Kubitza
06:24 PM Revision 267: Renamed bien_map to map
Aaron Marcuse-Kubitza
06:22 PM Revision 266: Moved map to util
Aaron Marcuse-Kubitza
06:14 PM Revision 265: fix_permissions: Don't chmod symlinks
Aaron Marcuse-Kubitza
06:00 PM Revision 264: inputs_Makefile: Auto-generate map to VegBIEN and import data into vegbien from input DB
Aaron Marcuse-Kubitza
05:59 PM Revision 263: inputs/SALVIAS: maps to VegX and VegBIEN
Aaron Marcuse-Kubitza
05:58 PM Revision 262: bien_map: Runs map with BIEN defaults
Aaron Marcuse-Kubitza
05:56 PM Revision 261: join_sort: Sorts a join on the output col
Aaron Marcuse-Kubitza
03:20 PM Revision 260: inputs_Makefile: Require dbEngine var instead of defaulting to MySQL
Aaron Marcuse-Kubitza
02:22 PM Revision 259: Moved inputs into svn
Aaron Marcuse-Kubitza
01:46 PM Revision 258: Moved pre-BIEN 3 files into _archive folder
Aaron Marcuse-Kubitza
01:46 PM Revision 257: test/map: Use db.sh syntax in *.sh tests
Aaron Marcuse-Kubitza
01:45 PM Revision 256: inputs_Makefile: Generate db.sh with DB access info
Aaron Marcuse-Kubitza
12:59 PM Revision 255: inputs_Makefile: Changed GRANT ALL to GRANT SELECT. Added REVOKE ALL. Added $(db).sql as prerequisite of install in case it needs to be auto-generated.
Aaron Marcuse-Kubitza
12:41 PM Task #302: Make changes to VegBIEN schema
Bob Peet's changes are at [[VegBIEN from VegBank]]
E-mail from Bob Peet on 2011-12-22:
I promised to summarize ...
Aaron Marcuse-Kubitza
12:31 PM Task #310 (Resolved): automated build process
I set up an automated build process for all the BIEN scripts and dependencies. It runs on both vegbiendev and nimoy. ...
Aaron Marcuse-Kubitza
12:31 PM Task #310 (Resolved): automated build process
Aaron Marcuse-Kubitza
12:30 PM Revision 254: scripts/Makefile: Added action for postgres-Darwin target
Aaron Marcuse-Kubitza
12:11 PM Revision 253: scripts/Makefile: Added postgresql to postgres-Linux apt-get packages
Aaron Marcuse-Kubitza

12/21/2011

08:23 PM Revision 252: test/map: Changed to work on both nimoy and vegbiendev by selecting the appropriate MySQL user and password
Aaron Marcuse-Kubitza
08:22 PM Revision 251: env_password: Added optional message arg
Aaron Marcuse-Kubitza
07:47 PM Revision 250: Added uninstallation of inputs to Makefiles
Aaron Marcuse-Kubitza
07:15 PM Revision 249: scripts/Makefile: Create bien user w/o prompting for password. Fixed syntax error.
Aaron Marcuse-Kubitza
07:06 PM Revision 248: scripts/Makefile: Fixed syntax error
Aaron Marcuse-Kubitza
07:03 PM Revision 247: Inputs now include inputs_Makefile to get mysql command, etc.
Aaron Marcuse-Kubitza
06:30 PM Revision 246: scripts/Makefile: Use bien MySQL user for installing inputs
Aaron Marcuse-Kubitza
06:26 PM Revision 245: scripts/Makefile: Use root MySQL user for creating bien user
Aaron Marcuse-Kubitza
05:53 PM Revision 244: scripts/Makefile: Don't use root as MySQL admin user. Removed no-longer-needed sub-makes for setting DB login vars.
Aaron Marcuse-Kubitza
05:10 PM Revision 243: Changed vegbien_dest and users of it to use separate bien_password file
Aaron Marcuse-Kubitza
04:55 PM Revision 242: fix_permissions: Extend all user permissions to group
Aaron Marcuse-Kubitza
04:26 PM Revision 241: scripts/Makefile: Added inputs
Aaron Marcuse-Kubitza
03:24 PM Revision 240: mappings/Makefile: Cleaned up
Aaron Marcuse-Kubitza
03:02 PM Revision 239: fix_permissions: Don't add group write perms to read-only files
Aaron Marcuse-Kubitza
01:54 PM Revision 238: scripts/Makefile: Made read command syntax compatible with /bin/sh
Aaron Marcuse-Kubitza
01:37 PM Revision 237: scripts/Makefile: Ignore errors about missing packages
Aaron Marcuse-Kubitza
01:30 PM Revision 236: scripts/Makefile: Fixed escape character for /bin/sh
Aaron Marcuse-Kubitza
01:25 PM Revision 235: scripts/Makefile: Added dependency installation. Makefiles: Use _not_file instead of FORCE for clarity. Use $(os) var
Aaron Marcuse-Kubitza

12/20/2011

07:52 PM Revision 234: fix_permissions: Configured output verbosity
Aaron Marcuse-Kubitza
07:44 PM Revision 233: Added fix_permissions to set correct permissions on shared bien files
Aaron Marcuse-Kubitza
07:31 PM Revision 232: Removed VegBank scripts which are no longer needed
Aaron Marcuse-Kubitza
07:30 PM Revision 231: Removed VegBank scripts which are no longer needed
Aaron Marcuse-Kubitza
06:55 PM Revision 230: Renamed vegbien_dest.sh to vegbien_dest to reflect that it is also includable by Makefiles
Aaron Marcuse-Kubitza
06:52 PM Revision 229: DB user creation: Clarified instructions
Aaron Marcuse-Kubitza
06:51 PM Revision 228: scripts/Makefile includes vegbien_dest.sh directly
Aaron Marcuse-Kubitza
06:28 PM Revision 227: Moved db user creation to scripts/Makefile. Removed now-unneeded admin scripts.
Aaron Marcuse-Kubitza
06:12 PM Revision 226: Removed VegBank scripts which are no longer needed
Aaron Marcuse-Kubitza
06:11 PM Revision 225: scripts/Makefile: Added empty_db target which uses vegbien_empty.sql
Aaron Marcuse-Kubitza
05:55 PM Revision 224: mappings/Makefile: Fixed bug where var containing prerequisites needed to be defined before use. Added support for different sed flags to use extended regular expressions.
Aaron Marcuse-Kubitza
05:42 PM Revision 223: Added auto-generated vegbien_empty.sql to empty the vegbien db
Aaron Marcuse-Kubitza
04:55 PM Revision 222: Test output to VegBIEN instead of VegBank
Aaron Marcuse-Kubitza
04:54 PM Revision 221: VegX-VegBIEN.organisms.csv: sort output of repl
Aaron Marcuse-Kubitza
04:46 PM Revision 220: review: Added nullglob
Aaron Marcuse-Kubitza
04:44 PM Revision 219: review: Don't process replacements spreadsheets
Aaron Marcuse-Kubitza
04:44 PM Revision 218: Moved schema replacements from VegBank-VegBIEN.csv to VegBank-VegBIEN.schema.csv
Aaron Marcuse-Kubitza
04:32 PM Revision 217: VegBank-VegBIEN.csv: Support PostgreSQL before 8.4
Aaron Marcuse-Kubitza
04:29 PM Revision 216: repl: Added support for blank lines. Only add whole word regexp code to inputs w/o *any* regexp metachars.
Aaron Marcuse-Kubitza
04:13 PM Revision 215: Create vegbien db from mappings/schemas/vegbien.sql
Aaron Marcuse-Kubitza
04:12 PM Revision 214: VegBank-VegBIEN.csv: Added replacements for SQL create script conversion
Aaron Marcuse-Kubitza
03:56 PM Revision 213: Generate vegbien db create SQL from vegbank.sql using repl
Aaron Marcuse-Kubitza
03:55 PM Revision 212: Generate vegbien db create SQL from vegbank.sql using repl
Aaron Marcuse-Kubitza
03:54 PM Revision 211: repl: Fixed bug in reading arguments
Aaron Marcuse-Kubitza
03:38 PM Revision 210: repl: Don't add whole-word regexp for inputs that already have regexp metachars
Aaron Marcuse-Kubitza
03:26 PM Revision 209: Removed mappings/VegBank-VegBIEN.organisms.csv because now using replacements spreadsheet
Aaron Marcuse-Kubitza
03:24 PM Revision 208: mappings: Generate mappings to VegBIEN using VegBank-VegBIEN.csv replacements spreadsheet
Aaron Marcuse-Kubitza
03:23 PM Revision 207: mappings: Generate mappings to VegBIEN using VegBank-VegBIEN.csv replacements spreadsheet
Aaron Marcuse-Kubitza
03:23 PM Revision 206: mappings: Generate mappings to VegBIEN using VegBank-VegBIEN.csv replacements spreadsheet
Aaron Marcuse-Kubitza
02:57 PM Revision 205: Added repl to perform replacements on a spreadsheet or file
Aaron Marcuse-Kubitza
01:38 PM Revision 204: scripts/Makefile: removed extra test-% target
Aaron Marcuse-Kubitza
01:38 PM Revision 203: README.TXT: Updated
Aaron Marcuse-Kubitza
01:36 PM Revision 202: scripts/Makefile: install/uninstall targets
Aaron Marcuse-Kubitza
01:36 PM Revision 201: bien_user_create: Print instructions in sequence with password prompts
Aaron Marcuse-Kubitza

12/19/2011

06:16 PM Revision 200: mappings to VegBIEN: Accounted for remaining ALTER TABLE statements
Aaron Marcuse-Kubitza
05:48 PM Revision 199: Renamed format*_for_review to review and added for_review to make clean
Aaron Marcuse-Kubitza
05:41 PM Revision 198: mappings: Added mappings to VegBIEN
Aaron Marcuse-Kubitza
05:25 PM Revision 197: mappings/Makefile: Simplified
Aaron Marcuse-Kubitza
05:22 PM Revision 196: mappings/Makefile: Simplified
Aaron Marcuse-Kubitza
04:59 PM Revision 195: README.TXT: Updated
Aaron Marcuse-Kubitza
04:47 PM Revision 194: Added vegbien DB admin scripts
Aaron Marcuse-Kubitza

12/16/2011

03:54 PM Revision 193: join_all_vegbank: Sort output by both columns
Aaron Marcuse-Kubitza
03:48 PM Revision 192: mappings/VegX-VegBank.organisms.csv: Sorted by both columns
Aaron Marcuse-Kubitza
03:48 PM Revision 191: mappings/Makefile: Sort VegBank-VegBIEN.organisms.csv by both columns
Aaron Marcuse-Kubitza
03:47 PM Revision 190: extract_plot_map: Removed because functionality now in Makefile
Aaron Marcuse-Kubitza
03:46 PM Revision 189: sort: Can sort on multiple columns
Aaron Marcuse-Kubitza
03:17 PM Revision 188: Added basic VegX-VegBIEN mapping
Aaron Marcuse-Kubitza
02:02 PM Revision 187: Added union and join_passthru
Aaron Marcuse-Kubitza
02:02 PM Revision 186: env_password: Print Usage message to stderr
Aaron Marcuse-Kubitza
01:29 PM Revision 185: test/map: Create output dir if it doesn't exist
Aaron Marcuse-Kubitza
01:24 PM Revision 184: Converted scripts back to bash that required bash-specific features
Aaron Marcuse-Kubitza
01:23 PM Revision 183: Converted scripts back to bash that required bash-specific features
Aaron Marcuse-Kubitza
01:13 PM Revision 182: Fixed test/map to work with sh
Aaron Marcuse-Kubitza
01:05 PM Revision 181: Replaced /bin/bash with /bin/sh
Aaron Marcuse-Kubitza
12:31 PM Task #300: TurboVeg data
Mike Lee's mapping is on nimoy under @/home/bien_shared/raw_data/turboveg/DBASEDIC_rkp2011_mtl2011.xlsx@
Aaron Marcuse-Kubitza
12:21 PM Task #309 (Rejected): mapping and export utility from VegBank to VegX
Ideally, what I have in mind is mapping and export utility from VegBank to VegX. Of course this means more work up fr...
Aaron Marcuse-Kubitza
12:20 PM Task #308 (Resolved): do a direct transfer of some public data from VegBank
Even higher priority, do you think you could set us up to do a direct transfer of some public data from VegBank? At t...
Aaron Marcuse-Kubitza

12/15/2011

04:16 PM Revision 180: join: Added usage item for repeated joins
Aaron Marcuse-Kubitza
04:13 PM Revision 179: join: Changed order of args and redirects to be more intuitive
Aaron Marcuse-Kubitza
04:09 PM Revision 178: Renamed ch_map_root to ch_root
Aaron Marcuse-Kubitza
04:08 PM Revision 177: Renamed join_maps to join
Aaron Marcuse-Kubitza
03:42 PM Task #307 (Resolved): Acquire additional specimen data sets in both DwC and DwCA format, esp. GBIF
Aaron Marcuse-Kubitza
03:42 PM Task #306 (Resolved): Acquire additional plot data sets from providers willing to work with Aaron on mappings and validations
Bob: TurboVeg; Brad: RAINFOR, CTFS
Aaron Marcuse-Kubitza
03:42 PM Task #305 (Resolved): Complete full-dataset validations for NYBG & SALVIAS
Aaron Marcuse-Kubitza
03:41 PM Task #304 (Resolved): Complete full dataset imports to VegBIEN via VegX of NYBG and SALVIAS
* Identify and make changes to VegX needed to enable full-dataset imports
* Or report changes needed to Nick, Miquel...
Aaron Marcuse-Kubitza
03:41 PM Task #303 (New): Mapping from VegBIEN to original VegBank
the latter to be used as web schema for BIEN web interface
Aaron Marcuse-Kubitza
03:40 PM Task #302 (Resolved): Make changes to VegBIEN schema
Aaron Marcuse-Kubitza
02:49 PM Task #294: find plot data source provider to work with Aaron
Brad has contacted two data source providers (RAINFOR, CTFS) regarding working with Aaron to develop mappings
Brad Boyle
01:55 PM Task #285 (Resolved): CSV to XML mappings for NYBG, SALVIAS
initial data sources NYBG and SALVIAS have been fully mapped
Aaron Marcuse-Kubitza
01:53 PM Task #291 (Resolved): list of milestones
got commented milestones from Martha
Aaron Marcuse-Kubitza
01:03 PM Task #286: CSV-XML-database mapping script
I added support for mapping XML to XML, which will enable us to process NVS's VegX data, and eventually also their in...
Aaron Marcuse-Kubitza
01:03 PM Task #296: Direct mapping from native salvias_plots MySQL database to VegBIEN
I added a new mapping to get SALVIAS data directly from the salvias_plots database on nimoy. You can see the results ...
Aaron Marcuse-Kubitza
12:20 PM Revision 176: Merged test Makefile into main scripts Makefile
Aaron Marcuse-Kubitza
12:05 PM Revision 175: test/map: Turn off test mode (don't run diff) when env var n (for # rows) is set
Aaron Marcuse-Kubitza
11:39 AM Revision 174: Added SALVIAS DB mapping for plots data
Aaron Marcuse-Kubitza
11:37 AM Revision 173: VegX-VegBank mapping: Fixed VegBank XPath for commName.commName field
Aaron Marcuse-Kubitza
11:35 AM Revision 172: db_xml.py: Use pointer target's name as pointer type where possible. Emphasize that pointer type determined from the pointer name itself is a guess based on common database conventions.
Aaron Marcuse-Kubitza
11:31 AM Revision 171: xpath.py: Changed backward (child-to-parent) pointer ID abbr expansion to happen in get() when source node's tag name is known. This deals with XPath elements that are '.' being used as a pointer source.
Aaron Marcuse-Kubitza

12/14/2011

05:31 PM Revision 170: xpath.py: Moved abbr expansion code to separate function
Aaron Marcuse-Kubitza
04:41 PM Revision 169: test/map: Process all tables for a given DB (.sh) input
Aaron Marcuse-Kubitza
03:26 PM Revision 168: Removed /'s from DB input mappings
Aaron Marcuse-Kubitza
02:29 PM Task #299: Mapping from NVS to VegX and VegBIEN
NVS data from Nick Spencer is on nimoy in @/home/bien_shared/raw_data/nvs/@
Aaron Marcuse-Kubitza
02:06 PM Task #299 (Resolved): Mapping from NVS to VegX and VegBIEN
Will require finding someone at NVS willing to work with Aaron on mappings and validations
Aaron Marcuse-Kubitza
02:25 PM Task #300: TurboVeg data
TurboVeg info from Bob Peet is on nimoy in @/home/bien_shared/raw_data/turboveg/@
Aaron Marcuse-Kubitza
02:07 PM Task #300 (New): TurboVeg data
with commitment by someone familiar with DB to work with Aaron on (a) evaluating mappings, and (b) developing validat...
Aaron Marcuse-Kubitza
02:07 PM Task #301 (Resolved): RAINFOR data
with commitment by someone familiar with DB to work with Aaron on (a) evaluating mappings, and (b) developing validat...
Aaron Marcuse-Kubitza
02:06 PM Task #298 (New): Try to find source of DwCA (DwC Archives) data
hoping GBIF will be willing to work with us on this. Possibly approach Remsen directly
Aaron Marcuse-Kubitza
02:05 PM Task #297 (Resolved): Request new data dump of specimen data from GBIF, this time in DwC format
Aaron Marcuse-Kubitza
02:03 PM Task #296 (Resolved): Direct mapping from native salvias_plots MySQL database to VegBIEN
Aaron Marcuse-Kubitza
01:59 PM Task #289: look for formal mapping mechanism
Got NVS mapping tool from Nick Spencer, which is on nimoy in @/home/bien_shared/raw_data/nvs/VegX/@
Aaron Marcuse-Kubitza
01:54 PM Task #286: CSV-XML-database mapping script
Added support for database and XML inputs
Aaron Marcuse-Kubitza
01:46 PM Revision 167: map: Use row's index instead of pkey as ID in XML output
Aaron Marcuse-Kubitza
01:45 PM Revision 166: test/map: Compare via-VegX output to direct output
Aaron Marcuse-Kubitza
01:13 PM Revision 165: xpath.py: Changed order that main and other branches are processed in so it is consistent with the order the branches are specified in the XPath
Aaron Marcuse-Kubitza
12:03 PM Revision 164: map: Handle metadata in order with regular mappings
Aaron Marcuse-Kubitza
11:33 AM Revision 163: Accepted VegBank test output for new CSV mapping order
Aaron Marcuse-Kubitza
11:26 AM Revision 162: map: Changed CSV input to process mappings in the order they are in the spreadsheet, rather than the order of the CSV columns
Aaron Marcuse-Kubitza

12/13/2011

05:18 PM Revision 161: map: Added support for XML input
Aaron Marcuse-Kubitza
05:17 PM Revision 160: Accepted new test output for sorted SALVIAS_db-VegBank mapping
Aaron Marcuse-Kubitza
05:07 PM Revision 159: mappings to VegBank: Sorted by output column to help VegX-VegBank conversion put elements in the same order as source-VegBank
Aaron Marcuse-Kubitza
04:57 PM Revision 158: join_all_vegbank: Updated to sort output maps
Aaron Marcuse-Kubitza
04:57 PM Revision 157: Added script to sort a spreadsheet
Aaron Marcuse-Kubitza
03:51 PM Revision 156: xpath.py: Allowed empty names in XPaths
Aaron Marcuse-Kubitza
03:48 PM Revision 155: xpath.py: Added automatic conversion of strings to paths where needed.
Aaron Marcuse-Kubitza
03:00 PM Revision 154: xpath.py: Added caching of parsed XPaths. Added automatic conversion of strings to paths where needed.
Aaron Marcuse-Kubitza
02:59 PM Revision 153: Added __str__() method to XML nodes
Aaron Marcuse-Kubitza
02:58 PM Revision 152: Fixed VegX-VegBank mapping syntax error
Aaron Marcuse-Kubitza
02:17 PM Revision 151: Added faded beginning of string in Parser syntax errors
Aaron Marcuse-Kubitza
02:01 PM Revision 150: Updated mappings Makefile
Aaron Marcuse-Kubitza
02:00 PM Revision 149: Added Makefiles for scripts and test
Aaron Marcuse-Kubitza
01:54 PM Revision 148: Added mappings Makefile
Aaron Marcuse-Kubitza
12:28 PM Revision 147: Added human-readable SALVIAS_db mappings
Aaron Marcuse-Kubitza
12:28 PM Revision 146: db_xml.get(): Pass limit through to SQL query
Aaron Marcuse-Kubitza
12:11 PM Task #295 (Resolved): provide benchmark queries for NYBG data
Brad Boyle provided NYBG queries, which are on the wiki under [[NYBG tests]]
Aaron Marcuse-Kubitza
12:10 PM Task #290: benchmark tests for database loading
Brad Boyle provided NYBG queries, which are on the wiki under [[NYBG tests]]
Aaron Marcuse-Kubitza
11:57 AM Task #291: list of milestones
updated to-do list
Aaron Marcuse-Kubitza
11:56 AM Task #291: list of milestones
Brad created a timeline, which is on the wiki under [[December 8 2011 WebEx meeting]].
Aaron Marcuse-Kubitza
11:48 AM Revision 145: Regenerated human-readable mappings
Aaron Marcuse-Kubitza

12/12/2011

05:39 PM Revision 144: Fixed documentation for xml_funcs
Aaron Marcuse-Kubitza
05:38 PM Revision 143: Refactored xml_dom.set_value() to avoid needing a doc parameter for the XML document
Aaron Marcuse-Kubitza
05:35 PM Revision 142: xpath.py: Refactored xml_func.py to avoid needing a doc parameter for the XML document
Aaron Marcuse-Kubitza
05:30 PM Revision 141: xpath.py: Refactored to avoid needing a doc parameter for the XML document
Aaron Marcuse-Kubitza
04:41 PM Revision 140: Fixed DB input to ignore NULL values
Aaron Marcuse-Kubitza
04:27 PM Revision 139: xml_dom.py: Changed all uses of name_of(node) to node.tagName
Aaron Marcuse-Kubitza
04:23 PM Revision 138: Made XML node names case-sensitive
Aaron Marcuse-Kubitza
04:20 PM Revision 137: mappings to VegBank: Fixed incorrect mappings found after disabling heuristic search for missing fields
Aaron Marcuse-Kubitza
03:49 PM Revision 136: test/map: Ignore diff exit status
Aaron Marcuse-Kubitza
03:34 PM Revision 135: map: Implemented DB input support for querying a single table
Aaron Marcuse-Kubitza

12/09/2011

05:36 PM Revision 134: Added SALVIAS_db test accepted output
Aaron Marcuse-Kubitza
05:35 PM Revision 133: map: Continued to add DB input support
Aaron Marcuse-Kubitza
04:54 PM Revision 132: test/map: Echo command used to import db config
Aaron Marcuse-Kubitza
04:02 PM Revision 131: Added support for multiple database engines. Changed SALVIAS_db input to use user-entered password.
Aaron Marcuse-Kubitza
01:58 PM Revision 130: map: Allow db config vars to be optional. SALVIAS_db test: Changed to use salvias_plots and XPath mapping syntax.
Aaron Marcuse-Kubitza
01:32 PM Revision 129: Renamed SALVIAS_db test input to use organisms table
Aaron Marcuse-Kubitza
01:29 PM Revision 128: Re-committed accepted_outputs
Aaron Marcuse-Kubitza
01:23 PM Revision 127: Renamed test/map output to remove CSV/DB indicator because that is now specified in the datasource name
Aaron Marcuse-Kubitza
01:18 PM Revision 126: map: Started adding database get by XPath functionality
Aaron Marcuse-Kubitza

12/08/2011

06:48 PM Revision 125: format_for_review: Fixed bug where Comments column would be reformatted in addition to mappings columns
Aaron Marcuse-Kubitza
05:38 PM Task #285: CSV to XML mappings for NYBG, SALVIAS
To make it easier to review the mappings, I created human-readable versions "in Subversion":https://projects.nceas.uc...
Aaron Marcuse-Kubitza
05:35 PM Revision 124: Regenerated human-readable mappings
Aaron Marcuse-Kubitza
05:15 PM Revision 123: Added human-readable versions of mappings and scripts to generate them
Aaron Marcuse-Kubitza
05:14 PM Revision 122: VegX-VegBank mapping: Removed a duplicated mapping
Aaron Marcuse-Kubitza
05:13 PM Revision 121: NYBG-VegX mapping: Added conference call feedback
Aaron Marcuse-Kubitza
03:12 PM Task #295 (Resolved): provide benchmark queries for NYBG data
Aaron Marcuse-Kubitza
03:11 PM Task #294 (Resolved): find plot data source provider to work with Aaron
Aaron Marcuse-Kubitza
01:48 PM Revision 120: Added Comments column with Brad's and Aaron's comments to mapping spreadsheets
Aaron Marcuse-Kubitza
01:07 PM Task #290: benchmark tests for database loading
Brad Boyle provided SALVIAS queries, which are on the wiki under [[SALVIAS tests]]
Aaron Marcuse-Kubitza

12/07/2011

05:01 PM Revision 119: Added stub for SALVIAS database test
Aaron Marcuse-Kubitza
05:00 PM Revision 118: test/map: Added support for database input
Aaron Marcuse-Kubitza
04:14 PM Revision 117: Preparing map to input from DB
Aaron Marcuse-Kubitza
04:05 PM Task #288: VegX-VegBank mapping
If you would like to browse the @vegbank@ database on nimoy, you can now use "phpPgAdmin":http://bien.nceas.ucsb.edu/...
Aaron Marcuse-Kubitza
03:47 PM Task #293: mapping inversion script
updated transformations
Aaron Marcuse-Kubitza
01:43 PM Task #293: mapping inversion script
suggested name and location: @svn/scripts/util/invert_map@
Aaron Marcuse-Kubitza
01:41 PM Task #293 (New): mapping inversion script
A Python script to invert a mapping spreadsheet. This will be useful for mapping VegBank to VegX, so that we can just...
Aaron Marcuse-Kubitza
03:32 PM Revision 116: Started preparing map to input from DB
Aaron Marcuse-Kubitza
02:18 PM Task #289: look for formal mapping mechanism
*"RDF SPARQL":http://en.wikipedia.org/wiki/SPARQL:*
* SELECT-style queries for RDF data
* uses concise Turtle syn...
Aaron Marcuse-Kubitza
02:01 PM Task #289: look for formal mapping mechanism
*"IBM Clio":http://www.almaden.ibm.com/cs/projects/criollo/:*
* "Clio then also interprets these mappings to const...
Aaron Marcuse-Kubitza
01:23 PM Task #289: look for formal mapping mechanism
updated to-do list
Aaron Marcuse-Kubitza
01:11 PM Task #289: look for formal mapping mechanism
*"Bourret's XML-ER mapping":http://rpbourret.com/:*
* *summary: his various mapping methods are already used by Ve...
Aaron Marcuse-Kubitza
12:45 PM Task #289: look for formal mapping mechanism
*XQuery:*
* "XQuery Tutorial":http://www.w3schools.com/xquery/default.asp
** XQuery iterates over XML documents stor...
Aaron Marcuse-Kubitza
12:38 PM Task #289: look for formal mapping mechanism
*Altova XMLSpy's graphical generation of XPaths:*
* *summary: XMLSpy and Oxygen XML both have Copy XPath commands (O...
Aaron Marcuse-Kubitza
01:27 PM Revision 115: xml_func.py: Added optimization to first check if function name starts with _ before looking it up in the table
Aaron Marcuse-Kubitza
12:24 PM Revision 114: Added _alt functions for mappings to VegBank authorPlotCode
Aaron Marcuse-Kubitza
12:17 PM Revision 113: xml_func.py: Added _alt function to choose between alternative values and used it for the collector plantName mapping
Aaron Marcuse-Kubitza
11:54 AM Revision 112: VegX-VegBank mapping: Added mapping from taxonName/Simple (NYBG ScientificName) to collector plantName so that collector plantName will always have a value
Aaron Marcuse-Kubitza
11:27 AM Revision 111: xml_func.py: Added support for decimal years (with day as the fraction)
Aaron Marcuse-Kubitza
11:16 AM Revision 110: test/map: Added echoing of commands run
Aaron Marcuse-Kubitza

12/06/2011

04:34 PM Task #288: VegX-VegBank mapping
A mostly working VegX-VegBank mapping is now available in svn at https://projects.nceas.ucsb.edu/nce...
Aaron Marcuse-Kubitza
04:31 PM Task #287 (Resolved): XML to database conversion script (merged into CSV-XML-database mapping script)
Aaron Marcuse-Kubitza
04:28 PM Task #285: CSV to XML mappings for NYBG, SALVIAS
Working NYBG-VegBank mappings are now available in svn at https://projects.nceas.ucsb.edu/nceas/projects/bien/reposit...
Aaron Marcuse-Kubitza
04:26 PM Task #286: CSV-XML-database mapping script
We are now able to import NYBG data directly into VegBank, using the map2vegbank script on nimoy at @/home/bien_share...
Aaron Marcuse-Kubitza
04:19 PM Revision 109: Added psql_vegbank to easily access vegbank db from the command line
Aaron Marcuse-Kubitza
04:07 PM Revision 108: Ignore OpenOffice lock files in mappings
Aaron Marcuse-Kubitza
04:05 PM Revision 107: Added SALVIAS data CSVs and accepted test output
Aaron Marcuse-Kubitza
03:52 PM Revision 106: test/map: Expanded to include all input CSVs in test/input
Aaron Marcuse-Kubitza
03:31 PM Revision 105: Removed unneeded joins dir
Aaron Marcuse-Kubitza
03:30 PM Revision 104: Moved VegBank mapping joins to main mappings dir so they would have similar paths for the upcoming all-sources tester
Aaron Marcuse-Kubitza
03:11 PM Revision 103: Moved test scripts and files from util to test
Aaron Marcuse-Kubitza
02:50 PM Revision 102: xml_func.py: Added _namepart function for extracting parts of names
Aaron Marcuse-Kubitza
02:11 PM Revision 101: Finished NYBG mapping to VegBank!
Aaron Marcuse-Kubitza
02:04 PM Revision 100: test_map: Added debug option to print VegBank XML instead of importing it into the database
Aaron Marcuse-Kubitza
01:34 PM Revision 99: xpath.py: Created is_positive() function
Aaron Marcuse-Kubitza
01:28 PM Revision 98: Further refinements to mappings to support database constraints
Aaron Marcuse-Kubitza
01:27 PM Revision 97: xpath.py: Added support for negative attribute assertions with !
Aaron Marcuse-Kubitza
10:54 AM Revision 96: Changed mappings to use keys vs. attrs
Aaron Marcuse-Kubitza
10:53 AM Revision 95: xpath.py: Fixed creation of attrs so it happens even when node already exists
Aaron Marcuse-Kubitza
09:59 AM Revision 94: xpath.py: Added concept of keys vs attrs in XPath elem
Aaron Marcuse-Kubitza

12/05/2011

05:25 PM Revision 93: Started filling in required values for VegBank fields in mappings. Will need to refactor to move these to metadata for the datasources.
Aaron Marcuse-Kubitza
05:24 PM Revision 92: Now allow empty rows. Added support for select statement limit.
Aaron Marcuse-Kubitza
04:17 PM Revision 91: Added support for quoted values in XPaths
Aaron Marcuse-Kubitza
04:02 PM Revision 90: Fixed name XML function. Fixed accept_test_output.
Aaron Marcuse-Kubitza
03:59 PM Revision 89: Added support for name XML function. Added error handling for empty rows.
Aaron Marcuse-Kubitza
03:28 PM Revision 88: Made it easier to accept test output
Aaron Marcuse-Kubitza
03:18 PM Revision 87: Added NYBG stemCount metadata
Aaron Marcuse-Kubitza
03:11 PM Revision 86: Added xml_func.py to process mappings whose output needs postprocessing
Aaron Marcuse-Kubitza
01:53 PM Revision 85: Changed VegBank mappings to use XML functions (not implemented yet) to calculate averages and ranges
Aaron Marcuse-Kubitza
01:25 PM Revision 84: Added support for mapping datasource metadata
Aaron Marcuse-Kubitza
12:53 PM Revision 83: Changed for loops to use enumerate() where the index is also needed
Aaron Marcuse-Kubitza
12:50 PM Revision 82: Moved XPath prep code (setting ID, value) to xpath.py
Aaron Marcuse-Kubitza
12:14 PM Task #292: VegBank metadata query mechanism
(Moved to issue description)
Aaron Marcuse-Kubitza
12:13 PM Task #292 (New): VegBank metadata query mechanism
For data discovery of VegBank schema.
Mike Lee's suggestion: (e-mail on 2011-11-9)
I'm wondering if you all tal...
Aaron Marcuse-Kubitza
12:09 PM Task #289: look for formal mapping mechanism
Mike Lee's explanation of the VegBank XML serialization format: (e-mail on 2011-12-2)
My recollection is that our in...
Aaron Marcuse-Kubitza

12/02/2011

05:27 PM Revision 81: xpath.py: Added deepcopy() before setting value of other branches to traverse
Aaron Marcuse-Kubitza
05:12 PM Revision 80: NYSpecimenDataAmericas.test.xml: Updated for new NYBG-VegX.organisms.csv
Aaron Marcuse-Kubitza
05:11 PM Revision 79: NYBG-VegX.organisms.csv: Changed voucher (primary key) column to be UniqueNYInternalRecordNumber because CatalogNumber contained an empty value
Aaron Marcuse-Kubitza
05:10 PM Revision 78: xpath.py: Added basic support for split paths
Aaron Marcuse-Kubitza
04:30 PM Revision 77: Merged xml_xpath.py into xpath.py in preparation for changing the XPath parse tree to be the XML DOM tree itself
Aaron Marcuse-Kubitza
03:58 PM Revision 76: Refactored xpath.parse() to use a nested function instead of a class extending Parser
Aaron Marcuse-Kubitza
03:04 PM Revision 75: map: Fixed mislocated import for Parser.SyntaxException
Aaron Marcuse-Kubitza
02:21 PM Revision 74: Removed SALVIAS voucher_string mapping per conference call discussion
Aaron Marcuse-Kubitza
02:16 PM Revision 73: map: Fixed bugs to enable mapping straight from CSV to a database. Still need a way to set plot.authorPlotCode for specimens data.
Aaron Marcuse-Kubitza
12:05 PM Revision 72: Fixed ch_map_root to support subpaths which follow the root by -> rather than /. Changed spreadsheet syntax to have : between label and root.
Aaron Marcuse-Kubitza

12/01/2011

04:44 PM Task #291: list of milestones
I put the tasks from today's conference call into the "Redmine issue tracker":https://projects.nceas.ucsb.edu/nceas/p...
Aaron Marcuse-Kubitza
04:35 PM Task #291 (Resolved): list of milestones
*Conference call:*
* -need list of milestones for the next 6-12 months-
* -*add conference call tasks to Redmine ...
Aaron Marcuse-Kubitza
04:34 PM Task #290 (Resolved): benchmark tests for database loading
*Conference call:*
* *develop benchmark tests to check that datasource data was inserted correctly into VegBank*
...
Aaron Marcuse-Kubitza
04:32 PM Task #289 (Resolved): look for formal mapping mechanism
*Conference call:*
* look into VegBranch's way of capturing mappings and metadata
* -look into Altova XMLSpy's gr...
Aaron Marcuse-Kubitza
04:29 PM Task #285: CSV to XML mappings for NYBG, SALVIAS
*Conference call:*
* Ignore SALVIAS @voucher_string@ because it is sometimes missing collector's name
* Ignore SALVI...
Aaron Marcuse-Kubitza
10:50 AM Task #285: CSV to XML mappings for NYBG, SALVIAS
The latest mapping spreadsheets for datasources->VegX and VegX->VegBank are now available in svn at https://projects....
Aaron Marcuse-Kubitza
01:55 PM Revision 71: Updated extract_plot_map to use new name for VegX-VegBank mapping and re-ran it and join_all_vegbank
Aaron Marcuse-Kubitza
01:51 PM Revision 70: Finished VegX-VegBank mapping and created VegBank joins of mappings to VegX
Aaron Marcuse-Kubitza
11:53 AM Revision 69: Finished ch_map_root (renamed from submap)
Aaron Marcuse-Kubitza
10:53 AM Task #286: CSV-XML-database mapping script
I've merged data2xml and xml2db into one script called map, which can be run on nimoy at /home/bien_shared/svn/script...
Aaron Marcuse-Kubitza
10:52 AM Task #288: VegX-VegBank mapping
The mapping spreadsheet for VegX->VegBank is now available in svn at https://projects.nceas.ucsb.edu/nceas/projects/b...
Aaron Marcuse-Kubitza

11/30/2011

05:36 PM Revision 68: Added submap and extract_plot_map to extract plot subpaths from VegX-VegBank.csv
Aaron Marcuse-Kubitza
04:56 PM Revision 67: Moved env usage string creation to opts.py. Changed db config var names to use in/out instead of from/to.
Aaron Marcuse-Kubitza
04:24 PM Revision 66: Keep *.test.xml out of version control
Aaron Marcuse-Kubitza
04:22 PM Revision 65: Moved options-processing code to opts.py: Added opts.py
Aaron Marcuse-Kubitza
04:21 PM Revision 64: Moved options-processing code to opts.py
Aaron Marcuse-Kubitza
04:04 PM Revision 63: test_map: Compares generated XML to correct version
Aaron Marcuse-Kubitza
03:55 PM Revision 62: Fixed xml_xpath.get() last_only optimization to handle attrs correctly. Turned off stack traces for errors intended for the user to see.
Aaron Marcuse-Kubitza
02:32 PM Revision 61: Changed mappings to place prefix common to all XPaths in the column header
Aaron Marcuse-Kubitza
01:40 PM Task #288 (Resolved): VegX-VegBank mapping
CSV spreadsheet mapping VegX to VegBank
Aaron Marcuse-Kubitza
01:37 PM Task #287: XML to database conversion script (merged into CSV-XML-database mapping script)
Merged into CSV-XML-database mapping script
Aaron Marcuse-Kubitza
01:35 PM Task #286: CSV-XML-database mapping script
Merged in XML to database conversion script
Aaron Marcuse-Kubitza
01:31 PM Revision 60: simplify_xpath: Made it case-insensitive
Aaron Marcuse-Kubitza
01:25 PM Revision 59: map: Added support for custom fkeys to parent in db XML trees. Removed extraneous csv reader/writer config because Excel format is default. Improved documentation.
Aaron Marcuse-Kubitza

11/29/2011

05:36 PM Revision 58: map: Added stub for database input
Aaron Marcuse-Kubitza
05:33 PM Revision 57: map: Added more stubs for XML-XML mapping
Aaron Marcuse-Kubitza
05:15 PM Revision 56: Started adding XML-XML mapping support to map
Aaron Marcuse-Kubitza
04:43 PM Revision 55: Split off xpath.py XML functionality into xml_xpath.py
Aaron Marcuse-Kubitza
04:28 PM Revision 54: map: Using SystemExit for usage errors to avoid stack trace
Aaron Marcuse-Kubitza
04:22 PM Revision 53: Merged data2xml and xml2db into map
Aaron Marcuse-Kubitza
03:03 PM Revision 52: Removed trailing whitespace from VegX-VegBank.csv map
Aaron Marcuse-Kubitza
02:59 PM Revision 51: Created join_maps to join two 2-column map spreadsheets
Aaron Marcuse-Kubitza
02:11 PM Revision 50: Renamed mappings to be compatible with Redmine allowed characters in attachment filenames
Aaron Marcuse-Kubitza
01:59 PM Revision 49: Added refactored mappings and changed data2xml to use the new 2-column format
Aaron Marcuse-Kubitza
01:25 PM Revision 48: Refactored db_xml.py's db insertion function to avoid extra nested functions
Aaron Marcuse-Kubitza
01:06 PM Revision 47: Added README.TXT
Aaron Marcuse-Kubitza
01:02 PM Revision 46: Renamed modules to remove _util
Aaron Marcuse-Kubitza
12:47 PM Revision 45: Added svn:ignore for *.pyc
Aaron Marcuse-Kubitza
12:42 PM Revision 44: Renamed xml2db_ and data2xml_ to remove _
Aaron Marcuse-Kubitza
12:42 PM Revision 43: Moved scripts to main directory and associated files to util
Aaron Marcuse-Kubitza
12:31 PM Revision 42: Moved Python modules to shared lib folder
Aaron Marcuse-Kubitza

11/28/2011

05:32 PM Revision 41: xml2db: Started refactoring xml2db() to support getting as well as inserting data
Aaron Marcuse-Kubitza
05:29 PM Revision 40: xml2db: Started refactoring xml2db() to support getting as well as inserting data
Aaron Marcuse-Kubitza
05:05 PM Revision 39: xml2db: Changed to return ID (pkey) of inserted record and use this returned value as parent_id instead of getting the parent_id from the parent XML node
Aaron Marcuse-Kubitza
03:16 PM Revision 38: data2xml: Added syntax for split paths, which map to multiple leaves
Aaron Marcuse-Kubitza
01:52 PM Revision 37: xml2db: Improved empty_db to use TRUNCATE instead of DROP DATABASE. Added xml2vegbank to automatically set db env vars.
Aaron Marcuse-Kubitza
01:51 PM Revision 36: data2xml: Improved syntax for XPath lookahead assertions. Changed XML printing to print multiple text nodes on separate lines.
Aaron Marcuse-Kubitza
12:15 PM Revision 35: Moved vegbank_example_ver1.0.2.xml to xml2db, where it should have been
Aaron Marcuse-Kubitza

11/23/2011

05:39 PM Task #286: CSV-XML-database mapping script
I updated the data2xml script (demo at nimoy:/home/bien_shared/svn/scripts/data2xml/test) to support the pointer form...
Aaron Marcuse-Kubitza
05:37 PM Task #285: CSV to XML mappings for NYBG, SALVIAS
I made a number of changes to the NYBG and SALVIAS mappings to support VegX's concept of pointers between objects.
Aaron Marcuse-Kubitza
05:22 PM Revision 34: data2xml: Small correction to NYBG mapping
Aaron Marcuse-Kubitza
04:58 PM Revision 33: data2xml: Created simplify_xpath script to remove duplication from XPath expressions
Aaron Marcuse-Kubitza
04:15 PM Revision 32: data2xml: Added support for * abbrs for backward (child-to-parent) pointers
Aaron Marcuse-Kubitza
02:52 PM Revision 31: In data2xml, fixed determination of which nesting level to put IDs on
Aaron Marcuse-Kubitza
02:45 PM Revision 30: Simplified expansion of * abbrs
Aaron Marcuse-Kubitza
02:23 PM Revision 29: Removed no longer necessary strip() from node value getter
Aaron Marcuse-Kubitza
02:22 PM Revision 28: Added patch for xml.dom.minidom.Element.writexml to avoid adding extra whitespace around text nodes
Aaron Marcuse-Kubitza
12:45 PM Revision 27: Added pointer field name abbreviations to data2xml and NYBG mappings
Aaron Marcuse-Kubitza
 
