/ - Changes - BIEN 3 - NCEAS Projects

root @ 1570

#	Date	Author	Comment
1570	03/23/2012 03:31 PM	Aaron Marcuse-Kubitza	bin/map: Added section comments to env var config retrieval. Reordered env var config retrieval to put DB config last, since these options are input-type specific and complex, and putting them first hides the more general other options.
1569	03/23/2012 03:29 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.plots.csv: Updated _units for % -> decimal conversion to use new syntax
1568	03/23/2012 03:20 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.plots.csv: Updated _units for % -> decimal conversion to use new syntax
1567	03/23/2012 03:19 PM	Aaron Marcuse-Kubitza	xml_func.py: _units: If value can't be converted to float, wrap the ValueError in a SyntaxException
1566	03/23/2012 03:18 PM	Aaron Marcuse-Kubitza	units.py: convert(): Added support for unit conversions. Added initial unit conversion for % -> unitless. str2quantity(): Fixed regexp to match % as units. Set Quantity.__repr__ to quantity2str.
1565	03/23/2012 03:03 PM	Aaron Marcuse-Kubitza	units.py: convert(): Put "units == None" test after "quantity.units == units" test because a destination of no units might require a conversion for some input units (e.g. % -> unitless requires a division by 100)
1564	03/23/2012 02:51 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.organisms.csv: Habit: Ignore invalid values instead of generating a SyntaxException
1563	03/23/2012 02:47 PM	Aaron Marcuse-Kubitza	xml_dom.py: minidom modifications: Escape as many text strings as we use directly. This still leaves the tagName used by xml.dom.minidom.Element.writexml: It uses 'writer.write(indent+"<" + self.tagName)' and doesn't escape the tagName.
1562	03/23/2012 02:39 PM	Aaron Marcuse-Kubitza	xml_func.py: Made everything Unicode-safe by using strings.ustr instead of str
1561	03/23/2012 12:48 PM	Aaron Marcuse-Kubitza	schemas/tree_cross-links.sql: Added comment for how to get the namedplace trigger from the provided plantname trigger
1560	03/23/2012 12:44 PM	Aaron Marcuse-Kubitza	vegbien.sql: Fixed bug in tree cross-link algorithm where recursion to descendants' ancestors did not use new to refer to the current node's plantname_id
1559	03/23/2012 12:39 PM	Aaron Marcuse-Kubitza	vegbien.sql: Fixed bug in tree cross-link algorithm to also insert ancestors for top-level nodes, because they now need an ancestor entry for themselves
1558	03/23/2012 12:28 PM	Aaron Marcuse-Kubitza	Added separate SQL file for tree cross-links code. A link to this can be e-mailed to people to review.
1557	03/23/2012 12:21 PM	Aaron Marcuse-Kubitza	vegbien.sql: Modified tree cross-link algorithm to add an "ancestor" for this node. This is useful for queries, because you don't have to separately test if the leaf node is the one you're looking for, in addition to that leaf node's ancestors.
1556	03/22/2012 07:08 PM	Aaron Marcuse-Kubitza	README.TXT: Added instructions how to stop all running imports
1555	03/22/2012 06:59 PM	Aaron Marcuse-Kubitza	vegbien.sql: Added namedplace_update_ancestors and plantname_update_ancestors triggers to populate ancestor cross-links in new namedplace_ancestor and plantname_ancestor tables
1554	03/22/2012 06:07 PM	Aaron Marcuse-Kubitza	sql.py: insert() (and try_insert()): Added optional returning param to provide name of an inserted column (usually pkey) to return
1553	03/22/2012 05:41 PM	Aaron Marcuse-Kubitza	env_password: Print Usage message if run without initial "."
1552	03/22/2012 05:34 PM	Aaron Marcuse-Kubitza	Added bin/stop_imports to stop all running imports
1551	03/22/2012 05:33 PM	Aaron Marcuse-Kubitza	import_all: Print Usage message if was run without initial "."
1550	03/22/2012 04:52 PM	Aaron Marcuse-Kubitza	Renamed import-all to import_all to match convention of using underscores
1549	03/22/2012 04:39 PM	Aaron Marcuse-Kubitza	inputs/CTFS: Added remaining non-data src files
1548	03/22/2012 04:35 PM	Aaron Marcuse-Kubitza	Added CTFS data dictionary inputs/CTFS/src/ctfs-comments_worksheet.xls
1547	03/22/2012 04:33 PM	Aaron Marcuse-Kubitza	import-all: Fixed to display the datasource name in the job name instead of 'make ${input}import &'
1546	03/20/2012 11:13 PM	Aaron Marcuse-Kubitza	import-all: disown each new import process to ignore SIGHUP
1545	03/20/2012 11:06 PM	Aaron Marcuse-Kubitza	Added jobspecs to extract jobspecs (%#) from (possibly filtered) `jobs` output
1544	03/20/2012 11:05 PM	Aaron Marcuse-Kubitza	README.TXT: Changed `make import &` to `. bin/import-all`
1543	03/20/2012 11:05 PM	Aaron Marcuse-Kubitza	README.TXT: Changed `make import &` to `. bin/import-all`
1542	03/20/2012 10:39 PM	Aaron Marcuse-Kubitza	main Makefile: import: Before running imports, print message that `. bin/import-all` can be used to import all inputs at once
1541	03/20/2012 10:38 PM	Aaron Marcuse-Kubitza	Added import-all to import all inputs at once
1540	03/20/2012 10:20 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped establishmentMeans, which contains growthform, iscultivated, isnative, etc. combined
1539	03/20/2012 10:11 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS-CSV/maps/VegX.organisms.csv: habit: Updated mapping to match equivalent SALVIAS mapping
1538	03/20/2012 10:10 PM	Aaron Marcuse-Kubitza	xml_func.py: _map: Instead of _closed special entry, make all maps closed by default and open them if special entry "=" is present. Support using a _map to filter values by interpreting special entry "=" as removing all values not explicitly specified, and by interpreting special value "" as keeping input value the same.
1537	03/20/2012 10:08 PM	Aaron Marcuse-Kubitza	xml_func.py: _map: Instead of _closed special entry, make all maps closed by default and open them if special entry "=" is present. Support using a _map to filter values by interpreting special entry "=" as removing all values not explicitly specified, and by interpreting special value "" as keeping input value the same.
1536	03/20/2012 09:19 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: On error "month must be in 1..12", try swapping month and day
1535	03/20/2012 09:13 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: On error "month must be in 1..12", try swapping month and day
1534	03/20/2012 08:36 PM	Aaron Marcuse-Kubitza	row: Support getting multiple rows. Document that does not handle embedded newlines.
1533	03/20/2012 08:19 PM	Aaron Marcuse-Kubitza	mappings/Makefile: Removed no longer needed DwC-VegBIEN.specimens.no_empty.csv
1532	03/20/2012 08:18 PM	Aaron Marcuse-Kubitza	input.Makefile: Removed no longer needed $(join) command
1531	03/20/2012 08:15 PM	Aaron Marcuse-Kubitza	input.Makefile: Removed no longer needed src join maps
1530	03/20/2012 08:12 PM	Aaron Marcuse-Kubitza	input.Makefile: Generate VegBIEN maps from full via maps in order to include all input columns if a src map was provided. This causes the VegBIEN join process to produce all the "No join mapping" errors for that datasource, not just those for fields in the (non-full) via map. maps/src.join.*.csv should no longer be needed for producing "No join mapping" errors.
1529	03/20/2012 08:03 PM	Aaron Marcuse-Kubitza	mappings/Makefile: Generate DwC-VegBIEN.specimens.csv from new intermediate DwC.ci-VegBIEN.specimens.csv using $(removeEmpty) so that "No join mapping" errors will be reported when maps are joined to it. Deprecate DwC-VegBIEN.specimens.no_empty.csv because it's now identical to DwC-VegBIEN.specimens.csv.
1528	03/20/2012 07:45 PM	Aaron Marcuse-Kubitza	Added inputs/NY/maps/src.specimens.csv
1527	03/20/2012 07:41 PM	Aaron Marcuse-Kubitza	Added reverse_join to inner-join two map spreadsheets in the opposite order they are specified in
1526	03/20/2012 07:36 PM	Aaron Marcuse-Kubitza	input.Makefile: Intersect the generated VegBIEN and full via maps with the src map, if it exists. This reduces the size of the autogen maps significantly by including only the entries used by the datasource.
1525	03/20/2012 07:34 PM	Aaron Marcuse-Kubitza	intersect: Compare columns based on specified compare_col_nums, just like subtract
1524	03/20/2012 06:50 PM	Aaron Marcuse-Kubitza	input.Makefile: Use var $(selfMap) instead of spelling out $(bin)/cols 0 0
1523	03/20/2012 06:36 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped continent
1522	03/20/2012 06:20 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/maps/DwC.specimens.csv: Mapped remaining fields
1521	03/20/2012 06:19 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/maps/DwC.specimens.csv: Mapped remaining fields
1520	03/20/2012 06:08 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/maps/src.specimens.csv: Fixed bug where prefixes had not been removed from fields, which prevented join mappings from being found for any of the fields
1519	03/20/2012 06:08 PM	Aaron Marcuse-Kubitza	main Makefile: Added missing_joins to determine which input fields are missing join mappings
1518	03/20/2012 05:47 PM	Aaron Marcuse-Kubitza	xml_func.py: SyntaxException: Inherit from exc.ExceptionWithCause so the traceback will be populated with the cause's traceback instead of the SyntaxException wrapper's traceback
1517	03/20/2012 05:35 PM	Aaron Marcuse-Kubitza	Added inputs/UNCC/test with accepted test outputs
1516	03/20/2012 05:35 PM	Aaron Marcuse-Kubitza	Added inputs/UNCC/maps
1515	03/20/2012 05:34 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: month: Convert month names to numbers before casting everything to int
1514	03/20/2012 05:27 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: Refactored to convert items to dict right away, and use iteritems() for later type conversion. This will enable month names to be converted before casting everything to int.
1513	03/20/2012 04:47 PM	Aaron Marcuse-Kubitza	mappings/Makefile: Sort mappings/DwC.self.specimens.csv so that entries can more easily be found when using it as a DwC terms reference
1512	03/19/2012 09:55 PM	Aaron Marcuse-Kubitza	Added inputs/UNCC
1511	03/19/2012 09:50 PM	Aaron Marcuse-Kubitza	Added inputs/U/test with accepted test outputs
1510	03/19/2012 09:49 PM	Aaron Marcuse-Kubitza	inputs/U/maps/DwC.specimens.csv: Mapped most of the remaining fields
1509	03/19/2012 09:34 PM	Aaron Marcuse-Kubitza	input.Makefile: Clean up via maps when they change by subtracting the via format's self map from the via map (the comments column is ignored in determining which entries are redundant, and empty entries with a matching input column are also removed)
1508	03/19/2012 09:29 PM	Aaron Marcuse-Kubitza	subtract: Fixed bug where entries were removed even if maps were not combinable and ignore was off
1507	03/19/2012 09:27 PM	Aaron Marcuse-Kubitza	union: Fixed bug where combinable was not saved for use in deciding whether to add entries in map 1 that weren't already defined
1506	03/19/2012 09:25 PM	Aaron Marcuse-Kubitza	inputs/U/maps: Set svn props
1505	03/19/2012 09:20 PM	Aaron Marcuse-Kubitza	subtract: Also remove nonexplicit empty mappings whose input col is in map 1
1504	03/19/2012 09:15 PM	Aaron Marcuse-Kubitza	maps.py: Added is_nonexplicit_empty_mapping()
1503	03/19/2012 09:03 PM	Aaron Marcuse-Kubitza	subtract: Use new maps.combinable() to compare column headers, which allows more flexibility in combining maps
1502	03/19/2012 09:01 PM	Aaron Marcuse-Kubitza	union: Use new maps.combinable()
1501	03/19/2012 09:01 PM	Aaron Marcuse-Kubitza	maps.py: Added col_label() and combinable()
1500	03/19/2012 08:54 PM	Aaron Marcuse-Kubitza	union: Use new strings.overlaps()
1499	03/19/2012 08:53 PM	Aaron Marcuse-Kubitza	strings.py: Added overlaps()
1498	03/19/2012 08:46 PM	Aaron Marcuse-Kubitza	vegbien.sql: Fixed syntax error in taxonclass enum: missing comma at end of element
1497	03/19/2012 08:38 PM	Aaron Marcuse-Kubitza	inputs//maps/DwC.specimens.csv: Ran through `cols ` to standardize CSV format to that generated by Python
1496	03/19/2012 08:35 PM	Aaron Marcuse-Kubitza	cols: If column number of "*" given, get all columns
1495	03/19/2012 08:32 PM	Aaron Marcuse-Kubitza	bin/subtract: If no compare columns given, compare on all columns instead of column 0
1494	03/19/2012 08:31 PM	Aaron Marcuse-Kubitza	util.py: list_subset(): Support special idxs value None, which returns entire list
1493	03/19/2012 08:22 PM	Aaron Marcuse-Kubitza	cat_csv: Added support for using - to cat stdin
1492	03/19/2012 08:18 PM	Aaron Marcuse-Kubitza	Added inputs/U/maps
1491	03/19/2012 07:32 PM	Aaron Marcuse-Kubitza	Added inputs/U
1490	03/19/2012 07:29 PM	Aaron Marcuse-Kubitza	Put inputs/REMIB/src/remib_raw.0.header.specimens.txt under version control
1489	03/19/2012 07:24 PM	Aaron Marcuse-Kubitza	Added inputs/REMIB/test with accepted test outputs
1488	03/19/2012 07:22 PM	Aaron Marcuse-Kubitza	Added inputs/REMIB/maps
1487	03/19/2012 07:20 PM	Aaron Marcuse-Kubitza	inputs/NCU-NCSC/maps/DwC.specimens.csv: Removed State->StateProvince mapping because that is now in mappings/DwC1-DwC2.specimens.csv
1486	03/19/2012 07:13 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Added common DwC1 fields that are not part of the official DwC1 schema
1485	03/19/2012 06:51 PM	Aaron Marcuse-Kubitza	Added inputs/REMIB
1484	03/19/2012 06:09 PM	Aaron Marcuse-Kubitza	bin/map: Deal with fields that may be in the dataset under more than one prefix by getting all fields and coalesce()ing them (e.g. SpeciesLink has dwcore* and darwin1* columns for the same DwC field)
1483	03/19/2012 06:06 PM	Aaron Marcuse-Kubitza	util.py: Added coalesce()
1482	03/19/2012 05:40 PM	Aaron Marcuse-Kubitza	xpath_func.py: process(): Fixed bug where XPath elem's other_branches were not also processed
1481	03/19/2012 05:28 PM	Aaron Marcuse-Kubitza	row: Don't prepend header row because this feature prevents the program from being used on a pipeline. Sheets may be constructed in a pipeline if multiple segments need to be joined, e.g. with cat_csv.
1480	03/19/2012 05:09 PM	Aaron Marcuse-Kubitza	Added row to get a row of a spreadsheet, preceded by the header row
1479	03/19/2012 05:09 PM	Aaron Marcuse-Kubitza	bin programs: Fixed bug in Usage message where program name was not printed because unset variable $self was used instead of $0
1478	03/19/2012 05:08 PM	Aaron Marcuse-Kubitza	xml_func.py: _nullIf: types_by_name: Use strings.ustr instead of str to support Unicode values
1477	03/19/2012 04:40 PM	Aaron Marcuse-Kubitza	xml_func.py: _nullIf: If value not convertible, return it, because can't equal null. Refactored to store types by name in a dict instead of using if statements.
1476	03/19/2012 04:31 PM	Aaron Marcuse-Kubitza	units.py: convert(): raise MissingUnitsException if quantity doesn't have units. MissingUnitsException: Take Quantity input instead of str.
1475	03/19/2012 04:27 PM	Aaron Marcuse-Kubitza	inputs/NCU-NCSC/maps/DwC.specimens.csv: "Cultivated?": For clarity, use _map instead of _if to translate boolean to "cultivated". Translate "No" to "wild" (the opposite of "cultivated") to store an explicit not-cultivated as such.
1474	03/19/2012 04:26 PM	Aaron Marcuse-Kubitza	inputs/NCU-NCSC/maps/DwC.specimens.csv: "Cultivated?": For clarity, use _map instead of _if to translate boolean to "cultivated". Translate "No" to "wild" (the opposite of "cultivated") to store an explicit not-cultivated as such.
1473	03/19/2012 04:21 PM	Aaron Marcuse-Kubitza	xml_func.py: _map: empty map entry means None
1472	03/19/2012 04:10 PM	Aaron Marcuse-Kubitza	xml_func.py: _avg: Support empty inputs by returning None. Moved _range after _rangeStart/_rangeEnd since it's less frequently used.
1471	03/19/2012 04:07 PM	Aaron Marcuse-Kubitza	units.py: Restructured to use a Quantity object for the units-tagged value and conversion functions quantity2str() and str2quantity() to convert between that and a raw string. Added convert() with basic support for removing units and passing through matching units. xml_func.py: _units: Added "to" attr. VegBIEN mappings: Remove units using new _units "to" attr instead of temporary workaround in _units.
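
The units.py entries above (r1471, r1476, r1565, r1566) describe a Quantity object, the quantity2str()/str2quantity() converters, and a convert() function whose checks run in a particular order. The following is a minimal sketch of how those pieces might fit together. It is not the repository's actual code: the regexp, the conversion table `_CONVERSIONS`, the string representation of values, and the NotImplementedError fallback are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


class MissingUnitsException(Exception):
    """Raised when a quantity without units must be converted (cf. r1476)."""

    def __init__(self, quantity):
        super().__init__("Quantity has no units: " + quantity.value)
        self.quantity = quantity


@dataclass(repr=False)
class Quantity:
    value: str                   # raw numeric text, e.g. "50"
    units: Optional[str] = None  # e.g. "%", "cm", or None for unitless

    def __repr__(self):
        return quantity2str(self)


# Assumed pattern: a value followed by optional letters or "%" as units.
_QUANTITY_RE = re.compile(r"^\s*(.*?)\s*([A-Za-z%]*)\s*$")


def str2quantity(text):
    value, units = _QUANTITY_RE.match(text).groups()
    return Quantity(value, units or None)


def quantity2str(quantity):
    if quantity.units is None:
        return quantity.value
    return quantity.value + " " + quantity.units


# Hypothetical conversion table keyed by (from_units, to_units).
_CONVERSIONS = {("%", None): lambda v: v / 100}


def convert(quantity, units):
    # 1. Matching units pass through unchanged.
    if quantity.units == units:
        return quantity
    # 2. A quantity with no units cannot be converted to anything else.
    if quantity.units is None:
        raise MissingUnitsException(quantity)
    # 3. Known conversions come before the "units is None" test, because
    #    removing units can still change the value (% -> unitless).
    func = _CONVERSIONS.get((quantity.units, units))
    if func is not None:
        return Quantity(str(func(float(quantity.value))), units)
    # 4. Otherwise, a unitless destination just strips the units.
    if units is None:
        return Quantity(quantity.value, None)
    raise NotImplementedError("unsupported conversion")
```

Under these assumptions, convert(str2quantity("50%"), None) yields a unitless quantity whose value is 50 divided by 100, while convert(str2quantity("12.5 cm"), None) only strips the units. A non-numeric value reaching float() raises ValueError, which r1567 says the caller (_units in xml_func.py) wraps in a SyntaxException.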
