Changes - BIEN 3 - NCEAS Projects

root @ 1444

#	Date	Author	Comment
1444	03/16/2012 06:25 PM	Aaron Marcuse-Kubitza	csvs.py: Added stream_info() to return NamedTuple {header_line, dialect} for later use in cat_csv. Changed reader_and_header() to use stream_info().
1443	03/16/2012 06:23 PM	Aaron Marcuse-Kubitza	util.py: Added NamedTuple
1442	03/16/2012 06:04 PM	Aaron Marcuse-Kubitza	csvs.py: reader_and_header(): Restrict delimiters to common delimiters so that e.g. letters are not considered delimiters just because they appear frequently
1441	03/16/2012 05:38 PM	Aaron Marcuse-Kubitza	Renamed inputs/NYBG to inputs/NY to match herbarium code
1440	03/16/2012 05:35 PM	Aaron Marcuse-Kubitza	Renamed inputs/UNC-NCSC to inputs/NCU-NCSC to match herbarium code
1439	03/16/2012 05:32 PM	Aaron Marcuse-Kubitza	Renamed inputs/UArizona to inputs/ARIZ to match herbarium code
1438	03/16/2012 05:31 PM	Aaron Marcuse-Kubitza	Regenerated inputs/MO/maps/src.join.specimens.csv
1437	03/16/2012 05:26 PM	Aaron Marcuse-Kubitza	Renamed inputs/MOBOT to inputs/MO to match herbarium code
1436	03/16/2012 05:11 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1435	03/16/2012 05:08 PM	Aaron Marcuse-Kubitza	vegbien.sql: taxonoccurrence: Added cultivatedbasis
1434	03/16/2012 05:03 PM	Aaron Marcuse-Kubitza	vegbien.sql: Moved all accessioncode fields to the bottom of their tables. vegbien.ERD.mwb: Adjusted lines to remove overlaps.
1433	03/16/2012 04:52 PM	Aaron Marcuse-Kubitza	vegbien.sql: taxonoccurrence: Added iscultivated, isnative. Moved accessioncode to bottom.
1432	03/16/2012 04:36 PM	Aaron Marcuse-Kubitza	vegbien.sql: Changed taxonoccurrence.growthform type to more specific growthform
1431	03/16/2012 04:34 PM	Aaron Marcuse-Kubitza	vegbien.sql: Added growthform and establishmentmeans_dwc enums using values from taxonclass. Documented that taxonclass is growthform + establishmentmeans_dwc + some other values.
1430	03/16/2012 04:22 PM	Aaron Marcuse-Kubitza	VegBIEN: Moved aggregateoccurrence.growthform to taxonoccurrence
1429	03/16/2012 04:21 PM	Aaron Marcuse-Kubitza	Added inputs/UNC-NCSC/maps/src.join.specimens.csv
1428	03/16/2012 04:15 PM	Aaron Marcuse-Kubitza	VegBIEN: Merged aggregateoccurrence.verbatimcollectorname and specimenreplicate.verbatimcollectorname into taxonoccurrence
1427	03/16/2012 03:58 PM	Aaron Marcuse-Kubitza	xml_func.py: parse_range(): Handle negative numbers by treating them as not a range
1426	03/16/2012 03:31 PM	Aaron Marcuse-Kubitza	Added inputs/UNC-NCSC/test with initial accepted test outputs
1425	03/16/2012 03:31 PM	Aaron Marcuse-Kubitza	Added inputs/UNC-NCSC/maps
1424	03/16/2012 03:31 PM	Aaron Marcuse-Kubitza	xml_func.py: _replace: Fixed bug where value entry was not unpacked
1423	03/16/2012 12:36 PM	Aaron Marcuse-Kubitza	Added inputs/UNC-NCSC
1422	03/15/2012 07:12 PM	Aaron Marcuse-Kubitza	Added inputs/MOBOT/test with initial accepted test outputs
1421	03/15/2012 07:11 PM	Aaron Marcuse-Kubitza	Added inputs/MOBOT/maps
1420	03/15/2012 06:51 PM	Aaron Marcuse-Kubitza	Added inputs/MOBOT
1419	03/15/2012 06:41 PM	Aaron Marcuse-Kubitza	VegX mappings: Updated plot place mappings to VegX 1.5.3 method of place type-tagged place names. This removes the userdef fields in plot.
1418	03/15/2012 06:18 PM	Aaron Marcuse-Kubitza	VegX mappings: Changed userdef xPosition, yPosition to /relativePlotPosition/relativeX, /relativePlotPosition/relativeY
1417	03/15/2012 06:16 PM	Aaron Marcuse-Kubitza	Regenerated mappings/DwC-VegBIEN.specimens.no_empty.csv
1416	03/15/2012 05:36 PM	Aaron Marcuse-Kubitza	bin/map: map_table(): wrap_row(): Use util.list_as_length() to handle CSV rows of different lengths
1415	03/15/2012 05:35 PM	Aaron Marcuse-Kubitza	util.py: Added list_as_length(). Documented that list_set_length() takes a list, not a tuple. Documented that ListDict must have len(list_) == len(keys).
1414	03/15/2012 05:19 PM	Aaron Marcuse-Kubitza	util.py: Added list_set_length(). Changed list_set() to use list_set_length().
1413	03/13/2012 07:48 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Added empty *_id/taxonoccurrence attr to primary keys to ensure that a taxonoccurrence is always created for the specimenreplicate
1412	03/13/2012 07:41 PM	Aaron Marcuse-Kubitza	xml_func.py: _label: Use ustr instead of str when checking types
1411	03/13/2012 07:41 PM	Aaron Marcuse-Kubitza	csvs.py: Set dialect.doublequote to True because Sniffer doesn't turn this on by default
1410	03/13/2012 07:23 PM	Aaron Marcuse-Kubitza	Merged inputs/NYBG-CSV into NYBG
1409	03/13/2012 07:16 PM	Aaron Marcuse-Kubitza	Merged inputs/UArizona-CSV into UArizona
1408	03/13/2012 07:02 PM	Aaron Marcuse-Kubitza	Added inputs/SpeciesLink/test
1407	03/13/2012 07:02 PM	Aaron Marcuse-Kubitza	Added inputs/SpeciesLink/maps
1406	03/13/2012 07:02 PM	Aaron Marcuse-Kubitza	xml_func.py: range-related funcs: Made inputs optional in case they get set to NULL by _nullIf
1405	03/13/2012 06:48 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Added common DwC1 fields that are not part of the official DwC1 schema
1404	03/13/2012 06:31 PM	Aaron Marcuse-Kubitza	bin/map: Added support for getting columns with an optional prefix list for DB/CSV inputs
1403	03/13/2012 06:21 PM	Aaron Marcuse-Kubitza	bin/map: Factored out code common to DB and CSV inputs into map_table()
1402	03/13/2012 06:00 PM	Aaron Marcuse-Kubitza	bin/map: Parse any prefixes in map input column name. They will later be used to check for versions of columns with a prefix added when processing CSV/DB inputs.
1401	03/13/2012 05:58 PM	Aaron Marcuse-Kubitza	strings.py: Added split(), remove_prefix(), remove_suffix(), and remove_prefixes(). Added section comments.
1400	03/13/2012 05:06 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: minimumElevationInMeters: Handle embedded ranges using _rangeStart and _rangeEnd
1399	03/13/2012 05:05 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _rangeStart and _rangeEnd
1398	03/13/2012 05:04 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): Split paths: Raise a SyntaxException if can't attach a split path because there is no parent element to attach to
1397	03/13/2012 05:02 PM	Aaron Marcuse-Kubitza	Parser.py: Renamed _syntax_err() to syntax_err() to make it a public method
1396	03/13/2012 04:38 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped fieldNotes and taxonRemarks to description using _merge. inputs/UArizona*/maps/DwC.specimens.csv: Mapped Remarks to taxonRemarks, which now has a VegBIEN mapping.
1395	03/13/2012 04:24 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/src with small files that can be under version control
1394	03/13/2012 04:23 PM	Aaron Marcuse-Kubitza	input.Makefile: svn_props: Ignore everything in the src/ subdir that hasn't been explicitly checked in
1393	03/13/2012 04:18 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/test with accepted test outputs
1392	03/13/2012 04:18 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/maps
1391	03/13/2012 04:17 PM	Aaron Marcuse-Kubitza	Regenerated inputs/UArizona*/maps VegBIEN maps
1390	03/13/2012 04:13 PM	Aaron Marcuse-Kubitza	Regenerated mappings/DwC-VegBIEN.specimens.no_empty.csv
1389	03/13/2012 04:09 PM	Aaron Marcuse-Kubitza	bin/map: Use new csvs.reader_and_header() to support CSVs/TSVs with other than the default Excel dialect
1388	03/13/2012 04:08 PM	Aaron Marcuse-Kubitza	Added csvs.py for CSV I/O such as automatically detecting the dialect based on the header line
1387	03/13/2012 04:07 PM	Aaron Marcuse-Kubitza	join: Don't append suffix to empty output mappings, so that they stay empty ("NULL")
1386	03/13/2012 04:00 PM	Aaron Marcuse-Kubitza	input.Makefile: Added tsv to $(exts). Strip extra whitespace from $(inputs) so that it's the empty string if $(<in) (and $(<in).header) don't exist, and can be used in $(if ...).
1385	03/12/2012 07:08 PM	Aaron Marcuse-Kubitza	input.Makefile: Fixed bug in inputFiles wildcard where extensions were manually listed instead of dynamically determined from the $(exts) config var
1384	03/12/2012 06:56 PM	Aaron Marcuse-Kubitza	README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out
1383	03/12/2012 06:55 PM	Aaron Marcuse-Kubitza	README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out
1382	03/12/2012 06:39 PM	Aaron Marcuse-Kubitza	input.Makefile: Prepend separate CSV header when available
1381	03/12/2012 06:24 PM	Aaron Marcuse-Kubitza	input.Makefile: Use with_cat in map to later support prepending separate CSV headers
1380	03/12/2012 06:21 PM	Aaron Marcuse-Kubitza	Added with_cat to run a command, taking input from the concatenation of files
1379	03/12/2012 05:48 PM	Aaron Marcuse-Kubitza	input.Makefile: Set mapEnv if $(dbEngine) is set, to eventually support pre-existing DB connections
1378	03/12/2012 05:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Changed $(dbFile) to $(dbExport) to make it unambiguous that it refers to a SQL export, not a pre-existing DB, which will be supported later
1377	03/12/2012 05:10 PM	Aaron Marcuse-Kubitza	input.Makefile: Added .txt to list of input file extensions
1376	03/12/2012 04:34 PM	Aaron Marcuse-Kubitza	Added inputs/SpeciesLink
1375	03/12/2012 03:57 PM	Aaron Marcuse-Kubitza	root Makefile: python-Linux: Added pymetrics
1374	03/12/2012 03:54 PM	Aaron Marcuse-Kubitza	bin/map: Consider \N to be None
1373	03/12/2012 03:49 PM	Aaron Marcuse-Kubitza	util.py: none_if(): Allow multiple none_vals using varargs
1372	03/12/2012 03:36 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF
1371	03/12/2012 03:28 PM	Aaron Marcuse-Kubitza	exc.py: Fixed bug in traceback-saving mechanism that didn't deal with nested Exceptions (such as Exceptions with causes in ExceptionWithCause). Renamed add_exc_info() to add_traceback() since we really only need to store the traceback.
1370	03/12/2012 12:41 PM	Aaron Marcuse-Kubitza	dates.py: parse_date_range(): Fixed bug where the date parts were not joined back together into a string for each date range element. Use strings.single_space() after the date has been split into range parts so that whitespace around the range separator is removed instead of being replaced with a single space.
1369	03/12/2012 12:25 PM	Aaron Marcuse-Kubitza	xml_func.py: process(): Also catch XML func internal errors to assist in debugging. Use new exc.add_exc_info() to save traceback in case later code throws exception, overwriting exc_info().
1368	03/12/2012 12:23 PM	Aaron Marcuse-Kubitza	exc.py: str_(): Add the traceback at the end of the exception string. Added add_exc_info() and get_exc_info() for providing traceback info for str_().
1367	03/11/2012 07:33 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: eventDate, dateIdentified: Use _dateRangeStart and _dateRangeEnd
1366	03/11/2012 07:32 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _dateRangeStart and _dateRangeEnd
1365	03/11/2012 07:32 PM	Aaron Marcuse-Kubitza	dates.py: Added parse_date_range() and helper funcs could_be_year() and could_be_day()
1364	03/11/2012 07:31 PM	Aaron Marcuse-Kubitza	strings.py: Added single_space()
1363	03/11/2012 06:12 PM	Aaron Marcuse-Kubitza	inputs/UArizona*: Map the ScientificNameAuthor to the binomial instead since it contains the binomial in addition to the authority
1362	03/11/2012 05:28 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/test
1361	03/11/2012 05:23 PM	Aaron Marcuse-Kubitza	input.Makefile: Use .PRECIOUS to save outputs of failed tests so they can be accepted (needed now that .DELETE_ON_ERROR is turned on globally)
1360	03/11/2012 05:14 PM	Aaron Marcuse-Kubitza	bin/map: Moved string-cleanup code from get_value() to cleanup(), called by process_row(). process_row() now cleans up the string before checking if it's None, because cleanup() uses none_if() to map "" to None.
1359	03/11/2012 05:12 PM	Aaron Marcuse-Kubitza	util.py: Added do_ignore_none()
1358	03/11/2012 04:25 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/verify
1357	03/11/2012 04:24 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/maps
1356	03/11/2012 04:23 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped coordinateUncertaintyInMeters to the same place as coordinatePrecision (input sources generally use only one of these columns, which is most likely the accuracy regardless of what it's named)
1355	03/11/2012 04:18 PM	Aaron Marcuse-Kubitza	join: In error message when map column names don't match, include the actual column names
1354	03/11/2012 04:17 PM	Aaron Marcuse-Kubitza	Makefiles: Added .DELETE_ON_ERROR to delete target if recipe fails
1353	03/11/2012 03:18 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: plantnames: Nest taxons hierarchically using plantname.parent_id. Mappings using _forEach: Append a "," to the `in` list so that mappings will sort from shortest to longest `in` list ("]" comes after "," in ASCII, causing this not to happen without the trailing ",").
1352	03/11/2012 03:14 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): _paths(): Remove trailing ","
1351	03/11/2012 02:38 PM	Aaron Marcuse-Kubitza	xpath_func.py: _forEach: Made syntax more natural-looking by using values instead of names for string args and attrs instead of branches for array args
1350	03/11/2012 02:36 PM	Aaron Marcuse-Kubitza	xpath.py: parse() Fixed bug in _paths() where empty lists would be parsed as a list containing a single empty path, instead of as an empty list
1349	03/11/2012 01:26 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Place names: Use _forEach to simplify XPaths for recursively nested places
1348	03/11/2012 01:22 PM	Aaron Marcuse-Kubitza	bin/map: In debug mode, print output XPaths
1347	03/09/2012 07:51 PM	Aaron Marcuse-Kubitza	xpath_func.py: _forEach: Fixed to support _val replacements anywhere, by doing a string-based search-and-replace on a quoted XPath instead of a list-based search-and-replace on an already-parsed XPath
1346	03/09/2012 07:41 PM	Aaron Marcuse-Kubitza	xpath_func.py: Renamed _for to _forEach. Finished implementing _forEach.
1345	03/09/2012 07:41 PM	Aaron Marcuse-Kubitza	xpath.py: Import xpath_func after defining XpathElem because xpath_func depends on XpathElem and it hasn't yet been factored into a separate file
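
Revisions 1388, 1411, 1442 and 1444 describe how csvs.py detects a CSV/TSV dialect from the header line. The sketch below illustrates that idea with Python's standard csv module. It is not the repository's code: the function and field names (stream_info, reader_and_header, header_line, dialect) are taken from the log comments, while COMMON_DELIMITERS, its contents, and the function bodies are assumptions made for illustration.

```python
import csv
import io
from typing import NamedTuple

# Assumption: the log only says "common delimiters"; this exact set is a guess.
COMMON_DELIMITERS = ",\t;|"


class StreamInfo(NamedTuple):
    """Header line plus sniffed dialect, as named in r1444."""
    header_line: str
    dialect: type  # csv.Sniffer.sniff() returns a csv.Dialect subclass


def stream_info(stream):
    """Read the header line and sniff the dialect from it."""
    header_line = stream.readline()
    # Restricting the candidates keeps frequent letters from being
    # picked as the delimiter (the problem described in r1442).
    dialect = csv.Sniffer().sniff(header_line, delimiters=COMMON_DELIMITERS)
    # Force doubled-quote escaping on, as described in r1411.
    dialect.doublequote = True
    return StreamInfo(header_line, dialect)


def reader_and_header(stream):
    """Return (reader over the remaining rows, parsed header row)."""
    info = stream_info(stream)
    header = next(csv.reader([info.header_line], info.dialect))
    return csv.reader(stream, info.dialect), header


if __name__ == "__main__":
    sample = io.StringIO('id\tname\n1\t"say ""hi"""\n')
    reader, header = reader_and_header(sample)
    print(header)
    for row in reader:
        print(row)
```

Returning the header line together with the dialect lets a later caller (the log mentions cat_csv) reuse both without re-reading the stream.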
