Changes - BIEN 3 - NCEAS Projects

root @ 1399

#	Date	Author	Comment
1399	03/13/2012 05:05 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _rangeStart and _rangeEnd
1398	03/13/2012 05:04 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): Split paths: Raise a SyntaxException if can't attach a split path because there is no parent element to attach to
1397	03/13/2012 05:02 PM	Aaron Marcuse-Kubitza	Parser.py: Renamed _syntax_err() to syntax_err() to make it a public method
1396	03/13/2012 04:38 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped fieldNotes and taxonRemarks to description using _merge. inputs/UArizona*/maps/DwC.specimens.csv: Mapped Remarks to taxonRemarks, which now has a VegBIEN mapping.
1395	03/13/2012 04:24 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/src with small files that can be under version control
1394	03/13/2012 04:23 PM	Aaron Marcuse-Kubitza	input.Makefile: svn_props: Ignore everything in the src/ subdir that hasn't been explicitly checked in
1393	03/13/2012 04:18 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/test with accepted test outputs
1392	03/13/2012 04:18 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF/maps
1391	03/13/2012 04:17 PM	Aaron Marcuse-Kubitza	Regenerated inputs/UArizona*/maps VegBIEN maps
1390	03/13/2012 04:13 PM	Aaron Marcuse-Kubitza	Regenerated mappings/DwC-VegBIEN.specimens.no_empty.csv
1389	03/13/2012 04:09 PM	Aaron Marcuse-Kubitza	bin/map: Use new csvs.reader_and_header() to support CSVs/TSVs with other than the default Excel dialect
1388	03/13/2012 04:08 PM	Aaron Marcuse-Kubitza	Added csvs.py for CSV I/O such as automatically detecting the dialect based on the header line
1387	03/13/2012 04:07 PM	Aaron Marcuse-Kubitza	join: Don't append suffix to empty output mappings, so that they stay empty ("NULL")
1386	03/13/2012 04:00 PM	Aaron Marcuse-Kubitza	input.Makefile: Added tsv to $(exts). Strip extra whitespace from $(inputs) so that it's the empty string if $(<in) (and $(<in).header) don't exist, and can be used in $(if ...).
1385	03/12/2012 07:08 PM	Aaron Marcuse-Kubitza	input.Makefile: Fixed bug in inputFiles wildcard where extensions were manually listed instead of dynamically determined from the $(exts) config var
1384	03/12/2012 06:56 PM	Aaron Marcuse-Kubitza	README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out
1383	03/12/2012 06:55 PM	Aaron Marcuse-Kubitza	README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out
1382	03/12/2012 06:39 PM	Aaron Marcuse-Kubitza	input.Makefile: Prepend separate CSV header when available
1381	03/12/2012 06:24 PM	Aaron Marcuse-Kubitza	input.Makefile: Use with_cat in map to later support prepending separate CSV headers
1380	03/12/2012 06:21 PM	Aaron Marcuse-Kubitza	Added with_cat to run a command, taking input from the concatenation of files
1379	03/12/2012 05:48 PM	Aaron Marcuse-Kubitza	input.Makefile: Set mapEnv if $(dbEngine) is set, to eventually support pre-existing DB connections
1378	03/12/2012 05:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Changed $(dbFile) to $(dbExport) to make it unambiguous that it refers to a SQL export, not a pre-existing DB, which will be supported later
1377	03/12/2012 05:10 PM	Aaron Marcuse-Kubitza	input.Makefile: Added .txt to list of input file extensions
1376	03/12/2012 04:34 PM	Aaron Marcuse-Kubitza	Added inputs/SpeciesLink
1375	03/12/2012 03:57 PM	Aaron Marcuse-Kubitza	root Makefile: python-Linux: Added pymetrics
1374	03/12/2012 03:54 PM	Aaron Marcuse-Kubitza	bin/map: Consider \N to be None
1373	03/12/2012 03:49 PM	Aaron Marcuse-Kubitza	util.py: none_if(): Allow multiple none_vals using varargs
1372	03/12/2012 03:36 PM	Aaron Marcuse-Kubitza	Added inputs/GBIF
1371	03/12/2012 03:28 PM	Aaron Marcuse-Kubitza	exc.py: Fixed bug in traceback-saving mechanism that didn't deal with nested Exceptions (such as Exceptions with causes in ExceptionWithCause). Renamed add_exc_info() to add_traceback() since we really only need to store the traceback.
1370	03/12/2012 12:41 PM	Aaron Marcuse-Kubitza	dates.py: parse_date_range(): Fixed bug where the date parts were not joined back together into a string for each date range element. Use strings.single_space() after the date has been split into range parts so that whitespace around the range separator is removed instead of being replaced with a single space.
1369	03/12/2012 12:25 PM	Aaron Marcuse-Kubitza	xml_func.py: process(): Also catch XML func internal errors to assist in debugging. Use new exc.add_exc_info() to save traceback in case later code throws exception, overwriting exc_info().
1368	03/12/2012 12:23 PM	Aaron Marcuse-Kubitza	exc.py: str_(): Add the traceback at the end of the exception string. Added add_exc_info() and get_exc_info() for providing traceback info for str_().
1367	03/11/2012 07:33 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: eventDate, dateIdentified: Use _dateRangeStart and _dateRangeEnd
1366	03/11/2012 07:32 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _dateRangeStart and _dateRangeEnd
1365	03/11/2012 07:32 PM	Aaron Marcuse-Kubitza	dates.py: Added parse_date_range() and helper funcs could_be_year() and could_be_day()
1364	03/11/2012 07:31 PM	Aaron Marcuse-Kubitza	strings.py: Added single_space()
1363	03/11/2012 06:12 PM	Aaron Marcuse-Kubitza	inputs/UArizona*: Map the ScientificNameAuthor to the binomial instead since it contains the binomial in addition to the authority
1362	03/11/2012 05:28 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/test
1361	03/11/2012 05:23 PM	Aaron Marcuse-Kubitza	input.Makefile: Use .PRECIOUS to save outputs of failed tests so they can be accepted (needed now that .DELETE_ON_ERROR is turned on globally)
1360	03/11/2012 05:14 PM	Aaron Marcuse-Kubitza	bin/map: Moved string-cleanup code from get_value() to cleanup(), called by process_row(). process_row() now cleans up the string before checking if it's None, because cleanup() uses none_if() to map "" to None.
1359	03/11/2012 05:12 PM	Aaron Marcuse-Kubitza	util.py: Added do_ignore_none()
1358	03/11/2012 04:25 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/verify
1357	03/11/2012 04:24 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona-CSV/maps
1356	03/11/2012 04:23 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Mapped coordinateUncertaintyInMeters to the same place as coordinatePrecision (input sources generally use only one of these columns, which is most likely the accuracy regardless of what it's named)
1355	03/11/2012 04:18 PM	Aaron Marcuse-Kubitza	join: In error message when map column names don't match, include the actual column names
1354	03/11/2012 04:17 PM	Aaron Marcuse-Kubitza	Makefiles: Added .DELETE_ON_ERROR to delete target if recipe fails
1353	03/11/2012 03:18 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: plantnames: Nest taxons hierarchically using plantname.parent_id. Mappings using _forEach: Append a "," to the `in` list so that mappings will sort from shortest to longest `in` list ("]" comes after "," in ASCII, causing this not to happen without the trailing ",").
1352	03/11/2012 03:14 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): _paths(): Remove trailing ","
1351	03/11/2012 02:38 PM	Aaron Marcuse-Kubitza	xpath_func.py: _forEach: Made syntax more natural-looking by using values instead of names for string args and attrs instead of branches for array args
1350	03/11/2012 02:36 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): Fixed bug in _paths() where empty lists would be parsed as a list containing a single empty path, instead of as an empty list
1349	03/11/2012 01:26 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Place names: Use _forEach to simplify XPaths for recursively nested places
1348	03/11/2012 01:22 PM	Aaron Marcuse-Kubitza	bin/map: In debug mode, print output XPaths
1347	03/09/2012 07:51 PM	Aaron Marcuse-Kubitza	xpath_func.py: _forEach: Fixed to support _val replacements anywhere, by doing a string-based search-and-replace on a quoted XPath instead of a list-based search-and-replace on an already-parsed XPath
1346	03/09/2012 07:41 PM	Aaron Marcuse-Kubitza	xpath_func.py: Renamed _for to _forEach. Finished implementing _forEach.
1345	03/09/2012 07:41 PM	Aaron Marcuse-Kubitza	xpath.py: Import xpath_func after defining XpathElem because xpath_func depends on XpathElem and it hasn't yet been factored into a separate file
1344	03/09/2012 07:39 PM	Aaron Marcuse-Kubitza	util.py: Added list_replace()
1343	03/09/2012 07:14 PM	Aaron Marcuse-Kubitza	xpath_func.py: Changed XPath function signature to take arguments (args, path), and process() to parse out the args. Implemented a basic _for that repeats its `do` arg as many times as there are `in` elements.
1342	03/09/2012 06:44 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): Run xpath_func.process() on the parsed XPath
1341	03/09/2012 06:43 PM	Aaron Marcuse-Kubitza	Added xpath_func.py for XPath "function" elements that transform their subpaths
1340	03/09/2012 06:23 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Removed no longer needed taxondetermination.determinationtype values, because they can be determined from the new role closed list
1339	03/09/2012 06:19 PM	Aaron Marcuse-Kubitza	filter_ERD.csv: Removed no longer needed references to role
1338	03/09/2012 06:18 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1337	03/09/2012 06:17 PM	Aaron Marcuse-Kubitza	VegBIEN: Changed role table to a closed list
1336	03/09/2012 06:14 PM	Aaron Marcuse-Kubitza	PostgreSQL-MySQL.csv: custom types: Consider everything except a set of accepted types to be a custom type
1335	03/09/2012 05:40 PM	Aaron Marcuse-Kubitza	VegBIEN: taxonrank enum: Made values lowercase to match case convention in other enums
1334	03/09/2012 05:33 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1333	03/09/2012 05:32 PM	Aaron Marcuse-Kubitza	vegbien.sql: Renamed plantconceptscope to plantnamescope because it's now attached to plantname
1332	03/09/2012 05:26 PM	Aaron Marcuse-Kubitza	vegbien.sql: Moved parent_id from plantconcept to plantname, since plantnames themselves are unique according to their parent taxons (a species under one genus is not the same as a species under another genus)
1331	03/09/2012 05:03 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1330	03/09/2012 04:59 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Fixed lines
1329	03/09/2012 04:57 PM	Aaron Marcuse-Kubitza	vegbien.sql: Moved scope_id from plantconcept to plantname, since plantnames themselves are scoped, not just the plantconcepts that use them (e.g. "sp. 1" has different meanings in different scopes, so it should not be shared between scopes). plantname: Added accessioncode.
1328	03/09/2012 04:38 PM	Aaron Marcuse-Kubitza	vegbien.sql: Moved plantconcept parent_id from plantstatus to plantconcept. plantconcept: Removed datasource-specific fields to make it globally unique (one plantconcept for each assigned parent taxon of a plantname, of which there will usually be just one)
1327	03/09/2012 04:22 PM	Aaron Marcuse-Kubitza	vegbien.sql: plantname: Removed datasource-specific fields to make this a globally-unique table (the datasource-specific fields belong in plantconcept)
1326	03/09/2012 04:16 PM	Aaron Marcuse-Kubitza	Added inputs/UArizona/verify
1325	03/09/2012 04:15 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: Updated for schema changes
1324	03/09/2012 04:06 PM	Aaron Marcuse-Kubitza	vegbien.sql: placerank enum: Added "village"
1323	03/09/2012 04:00 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: lat/long locationdetermination: Removed [!namedplace_id] key so that it's merged into the namedplace locationdetermination
1322	03/09/2012 03:54 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Changed namedplace mappings to use new nested format for storing place containment relationships
1321	03/09/2012 03:44 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _simplifyPath
1320	03/09/2012 03:25 PM	Aaron Marcuse-Kubitza	xpath.py: Added get_1()
1319	03/09/2012 02:50 PM	Aaron Marcuse-Kubitza	vegbien.sql: namedplace: Removed parent_id from unique constraint because some data might be missing intervening links (e.g. state for a county, country), but the place (e.g. county) should still be attached to the existing place of the same name and rank (which will hopefully already have the correct parent_id link)
1318	03/09/2012 02:46 PM	Aaron Marcuse-Kubitza	vegbien.sql: namedplace: Made rank required
1317	03/09/2012 02:33 PM	Aaron Marcuse-Kubitza	vegbien.sql: namedplace: Removed no longer needed placesystem, which has been replaced by rank closed list
1316	03/09/2012 02:30 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Map namedplaces using new rank field
1315	03/09/2012 02:25 PM	Aaron Marcuse-Kubitza	vegbien.sql: namedplace: Added rank. Do duplicate elimination using rank and parent_id instead of placesystem
1314	03/09/2012 02:20 PM	Aaron Marcuse-Kubitza	vegbien.sql: placerank: Standardized names to DwC/GML
1313	03/09/2012 01:06 PM	Aaron Marcuse-Kubitza	vegbien.sql: Added placerank enum
1312	03/09/2012 12:35 PM	Aaron Marcuse-Kubitza	vegbien.sql: namedplace: Removed VegBank internal fields and datasource scoping fields (namedplaces are globally unique). Added parent_id to point to containing namedplace.
1311	03/09/2012 12:21 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _dateRangePart with partial implementation (only works on strings with no range)
1310	03/09/2012 12:20 PM	Aaron Marcuse-Kubitza	DwC mappings: Moved date _date filter outside _alt so it would run only on the string that was actually chosen, and not produce date format errors when a pre-parsed year/month/day is already available
1309	03/08/2012 06:30 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: Map date with only empty fields to NULL (occurs when all fields were e.g. 0 and were filtered to NULL by _nullIf)
1308	03/08/2012 06:00 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: Removed mapping year/month/day of 0 to NULL because that is now handled on a case-by-case basis in the mappings
1307	03/08/2012 05:58 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Map year/month/day of 0 to NULL
1306	03/08/2012 05:13 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/maps/VegX.organisms.csv: Habit: Fixed syntax error in growthForm map
1305	03/08/2012 05:11 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/maps/VegX.organisms.csv: Habit: Removed input values from growthForm map that Brad said were invalid
1304	03/08/2012 05:10 PM	Aaron Marcuse-Kubitza	xml_func.py: _map: Added option to make map a closed list
1303	03/08/2012 04:56 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Fixed waterdepth mappings to use _avg
1302	03/06/2012 06:48 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: Use ORDER BY ... NULLS FIRST to match MySQL
1301	03/06/2012 06:42 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Time the verification since it can take a long time
1300	03/06/2012 06:34 PM	Aaron Marcuse-Kubitza	specimens verification: Added duplicate catalog numbers test
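
Several of the revisions above describe small helpers whose behavior is easier to follow with an example: dialect detection from the header line (r1388, r1389), none_if() with multiple none_vals (r1373, r1374), and single_space() applied to each part of a date range after splitting (r1364, r1370). The following is a minimal sketch of what such helpers might look like. It is not the project's code: the function names none_if, single_space and reader_and_header come from the log, but the signatures, the trimming behavior of single_space, the "-" separator, and the helper split_range are assumptions made for illustration.

```python
import csv
import io
import re


def none_if(val, *none_vals):
    """Return None if val equals any of none_vals, else val (assumed semantics)."""
    return None if val in none_vals else val


def single_space(s):
    """Collapse each run of whitespace to one space; trimming the ends is an assumption."""
    return re.sub(r'\s+', ' ', s.strip())


def split_range(s, sep='-'):
    """Hypothetical helper: split on the range separator first, then clean each
    part, so whitespace around the separator is dropped rather than kept as a space."""
    return [single_space(part) for part in s.split(sep)]


def reader_and_header(stream):
    """Guess the CSV dialect from the header line only, then return
    (reader over the remaining rows, parsed header). Signature is assumed."""
    header_line = stream.readline()
    # csv.Sniffer is the standard-library dialect detector; restricting the
    # candidate delimiters to comma and tab is a choice made for this sketch.
    dialect = csv.Sniffer().sniff(header_line, delimiters=',\t')
    header = next(csv.reader([header_line], dialect))
    return csv.reader(stream, dialect), header


if __name__ == '__main__':
    print(none_if('\\N', '', '\\N'))
    print(split_range('1 Jan  2012 -  5 Jan 2012'))
    rows, header = reader_and_header(io.StringIO('a\tb\tc\n1\t2\t3\n'))
    print(header, list(rows))
```

Sniffing only the header line keeps detection cheap on large inputs, at the cost of failing on single-column files, where csv.Sniffer cannot determine a delimiter and raises csv.Error.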
