# Date Author Comment
1444 03/16/2012 06:25 PM Aaron Marcuse-Kubitza

csvs.py: Added stream_info() to return NamedTuple {header_line, dialect} for later use in cat_csv. Changed reader_and_header() to use stream_info().

1443 03/16/2012 06:23 PM Aaron Marcuse-Kubitza

util.py: Added NamedTuple

1442 03/16/2012 06:04 PM Aaron Marcuse-Kubitza

csvs.py: reader_and_header(): Restrict delimiters to common delimiters so that e.g. letters are not considered delimiters just because they appear frequently
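
The delimiter restriction described in this revision can be sketched with the standard library's `csv.Sniffer`, whose `sniff()` method accepts a `delimiters` argument. This is a minimal illustration in Python 3, not the code from csvs.py: the `COMMON_DELIMITERS` value and the `sniff_dialect()` helper are hypothetical names chosen for the example.

```python
import csv

# Assumed set of "common" delimiters; the list actually used in csvs.py is not shown in the log.
COMMON_DELIMITERS = ",\t;|"

def sniff_dialect(header_line):
    """Guess the CSV dialect of a header line, considering only common delimiters."""
    # Passing delimiters= keeps frequent letters from being chosen as the delimiter.
    return csv.Sniffer().sniff(header_line, delimiters=COMMON_DELIMITERS)

dialect = sniff_dialect("id\tname\tlocality\n")
print(repr(dialect.delimiter))
```

Without the `delimiters` argument, the sniffer is free to consider any character that occurs consistently in the sample, which is the failure mode the commit message describes.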

1441 03/16/2012 05:38 PM Aaron Marcuse-Kubitza

Renamed inputs/NYBG to inputs/NY to match herbarium code

1440 03/16/2012 05:35 PM Aaron Marcuse-Kubitza

Renamed inputs/UNC-NCSC to inputs/NCU-NCSC to match herbarium code

1439 03/16/2012 05:32 PM Aaron Marcuse-Kubitza

Renamed inputs/UArizona to inputs/ARIZ to match herbarium code

1438 03/16/2012 05:31 PM Aaron Marcuse-Kubitza

Regenerated inputs/MO/maps/src.join.specimens.csv

1437 03/16/2012 05:26 PM Aaron Marcuse-Kubitza

Renamed inputs/MOBOT to inputs/MO to match herbarium code

1436 03/16/2012 05:11 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1435 03/16/2012 05:08 PM Aaron Marcuse-Kubitza

vegbien.sql: taxonoccurrence: Added cultivatedbasis

1434 03/16/2012 05:03 PM Aaron Marcuse-Kubitza

vegbien.sql: Moved all accessioncode fields to the bottom of their tables. vegbien.ERD.mwb: Adjusted lines to remove overlaps.

1433 03/16/2012 04:52 PM Aaron Marcuse-Kubitza

vegbien.sql: taxonoccurrence: Added iscultivated, isnative. Moved accessioncode to bottom.

1432 03/16/2012 04:36 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed taxonoccurrence.growthform type to more specific growthform

1431 03/16/2012 04:34 PM Aaron Marcuse-Kubitza

vegbien.sql: Added growthform and establishmentmeans_dwc enums using values from taxonclass. Documented that taxonclass is growthform + establishmentmeans_dwc + some other values.

1430 03/16/2012 04:22 PM Aaron Marcuse-Kubitza

VegBIEN: Moved aggregateoccurrence.growthform to taxonoccurrence

1429 03/16/2012 04:21 PM Aaron Marcuse-Kubitza

Added inputs/UNC-NCSC/maps/src.join.specimens.csv

1428 03/16/2012 04:15 PM Aaron Marcuse-Kubitza

VegBIEN: Merged aggregateoccurrence.verbatimcollectorname and specimenreplicate.verbatimcollectorname into taxonoccurrence

1427 03/16/2012 03:58 PM Aaron Marcuse-Kubitza

xml_func.py: parse_range(): Handle negative numbers by treating them as not a range
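
One way to read this revision is that a leading "-" marks a negative number rather than a range separator. The sketch below assumes a hypothetical signature and return value (a `(start, end)` tuple with `end` set to `None` when the input is not a range); the actual parse_range() in xml_func.py may differ.

```python
def parse_range(str_, range_sep="-"):
    """Split "start-end" into (start, end); return (str_, None) if not a range.

    Hypothetical sketch: the signature and return shape are assumptions.
    """
    start, sep, end = str_.partition(range_sep)
    start, end = start.strip(), end.strip()
    if not sep or start == "" or end == "":
        # No separator, or nothing on one side of it (e.g. "-5"): not a range.
        return (str_, None)
    return (start, end)

print(parse_range("100-200"))
print(parse_range("-5"))
```

This simple version also treats an input such as "-5-10" as not a range, since it only checks for an empty left-hand side.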

1426 03/16/2012 03:31 PM Aaron Marcuse-Kubitza

Added inputs/UNC-NCSC/test with initial accepted test outputs

1425 03/16/2012 03:31 PM Aaron Marcuse-Kubitza

Added inputs/UNC-NCSC/maps

1424 03/16/2012 03:31 PM Aaron Marcuse-Kubitza

xml_func.py: _replace: Fixed bug where value entry was not unpacked

1423 03/16/2012 12:36 PM Aaron Marcuse-Kubitza

Added inputs/UNC-NCSC

1422 03/15/2012 07:12 PM Aaron Marcuse-Kubitza

Added inputs/MOBOT/test with initial accepted test outputs

1421 03/15/2012 07:11 PM Aaron Marcuse-Kubitza

Added inputs/MOBOT/maps

1420 03/15/2012 06:51 PM Aaron Marcuse-Kubitza

Added inputs/MOBOT

1419 03/15/2012 06:41 PM Aaron Marcuse-Kubitza

VegX mappings: Updated plot place mappings to VegX 1.5.3 method of place type-tagged place names. This removes the userdef fields in plot.

1418 03/15/2012 06:18 PM Aaron Marcuse-Kubitza

VegX mappings: Changed userdef xPosition, yPosition to /relativePlotPosition/relativeX, /relativePlotPosition/relativeY

1417 03/15/2012 06:16 PM Aaron Marcuse-Kubitza

Regenerated mappings/DwC-VegBIEN.specimens.no_empty.csv

1416 03/15/2012 05:36 PM Aaron Marcuse-Kubitza

bin/map: map_table(): wrap_row(): Use util.list_as_length() to handle CSV rows of different lengths

1415 03/15/2012 05:35 PM Aaron Marcuse-Kubitza

util.py: Added list_as_length(). Documented that list_set_length() takes a list, not a tuple. Documented that ListDict must have len(list_) == len(keys).
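
The distinction between the two helpers can be sketched as follows: one resizes a list in place (so it needs a mutable list, not a tuple), and the other returns a resized copy. The bodies below are assumptions made for illustration, including the `fill` parameter and the choice to truncate longer inputs; only the function names come from the log.

```python
def list_set_length(list_, length, fill=None):
    """Resize list_ in place to exactly `length` items (requires a list, not a tuple)."""
    del list_[length:]                             # drop extra items, if any
    list_.extend([fill] * (length - len(list_)))   # pad with fill, if too short

def list_as_length(list_, length, fill=None):
    """Return a new list of exactly `length` items built from any sequence."""
    copy = list(list_)
    list_set_length(copy, length, fill)
    return copy

# A short CSV row padded to the header's length, as wrap_row() might need.
print(list_as_length(("a", "b"), 4))
```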

1414 03/15/2012 05:19 PM Aaron Marcuse-Kubitza

util.py: Added list_set_length(). Changed list_set() to use list_set_length().

1413 03/13/2012 07:48 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Added empty *_id/taxonoccurrence attr to primary keys to ensure that a taxonoccurrence is always created for the specimenreplicate

1412 03/13/2012 07:41 PM Aaron Marcuse-Kubitza

xml_func.py: _label: Use ustr instead of str when checking types

1411 03/13/2012 07:41 PM Aaron Marcuse-Kubitza

csvs.py: Set dialect.doublequote to True because Sniffer doesn't turn this on by default
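
The effect of this setting can be illustrated with the standard `csv` module. This is a sketch in Python 3 with made-up sample data, not the code from csvs.py: when the sniffed sample contains no quoted fields, the sniffer leaves `doublequote` off, so it is switched on explicitly before reading.

```python
import csv
import io

header = "id,remark\n"
dialect = csv.Sniffer().sniff(header, delimiters=",\t")
dialect.doublequote = True  # treat "" inside a quoted field as one literal quote

# Sample data invented for this example.
data = io.StringIO(header + '1,"He said ""hi"""\n')
rows = list(csv.reader(data, dialect))
print(rows[1])
```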

1410 03/13/2012 07:23 PM Aaron Marcuse-Kubitza

Merged inputs/NYBG-CSV into NYBG

1409 03/13/2012 07:16 PM Aaron Marcuse-Kubitza

Merged inputs/UArizona-CSV into UArizona

1408 03/13/2012 07:02 PM Aaron Marcuse-Kubitza

Added inputs/SpeciesLink/test

1407 03/13/2012 07:02 PM Aaron Marcuse-Kubitza

Added inputs/SpeciesLink/maps

1406 03/13/2012 07:02 PM Aaron Marcuse-Kubitza

xml_func.py: range-related funcs: Made inputs optional in case they get set to NULL by _nullIf

1405 03/13/2012 06:48 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added common DwC1 fields that are not part of the official DwC1 schema

1404 03/13/2012 06:31 PM Aaron Marcuse-Kubitza

bin/map: Added support for getting columns with an optional prefix list for DB/CSV inputs

1403 03/13/2012 06:21 PM Aaron Marcuse-Kubitza

bin/map: Factored out code common to DB and CSV inputs into map_table()

1402 03/13/2012 06:00 PM Aaron Marcuse-Kubitza

bin/map: Parse any prefixes in map input column name. They will later be used to check for versions of columns with a prefix added when processing CSV/DB inputs.

1401 03/13/2012 05:58 PM Aaron Marcuse-Kubitza

strings.py: Added split(), remove_prefix(), remove_suffix(), and remove_prefixes(). Added section comments.

1400 03/13/2012 05:06 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: minimumElevationInMeters: Handle embedded ranges using _rangeStart and _rangeEnd

1399 03/13/2012 05:05 PM Aaron Marcuse-Kubitza

xml_func.py: Added _rangeStart and _rangeEnd

1398 03/13/2012 05:04 PM Aaron Marcuse-Kubitza

xpath.py: parse(): Split paths: Raise a SyntaxException if a split path can't be attached because there is no parent element to attach it to

1397 03/13/2012 05:02 PM Aaron Marcuse-Kubitza

Parser.py: Renamed _syntax_err() to syntax_err() to make it a public method

1396 03/13/2012 04:38 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Mapped fieldNotes and taxonRemarks to description using _merge. inputs/UArizona*/maps/DwC.specimens.csv: Mapped Remarks to taxonRemarks, which now has a VegBIEN mapping.

1395 03/13/2012 04:24 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/src with small files that can be under version control

1394 03/13/2012 04:23 PM Aaron Marcuse-Kubitza

input.Makefile: svn_props: Ignore everything in the src/ subdir that hasn't been explicitly checked in

1393 03/13/2012 04:18 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/test with accepted test outputs

1392 03/13/2012 04:18 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/maps

1391 03/13/2012 04:17 PM Aaron Marcuse-Kubitza

Regenerated inputs/UArizona*/maps VegBIEN maps

1390 03/13/2012 04:13 PM Aaron Marcuse-Kubitza

Regenerated mappings/DwC-VegBIEN.specimens.no_empty.csv

1389 03/13/2012 04:09 PM Aaron Marcuse-Kubitza

bin/map: Use new csvs.reader_and_header() to support CSVs/TSVs with dialects other than the default Excel dialect

1388 03/13/2012 04:08 PM Aaron Marcuse-Kubitza

Added csvs.py for CSV I/O such as automatically detecting the dialect based on the header line

1387 03/13/2012 04:07 PM Aaron Marcuse-Kubitza

join: Don't append suffix to empty output mappings, so that they stay empty ("NULL")

1386 03/13/2012 04:00 PM Aaron Marcuse-Kubitza

input.Makefile: Added tsv to $(exts). Strip extra whitespace from $(inputs) so that it's the empty string if $(<in) (and $(<in).header) don't exist, and can be used in $(if ...).

1385 03/12/2012 07:08 PM Aaron Marcuse-Kubitza

input.Makefile: Fixed bug in inputFiles wildcard where extensions were manually listed instead of dynamically determined from the $(exts) config var

1384 03/12/2012 06:56 PM Aaron Marcuse-Kubitza

README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out

1383 03/12/2012 06:55 PM Aaron Marcuse-Kubitza

README.TXT: Tell user to `disown -h 1` after running `make import x%x` so that it won't be sent a SIGHUP if the user logs out

1382 03/12/2012 06:39 PM Aaron Marcuse-Kubitza

input.Makefile: Prepend separate CSV header when available

1381 03/12/2012 06:24 PM Aaron Marcuse-Kubitza

input.Makefile: Use with_cat in map to later support prepending separate CSV headers

1380 03/12/2012 06:21 PM Aaron Marcuse-Kubitza

Added with_cat to run a command, taking input from the concatenation of files

1379 03/12/2012 05:48 PM Aaron Marcuse-Kubitza

input.Makefile: Set mapEnv if $(dbEngine) is set, to eventually support pre-existing DB connections

1378 03/12/2012 05:14 PM Aaron Marcuse-Kubitza

input.Makefile: Changed $(dbFile) to $(dbExport) to make it unambiguous that it refers to a SQL export, not a pre-existing DB, which will be supported later

1377 03/12/2012 05:10 PM Aaron Marcuse-Kubitza

input.Makefile: Added .txt to list of input file extensions

1376 03/12/2012 04:34 PM Aaron Marcuse-Kubitza

Added inputs/SpeciesLink

1375 03/12/2012 03:57 PM Aaron Marcuse-Kubitza

root Makefile: python-Linux: Added pymetrics

1374 03/12/2012 03:54 PM Aaron Marcuse-Kubitza

bin/map: Consider \N to be None

1373 03/12/2012 03:49 PM Aaron Marcuse-Kubitza

util.py: none_if(): Allow multiple none_vals using varargs
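
A varargs version of none_if() might look like the sketch below. The body is an assumption; only the function name and the idea of multiple `none_vals` come from the log.

```python
def none_if(val, *none_vals):
    """Return None if val equals any of none_vals, otherwise return val unchanged."""
    return None if val in none_vals else val

# Map both the empty string and the \N marker to None (values chosen for illustration).
print(none_if(r"\N", "", r"\N"))
print(none_if("Quercus", "", r"\N"))
```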

1372 03/12/2012 03:36 PM Aaron Marcuse-Kubitza

Added inputs/GBIF

1371 03/12/2012 03:28 PM Aaron Marcuse-Kubitza

exc.py: Fixed bug in traceback-saving mechanism that didn't deal with nested Exceptions (such as Exceptions with causes in ExceptionWithCause). Renamed add_exc_info() to add_traceback() since we really only need to store the traceback.

1370 03/12/2012 12:41 PM Aaron Marcuse-Kubitza

dates.py: parse_date_range(): Fixed bug where the date parts were not joined back together into a string for each date range element. Use strings.single_space() after the date has been split into range parts so that whitespace around the range separator is removed instead of being replaced with a single space.

1369 03/12/2012 12:25 PM Aaron Marcuse-Kubitza

xml_func.py: process(): Also catch XML func internal errors to assist in debugging. Use new exc.add_exc_info() to save traceback in case later code throws exception, overwriting exc_info().

1368 03/12/2012 12:23 PM Aaron Marcuse-Kubitza

exc.py: str_(): Add the traceback at the end of the exception string. Added add_exc_info() and get_exc_info() for providing traceback info for str_().

1367 03/11/2012 07:33 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: eventDate, dateIdentified: Use _dateRangeStart and _dateRangeEnd

1366 03/11/2012 07:32 PM Aaron Marcuse-Kubitza

xml_func.py: Added _dateRangeStart and _dateRangeEnd

1365 03/11/2012 07:32 PM Aaron Marcuse-Kubitza

dates.py: Added parse_date_range() and helper funcs could_be_year() and could_be_day()

1364 03/11/2012 07:31 PM Aaron Marcuse-Kubitza

strings.py: Added single_space()
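
Judging by the name and by its later use in parse_date_range(), single_space() plausibly collapses runs of whitespace. The following is a guess at its behavior, not the code from strings.py; in particular, trimming both ends is an assumption.

```python
import re

def single_space(str_):
    """Collapse each run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", str_).strip()

print(repr(single_space("  1 Jan   2012 ")))
```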

1363 03/11/2012 06:12 PM Aaron Marcuse-Kubitza

inputs/UArizona*: Map the ScientificNameAuthor to the binomial instead since it contains the binomial in addition to the authority

1362 03/11/2012 05:28 PM Aaron Marcuse-Kubitza

Added inputs/UArizona-CSV/test

1361 03/11/2012 05:23 PM Aaron Marcuse-Kubitza

input.Makefile: Use .PRECIOUS to save outputs of failed tests so they can be accepted (needed now that .DELETE_ON_ERROR is turned on globally)

1360 03/11/2012 05:14 PM Aaron Marcuse-Kubitza

bin/map: Moved string-cleanup code from get_value() to cleanup(), called by process_row(). process_row() now cleans up the string before checking if it's None, because cleanup() uses none_if() to map "" to None.

1359 03/11/2012 05:12 PM Aaron Marcuse-Kubitza

util.py: Added do_ignore_none()

1358 03/11/2012 04:25 PM Aaron Marcuse-Kubitza

Added inputs/UArizona-CSV/verify

1357 03/11/2012 04:24 PM Aaron Marcuse-Kubitza

Added inputs/UArizona-CSV/maps

1356 03/11/2012 04:23 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Mapped coordinateUncertaintyInMeters to the same place as coordinatePrecision (input sources generally use only one of these columns, which is most likely the accuracy regardless of what it's named)

1355 03/11/2012 04:18 PM Aaron Marcuse-Kubitza

join: In error message when map column names don't match, include the actual column names

1354 03/11/2012 04:17 PM Aaron Marcuse-Kubitza

Makefiles: Added .DELETE_ON_ERROR to delete target if recipe fails

1353 03/11/2012 03:18 PM Aaron Marcuse-Kubitza

VegBIEN mappings: plantnames: Nest taxons hierarchically using plantname.parent_id. Mappings using _forEach: Append a "," to the `in` list so that mappings will sort from shortest to longest `in` list ("]" comes after "," in ASCII, causing this not to happen without the trailing ",").
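
The sort-order remark can be checked directly in Python. The bracketed strings below are simplified stand-ins for the `in` lists (the real mapping syntax is not shown in the log), using lowercase names.

```python
# "," is ASCII 44 and "]" is ASCII 93, so "," sorts before "]".
print(ord(","), ord("]"))

# Without a trailing ",", the longer list sorts first:
# at the first differing position, "," (in "[a,b]") is less than "]" (in "[a]").
without_comma = sorted(["[a]", "[a,b]"])

# With a trailing ",", the shorter list sorts first:
# at the first differing position, "]" (in "[a,]") is less than "b" (in "[a,b,]").
with_comma = sorted(["[a,b,]", "[a,]"])

print(without_comma)
print(with_comma)
```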

1352 03/11/2012 03:14 PM Aaron Marcuse-Kubitza

xpath.py: parse(): _paths(): Remove trailing ","

1351 03/11/2012 02:38 PM Aaron Marcuse-Kubitza

xpath_func.py: _forEach: Made syntax more natural-looking by using values instead of names for string args and attrs instead of branches for array args

1350 03/11/2012 02:36 PM Aaron Marcuse-Kubitza

xpath.py: parse(): Fixed bug in _paths() where an empty list would be parsed as a list containing a single empty path, instead of as an empty list

1349 03/11/2012 01:26 PM Aaron Marcuse-Kubitza

VegBIEN mappings: Place names: Use _forEach to simplify XPaths for recursively nested places

1348 03/11/2012 01:22 PM Aaron Marcuse-Kubitza

bin/map: In debug mode, print output XPaths

1347 03/09/2012 07:51 PM Aaron Marcuse-Kubitza

xpath_func.py: _forEach: Fixed to support _val replacements anywhere, by doing a string-based search-and-replace on a quoted XPath instead of a list-based search-and-replace on an already-parsed XPath

1346 03/09/2012 07:41 PM Aaron Marcuse-Kubitza

xpath_func.py: Renamed _for to _forEach. Finished implementing _forEach.

1345 03/09/2012 07:41 PM Aaron Marcuse-Kubitza

xpath.py: Import xpath_func after defining XpathElem because xpath_func depends on XpathElem and it hasn't yet been factored into a separate file