/ - Changes - BIEN 3 - NCEAS Projects

root @ 1643

#	Date	Author	Comment
1643	03/27/2012 04:21 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Write each node to a separate output file
1642	03/27/2012 04:00 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Raise InputException instead of AssertionError if invalid metadata row, so that it will be caught and printed instead of aborting the program
1641	03/27/2012 03:56 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Moved header reading code inside TimeoutException try-except block since read sometimes times out before the header is even read
1640	03/27/2012 03:55 PM	Aaron Marcuse-Kubitza	schemas/postgresql.nimoy.conf: Increased shared_buffers to 1.5GB since kernel.shmmax has been increased to 2GB
1639	03/26/2012 11:07 PM	Aaron Marcuse-Kubitza	Renamed inputs/REMIB/src/remib_raw.0.header.specimens.txt to nodes.all.0.header.specimens.csv
1638	03/26/2012 10:57 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Increased read timeout
1637	03/26/2012 10:55 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Timeout stuck reads because sometimes nodes are offline, etc.
1636	03/26/2012 10:53 PM	Aaron Marcuse-Kubitza	exc.py: str_(): Strip trailing whitespace. print_ex(): Since str_() now strips trailing whitespace, strings.ensure_newl() is no longer necessary.
1635	03/26/2012 10:43 PM	Aaron Marcuse-Kubitza	streams.py: Added TimeoutInputStream and WrapStream. Changed StreamIter to use new WrapStream.
1634	03/26/2012 10:42 PM	Aaron Marcuse-Kubitza	Added timeout.py
1633	03/26/2012 10:25 PM	Aaron Marcuse-Kubitza	inputs/REMIB/src/nodes.all.specimens.csv.make: Download from all prefixes of all nodes. Stop when a node produces an empty response (not even an error), which indicates no more nodes. Changed status messages.
1632	03/26/2012 10:17 PM	Aaron Marcuse-Kubitza	input.Makefile: `src/%: src/%.make`: Append stderr to log file
1631	03/26/2012 09:21 PM	Aaron Marcuse-Kubitza	Added inputs/REMIB/src/nodes.all.specimens.csv.make to download REMIB data for all nodes
1630	03/26/2012 09:20 PM	Aaron Marcuse-Kubitza	Added streams.py for I/O, which contains StreamIter, TracedOutputStream, and LineCountOutputStream
1629	03/26/2012 09:20 PM	Aaron Marcuse-Kubitza	term.py: Added clear_line. Corrected file comment.
1628	03/26/2012 08:06 PM	Aaron Marcuse-Kubitza	Makefiles: Let subdir's Makefile decide whether to delete on error
1627	03/26/2012 08:05 PM	Aaron Marcuse-Kubitza	input.Makefile: Save partial outputs of aborted src make scripts
1626	03/26/2012 06:44 PM	Aaron Marcuse-Kubitza	input.Makefile: Fixed bug in `%: %.make` rule to use $< instead of $*
1625	03/26/2012 06:20 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: minimumElevationInMeters: Remove any "ca." prefix
1624	03/26/2012 06:19 PM	Aaron Marcuse-Kubitza	xml_func.py: _replace: Strip whitespace from the returned string
1623	03/26/2012 06:09 PM	Aaron Marcuse-Kubitza	csvs.py: Added TsvReader to support TSV quirks. Added reader_class(). reader_and_header(): Use reader_class() to automatically use TsvReader instead of csv.reader for TSVs. Added is_tsv() and use it where `dialect.delimiter == '\t'` was used.
1622	03/26/2012 06:06 PM	Aaron Marcuse-Kubitza	strings.py: Added extract_line_ending() and remove_line_ending(). ensure_newl(): Use new remove_line_ending(). Moved Parsing section to top since it is used by the other sections.
1621	03/26/2012 04:40 PM	Aaron Marcuse-Kubitza	csvs.py: stream_info(): Set dialect.quoting = csv.QUOTE_NONE for TSVs because they usually don't quote fields. Factored dialect detecting code into new function sniff().
1620	03/26/2012 03:45 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Added reverify option, which can be turned off to prevent regenerating the verify/%.out file from the DB (which can be time-consuming), and instead just diff verify/%.out with verify/%.ref
1619	03/24/2012 10:31 PM	Aaron Marcuse-Kubitza	count_error_rows: Allow input to be specified as last arg(s) in addition to as stdin
1618	03/24/2012 10:30 PM	Aaron Marcuse-Kubitza	exc.py: ExPercentTracker: When displaying fraction of iters that had errors, don't duplicate the iter_text ("row", etc.) in the numerator
1617	03/24/2012 10:27 PM	Aaron Marcuse-Kubitza	bin/map: Use new ExPercentTracker iter_num tracking to track distinct row #s with errors
1616	03/24/2012 10:27 PM	Aaron Marcuse-Kubitza	exc.py: ExPercentTracker: Track iter_nums of Exceptions as well, to distinguish how many distinct iters had errors
1615	03/24/2012 10:10 PM	Aaron Marcuse-Kubitza	Added bin/count_error_rows to count distinct rows with errors in `map` error messages
1614	03/24/2012 09:06 PM	Aaron Marcuse-Kubitza	input.Makefile: Changed "%.out: %.make" rule to "%: %.make" so that any file can be built from a corresponding .make file. This will allow flat files to be retrieved dynamically by running an associated .make file.
1613	03/24/2012 09:01 PM	Aaron Marcuse-Kubitza	xml_func.py: FormatException: Inherit from ExceptionWithCause instead of SyntaxError because a FormatException signals a different kind of error condition (related to the input value rather than the function syntax)
1612	03/24/2012 08:57 PM	Aaron Marcuse-Kubitza	xml_func.py: Renamed SyntaxException to SyntaxError because it's a user error signaling invalid mappings syntax
1611	03/24/2012 08:55 PM	Aaron Marcuse-Kubitza	xml_func.py: SyntaxException: Use ExceptionWithCause to combine msg and cause's msg because it now combines them on one line, which is needed for bin/error_stats to work properly
1610	03/24/2012 08:54 PM	Aaron Marcuse-Kubitza	exc.py: ExceptionWithCause: Prepend msg to cause's msg separated by ': ' instead of '\ncause: '
1609	03/24/2012 08:47 PM	Aaron Marcuse-Kubitza	xml_func.py: Changed SyntaxException to FormatException where the error was with the input data format rather than the mapping syntax
1608	03/24/2012 08:41 PM	Aaron Marcuse-Kubitza	mappings/VegX-VegBIEN.organisms.csv: slopeaspect: Apply new conversion _compass
1607	03/24/2012 08:40 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _compass to convert a compass direction (N, NE, NNE, etc.) into a degree heading
1606	03/24/2012 08:38 PM	Aaron Marcuse-Kubitza	Added angles.py
1605	03/24/2012 07:37 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/maps: Updated to use new TAPIR download
1604	03/24/2012 07:29 PM	Aaron Marcuse-Kubitza	input.Makefile: All targets can be specified with an optional trailing slash. This enables using tab completion to complete a target name which is also a subdir name, since tab completion appends a trailing slash.
1603	03/24/2012 07:23 PM	Aaron Marcuse-Kubitza	bin/tapir/tapir2flat.php: Fixed bug in row assembly where XML elements that weren't found were left out of the array, causing the columns to shift to the left
1602	03/24/2012 07:03 PM	Aaron Marcuse-Kubitza	xml_func.py: _map: Factored replacing code out into new function repl(), which can also be used by other XML funcs
1601	03/24/2012 06:46 PM	Aaron Marcuse-Kubitza	bin/tapir/tapir2flat.php: Turned off exiting after 3 successive failures, because it causes the import to abort and it doesn't seem to restart where it left off
1600	03/24/2012 03:41 PM	Aaron Marcuse-Kubitza	main Makefile: Added instructions to install PHP PEAR and HTTP_Request on Mac OS X
1599	03/24/2012 03:10 PM	Aaron Marcuse-Kubitza	Makefile: Added PHP section, which installs php-http-request
1598	03/24/2012 03:05 PM	Aaron Marcuse-Kubitza	Moved _archive/tapir2flatClient/trunk/client/ to bin/tapir/
1597	03/24/2012 03:03 PM	Aaron Marcuse-Kubitza	_archive/tapir2flatClient/trunk/client/tapir2flat.php: Upgraded to use fputcsv(). This should fix errors caused by embedded delimiters. configurableParams.php: Set default delimiter to ','.
1596	03/24/2012 02:42 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: # species: Don't join at all on genus because DISTINCT is on the plantname_id rather than the plantname, which is already unique for a given genus because plantname_unique includes parent_id
1595	03/24/2012 02:39 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: # species: Fixed to join separately on plantname_ancestor for genus and species
1594	03/24/2012 02:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Moved log and trace files to new import subdir. Moved subdir-adding code from inputs/Makefile to input.Makefile.
1593	03/24/2012 01:49 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: Updated for schema changes
1592	03/24/2012 01:36 PM	Aaron Marcuse-Kubitza	inputs/*: Added any missing standard subdirs
1591	03/24/2012 01:35 PM	Aaron Marcuse-Kubitza	inputs/Makefile: Added %/-add to re-add existing dirs
1590	03/24/2012 01:29 PM	Aaron Marcuse-Kubitza	inputs/Makefile: %-add: `svn mkdir` the datasource's standard subdirs
1589	03/23/2012 06:52 PM	Aaron Marcuse-Kubitza	schemas/postgresql.nimoy.conf: Increased work_mem (for sorting) and maintenance_work_mem (for vacuum)
1588	03/23/2012 06:45 PM	Aaron Marcuse-Kubitza	schemas/postgresql.nimoy.conf: Reset shared_buffers to initial value 24MB because although kernel.shmmax is 32MB, only values up to 26MB seem to work
1587	03/23/2012 06:33 PM	Aaron Marcuse-Kubitza	schemas/postgresql.nimoy.conf: Set shared_buffers to SHMMAX
1586	03/23/2012 06:27 PM	Aaron Marcuse-Kubitza	Optimized schemas/postgresql.nimoy.conf
1585	03/23/2012 06:04 PM	Aaron Marcuse-Kubitza	Added schemas/postgresql.nimoy.conf
1584	03/23/2012 05:59 PM	Aaron Marcuse-Kubitza	bin/map: When profiling, print the profile_to destination file
1583	03/23/2012 05:53 PM	Aaron Marcuse-Kubitza	Added schemas/postgresql.conf
1582	03/23/2012 05:38 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: When converting month name to number, wrap any ValueError in a SyntaxException
1581	03/23/2012 05:33 PM	Aaron Marcuse-Kubitza	xml_func.py: XML functions that assume their last argument is a value (_map, etc.): Use new helper function pop_value() to retrieve this value. Return None if value is None because this indicates the input is empty.
1580	03/23/2012 05:22 PM	Aaron Marcuse-Kubitza	xml_func.py: _date: Use format.str2int instead of int to convert date parts to int so that strange formatting will be parsed correctly
1579	03/23/2012 05:21 PM	Aaron Marcuse-Kubitza	format.py: clean_numeric(): Also fix some OCR errors
1578	03/23/2012 05:15 PM	Aaron Marcuse-Kubitza	filter_errors: Default to outputting only the first match
1577	03/23/2012 04:59 PM	Aaron Marcuse-Kubitza	xpath.py: Added append() to recursively append subpath to every leaf of a path tree. parse(): Use append() to fix bug in split path parsing where subpath was not added to every leaf of the tree, only the main leaf of the main branch and the main leaves of the other branches of the last element.
1576	03/23/2012 04:27 PM	Aaron Marcuse-Kubitza	exc.py: Changed to store multiple tracebacks in an exception, in case an exception is caught and re-raised inside an ExceptionWithCause wrapper. This preserves more of the traceback in this situation, because you get the ExceptionWithCause's traceback as well.
1575	03/23/2012 03:53 PM	Aaron Marcuse-Kubitza	input.Makefile: import: Removed verbose=1 because verbose mode is now automatically on (except in test mode)
1574	03/23/2012 03:52 PM	Aaron Marcuse-Kubitza	bin/map: verbose mode defaults to off in test mode and on otherwise
1573	03/23/2012 03:48 PM	Aaron Marcuse-Kubitza	bin/map: In verbose mode, print which input rows will be processed
1572	03/23/2012 03:40 PM	Aaron Marcuse-Kubitza	bin/map: n option: Defaults to 1 in test mode. Empty string "" is interpreted as None (previously n would have to be unset to specify None).
1571	03/23/2012 03:32 PM	Aaron Marcuse-Kubitza	bin/map: Added section comments to env var config retrieval. Reordered env var config retrieval to put DB config last, since these options are input-type specific and complex, and putting them first hides the more general other options.
1570	03/23/2012 03:31 PM	Aaron Marcuse-Kubitza	bin/map: Added section comments to env var config retrieval. Reordered env var config retrieval to put DB config last, since these options are input-type specific and complex, and putting them first hides the more general other options.
1569	03/23/2012 03:29 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.plots.csv: Updated _units for % -> decimal conversion to use new syntax
1568	03/23/2012 03:20 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.plots.csv: Updated _units for % -> decimal conversion to use new syntax
1567	03/23/2012 03:19 PM	Aaron Marcuse-Kubitza	xml_func.py: _units: If value can't be converted to float, wrap the ValueError in a SyntaxException
1566	03/23/2012 03:18 PM	Aaron Marcuse-Kubitza	units.py: convert(): Added support for unit conversions. Added initial unit conversion for % -> unitless. str2quantity(): Fixed regexp to match % as units. Set Quantity.__repr__ to quantity2str.
1565	03/23/2012 03:03 PM	Aaron Marcuse-Kubitza	units.py: convert(): Put "units == None" test after "quantity.units == units" test because a destination of no units might require a conversion for some input units (e.g. % -> unitless requires a division by 100)
1564	03/23/2012 02:51 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps/VegX.organisms.csv: Habit: Ignore invalid values instead of generating a SyntaxException
1563	03/23/2012 02:47 PM	Aaron Marcuse-Kubitza	xml_dom.py: minidom modifications: Escape as many text strings as we use directly. This still leaves the tagName used by xml.dom.minidom.Element.writexml: It uses 'writer.write(indent+"<" + self.tagName)' and doesn't escape the tagName.
1562	03/23/2012 02:39 PM	Aaron Marcuse-Kubitza	xml_func.py: Made everything Unicode-safe by using strings.ustr instead of str
1561	03/23/2012 12:48 PM	Aaron Marcuse-Kubitza	schemas/tree_cross-links.sql: Added comment for how to get the namedplace trigger from the provided plantname trigger
1560	03/23/2012 12:44 PM	Aaron Marcuse-Kubitza	vegbien.sql: Fixed bug in tree cross-link algorithm where recursion to descendants' ancestors did not use new to refer to the current node's plantname_id
1559	03/23/2012 12:39 PM	Aaron Marcuse-Kubitza	vegbien.sql: Fixed bug in tree cross-link algorithm to also insert ancestors for top-level nodes, because they now need an ancestor entry for themselves
1558	03/23/2012 12:28 PM	Aaron Marcuse-Kubitza	Added separate SQL file for tree cross-links code. A link to this can be e-mailed to people to review.
1557	03/23/2012 12:21 PM	Aaron Marcuse-Kubitza	vegbien.sql: Modified tree cross-link algorithm to add an "ancestor" for this node. This is useful for queries, because you don't have to separately test if the leaf node is the one you're looking for, in addition to that leaf node's ancestors.
1556	03/22/2012 07:08 PM	Aaron Marcuse-Kubitza	README.TXT: Added instructions how to stop all running imports
1555	03/22/2012 06:59 PM	Aaron Marcuse-Kubitza	vegbien.sql: Added namedplace_update_ancestors and plantname_update_ancestors triggers to populate ancestor cross-links in new namedplace_ancestor and plantname_ancestor tables
1554	03/22/2012 06:07 PM	Aaron Marcuse-Kubitza	sql.py: insert() (and try_insert()): Added optional returning param to provide name of an inserted column (usually pkey) to return
1553	03/22/2012 05:41 PM	Aaron Marcuse-Kubitza	env_password: Print Usage message if run without initial "."
1552	03/22/2012 05:34 PM	Aaron Marcuse-Kubitza	Added bin/stop_imports to stop all running imports
1551	03/22/2012 05:33 PM	Aaron Marcuse-Kubitza	import_all: Print Usage message if run without initial "."
1550	03/22/2012 04:52 PM	Aaron Marcuse-Kubitza	Renamed import-all to import_all to match convention of using underscores
1549	03/22/2012 04:39 PM	Aaron Marcuse-Kubitza	inputs/CTFS: Added remaining non-data src files
1548	03/22/2012 04:35 PM	Aaron Marcuse-Kubitza	Added CTFS data dictionary inputs/CTFS/src/ctfs-comments_worksheet.xls
1547	03/22/2012 04:33 PM	Aaron Marcuse-Kubitza	import-all: Fixed to display the datasource name in the job name instead of 'make ${input}import &'
1546	03/20/2012 11:13 PM	Aaron Marcuse-Kubitza	import-all: disown each new import process to ignore SIGHUP
1545	03/20/2012 11:06 PM	Aaron Marcuse-Kubitza	Added jobspecs to extract jobspecs (%#) from (possibly filtered) `jobs` output
1544	03/20/2012 11:05 PM	Aaron Marcuse-Kubitza	README.TXT: Changed `make import &` to `. bin/import-all`
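
Revision 1607 describes _compass as converting a compass direction (N, NE, NNE, etc.) into a degree heading. The following is a minimal sketch of one way such a conversion could work, not the project's actual angles.py or xml_func.py code. The names COMPASS_POINTS, DEGREES_PER_POINT, and compass_to_degrees are hypothetical, and the 16-point compass with headings measured clockwise from north is an assumption.

```python
# Hypothetical sketch only; the real angles.py / xml_func._compass may differ.

# Assumed 16-point compass, listed clockwise starting at north.
COMPASS_POINTS = ['N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
                  'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW']

# Each successive point is 360/16 = 22.5 degrees further clockwise.
DEGREES_PER_POINT = 360.0 / len(COMPASS_POINTS)


def compass_to_degrees(direction):
    """Return the heading in degrees clockwise from north.

    Leading/trailing whitespace and letter case are ignored.
    Raises ValueError if the direction is not a known compass point.
    """
    key = direction.strip().upper()
    try:
        return COMPASS_POINTS.index(key) * DEGREES_PER_POINT
    except ValueError:
        raise ValueError('Unrecognized compass direction: %r' % (direction,))
```

In this sketch an unrecognized direction raises a plain ValueError; per revision 1609, the project itself reports input-format problems with its own FormatException.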
