lib/csvs.py: JsonReader: factored out the row-dict-to-list logic into the new row_dict_to_list_reader, so that JSON-specific preprocessing is kept separate from the row-format translation
lib/csvs.py: added MultiFilter, which enables applying multiple filters by nesting
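For illustration, a minimal sketch of what a nesting-based multi-filter can look like (the class shape and filter interface here are assumptions, not the actual lib/csvs.py API):

```python
class MultiFilter(object):
    '''hypothetical sketch: applies several row filters by nesting them,
    so a single pass over the reader runs every filter in order'''
    def __init__(self, filters, reader):
        self.filters = filters  # applied innermost-first
        self.reader = reader

    def __iter__(self):
        for row in self.reader:
            for filter_ in self.filters:
                row = filter_(row)
            yield row

# usage: strip whitespace from each value, then uppercase the first column
strip = lambda row: [v.strip() for v in row]
upper_first = lambda row: [row[0].upper()] + row[1:]
print(list(MultiFilter([strip, upper_first], [[' a ', ' b '], [' c ', ' d ']])))
# [['A', 'b'], ['C', 'd']]
```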
lib/tnrs.py: single_tnrs_request(): JSON mode: implemented output of JSON data
lib/tnrs.py: single_tnrs_request(): factored out wrapping in TnrsOutputStream, since this is done for both modes
fix: lib/tnrs.py: JSON mode: TSV export columns: need to translate these to JSON column names before they can be used with the JSON data
lib/csvs.py: added JsonReader, which reads parsed JSON data as row tuples
lib/csvs.py: added row_dict_to_list(), which translates a CSV dict-based row to a list-based one
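Since the entry above states exactly what gets translated, a sketch of row_dict_to_list() is straightforward (the signature itself is an assumption):

```python
def row_dict_to_list(row_dict, col_order):
    '''hypothetical sketch: translate a dict-based row (e.g. from
    csv.DictReader) to a list-based one, using col_order for the positions'''
    return [row_dict.get(col) for col in col_order]

print(row_dict_to_list({'rank': 'species', 'name': 'Poa annua'},
                       ['name', 'rank']))
# ['Poa annua', 'species']
```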
lib/csvs.py: RowNumFilter: added support for filtering the header row as well
lib/csvs.py: ColInsertFilter: added support for filtering the header row as well
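As an illustration of what filtering the header row means here, a sketch of column insertion that rewrites the header as well as the data rows (assumed names; the real ColInsertFilter is structured differently):

```python
def insert_col(rows, idx, col_name, value_func):
    '''hypothetical sketch: insert a column at idx; the header row gets the
    column name, each data row gets value_func(row_num)'''
    rows = iter(rows)
    header = list(next(rows))
    header.insert(idx, col_name)  # filter the header row as well
    yield header
    for row_num, row in enumerate(rows):
        row = list(row)
        row.insert(idx, value_func(row_num))
        yield row

print(list(insert_col([['name', 'rank'], ['Poa annua', 'species']],
                      0, 'row_num', str)))
# [['row_num', 'name', 'rank'], ['0', 'Poa annua', 'species']]
```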
lib/csvs.py: InputRewriter: documented that this is also a stream (in addition to inheriting from StreamFilter)
bugfix: lib/csvs.py: InputRewriter: accept a reader, as would be expected, instead of a custom stream whose lines are tuples
fix: lib/sql_io.py: append_csv(): use new csvs.ProgressInputFilter instead of streams.ProgressInputStream(csvs.StreamFilter(__)), so that the input to csvs.InputRewriter is a reader, not a stream. this avoids the need for csvs.InputRewriter to accept a stream whose lines are tuples, instead of the expected reader.
lib/csvs.py: added ProgressInputFilter, analogous to streams.ProgressInputStream
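A sketch of a reader-level progress filter (names and defaults are assumptions, modeled on what streams.ProgressInputStream presumably does at the stream level):

```python
import sys

class ProgressInputFilter(object):
    '''hypothetical sketch: wraps a reader (an iterable of rows) and logs a
    progress message every n rows, while still behaving as a reader'''
    def __init__(self, reader, log=sys.stderr, msg='read %d row(s)', n=100):
        self.reader, self.log, self.msg, self.n = reader, log, msg, n

    def __iter__(self):
        count = 0
        for row in self.reader:
            count += 1
            if count % self.n == 0:
                self.log.write((self.msg % count) + '\n')
            yield row
```

Because the wrapper is itself a reader, it can feed csvs.InputRewriter directly, which is the point of the append_csv() change above.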
lib/sql_io.py: added commented-out debug statement used to troubleshoot copy_expert() errors
lib/dicts.py: added pair_keys(), pair_values()
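pair_keys()/pair_values() are presumably simple helpers over sequences of (key, value) pairs; a sketch of that guess:

```python
def pair_keys(pairs):
    '''hypothetical sketch: the keys of a sequence of (key, value) pairs'''
    return [key for key, value in pairs]

def pair_values(pairs):
    '''hypothetical sketch: the values of a sequence of (key, value) pairs'''
    return [value for key, value in pairs]

pairs = [('a', 1), ('b', 2)]
print(pair_keys(pairs), pair_values(pairs))  # ['a', 'b'] [1, 2]
```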
bugfix: lib/streams.py: CaptureStream: end_idx must also be > start_idx
lib/tnrs.py: single_tnrs_request(): use_tnrs_export=False: need to obtain export columns
lib/csvs.py: added header(stream)
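header(stream) plausibly parses and returns just the first CSV row of a stream; a sketch under that assumption (the real function may also sniff the dialect):

```python
import csv, io

def header(stream):
    '''hypothetical sketch: the header row of a CSV stream'''
    return next(csv.reader(stream))

print(header(io.StringIO('name,rank\nPoa annua,species\n')))
# ['name', 'rank']
```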
fix: lib/tnrs.py: single_tnrs_request(): need to `assert name_ct >= 1`, because with no names, TNRS hangs indefinitely
bugfix: lib/sh/archives.sh: compress(): don't include dir prefix in zip archive
lib/sh/util.sh: cd(): use echo_run instead of a manual echo_cmd call
fix: lib/sh/util.sh: cd(): indent after running cd rather than before
lib/sh/util.sh: cd(): support rebasing path vars for the new dir
bugfix: lib/sh/archives.sh: compress(): need to use zip's path syntax to avoid the file in the archive being named "-"
lib/tnrs.py: added option to avoid using TNRS's TSV export feature, which currently returns incorrect selected matches (vegpath.org/issues/943). this has been implemented up through the GWT/JSON decoding.
lib/tnrs.py: added gwt_decode()
lib/strings.py: added unesc_quotes() and helper functions
lib/strings.py: added json_decode()
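A rough sketch of how such a decoding chain can fit together (the actual GWT wrapper handling in gwt_decode() is an assumption): the response embeds the JSON payload as a quoted, backslash-escaped string, which must be unescaped before it can be JSON-parsed.

```python
import json, re

def unesc_quotes(s, quote='"'):
    '''hypothetical sketch: remove backslash-escaping from quotes'''
    return s.replace('\\' + quote, quote)

def json_decode(s):
    '''hypothetical sketch: parse a JSON value from a string'''
    return json.loads(s)

def gwt_decode(s):
    '''hypothetical sketch: extract and decode the JSON payload embedded in a
    GWT-style response; the real wrapper format may differ'''
    match = re.search(r'"(.*)"', s, re.DOTALL)  # the quoted payload
    assert match
    return json_decode(unesc_quotes(match.group(1)))

print(gwt_decode('//OK["{\\"items\\": []}"]'))  # {'items': []}
```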
lib/runscripts/extract.run: export_(): also compress the created file
lib/sh/archives.sh: added compress(), expand(), which handle compression of individual files
lib/tnrs.py: documentation about the output of the retrieve step: added that this is also unusable because the array does not contain all the columns and contains no column names
fix: lib/tnrs.py: retrieval_request_template: source_sorting (Constrain by Source): corrected explanation to reflect that the behavior is actually the same in both modes, since only one match is ever marked as selected, and that match should always come first
bugfix: lib/sh/util.sh: str2varname(): need to lowercase str because on case-insensitive filesystems, paths sometimes canonicalize to a different capitalization than the original
lib/sh/util.sh: added lowercase()
bugfix: lib/sh/util.sh: die(): need a stub since this is invoked before it's defined
bugfix: lib/sh/util.sh: setup_log_fd(): don't change $log_fd to stdlog until stdlog is set up, to avoid "Bad file descriptor" errors
lib/sh/util.sh: func_override(), copy_func(): added echo_func to facilitate debugging
bugfix: lib/sh/util.sh: stubs: the log++ alias also needs to be moved to the stub section
bugfix: lib/Firefox_bookmarks.reformat.csv: URLs: match only the uppercase tags used by Firefox, not any lowercase tags added by the user
fix: lib/Firefox_bookmarks.reformat.csv: page's self-description: updated comment to match regexp
bugfix: lib/Firefox_bookmarks.reformat.csv: page's self-description: updated the "page's self-description: " prefix that gets removed
lib/sh/sync.sh: db_snapshot(): before backing up, trim bloated temp files (e.g. from rolled-back changes)
bugfix: lib/sql_io.py: put_table(): handle_MissingCastException(): when updating join_cols, don't add new entry for join_cols[out_col], only update existing one. this fixes #902 (import bug), and with #902 fixed, #887 (disk space leak) should no longer occur.
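The essence of the fix, as a sketch (the real handle_MissingCastException() code is more involved): update a join_cols entry only when it already exists, never create one.

```python
# hypothetical sketch of the #902 fix: only update an existing mapping entry
def update_existing(mapping, key, value):
    if key in mapping:
        mapping[key] = value  # update the existing entry in place
    # else: do nothing, rather than adding a new (wrong) join column

join_cols = {'taxon_name': 'in_col_0'}
update_existing(join_cols, 'taxon_name', 'in_col_1')  # updates the entry
update_existing(join_cols, 'other_col', 'in_col_2')   # adds nothing
print(join_cols)  # {'taxon_name': 'in_col_1'}
```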
bugfix: lib/Firefox_bookmarks.reformat.csv: updated for new Firefox bookmarks format, which indents the <DD> tag
lib/tnrs.py: dirty: documented that this actually used to be on in the web app (see r9910, 2013-6-18), but does not appear to be needed (the source_sorting bug alluded to in r9910 is not fixed by enabling the dirty setting)
lib/tnrs.py: requests: also debug-print request URL
lib/tnrs.py: Download: include the same debug info as do_request()
lib/tnrs.py: do_request(): also debug-print request headers
lib/tnrs.py: download_request_template: dirty: documented why this must be off
bugfix: lib/tnrs.py: download_request_template: fixed bug where multiple names were being marked as Selected, because dirty was incorrectly set to true unlike in the web app
fix: lib/phpPgAdmin.login.php.diff: use relative file path rather than the path the file was at when the patch was created
/Makefile, lib/phpPgAdmin.login.php.diff: public_ user: added auto-filled password so that users would not be confused as to what to type in the password field
lib/tnrs.py: source_sorting (Constrain by Source): documented the different behavior for this in each match mode (all-matches and best-match)
fix: lib/runscripts/extract.run: export_(): explicitly prevent files from becoming web-accessible, to protect against an incorrect umask in the calling process
lib/tnrs.py: max_names: raised back up to 500 now that a workaround for the Internal Server Errors is in place (https://github.com/iPlantCollaborativeOpenSource/TNRS/issues/7)
fix: lib/tnrs.py: max_names: lowered to 50 because the dev TNRS server is now always crashing with an Internal Server Error when scrubbing 500 names at a time (https://github.com/iPlantCollaborativeOpenSource/TNRS/issues/7)
fix: lib/tnrs.py: Constrain by Source: turn it on so that the download settings reflect what TNRS actually used, while this is broken
fix: lib/tnrs.py: max_names: reduced back to 500 because even 5000 crashes the dev TNRS server
lib/tnrs.py: max_names: reduced to 5000 because 100,000 causes an internal server error
lib/tnrs.py: switched to downloading all matches per name, as is needed to implement #917. note that this will break the parts of the schema that use the tnrs table, until Brad's match-picking algorithm can be implemented, but this tradeoff is necessary to be able to begin scrubbing sooner (Martha; wiki.vegpath.org/2014-05-29_conference_call#TNRS)
lib/tnrs.py: max_names: increased to 100000 because the dev server can handle more names (no simultaneous users), as decided in the conference call (wiki.vegpath.org/2014-05-29_conference_call#TNRS)
fix: lib/PostgreSQL-MySQL.csv: need to replace "double precision" with "double" to work with MySQL Workbench 5.2.47
lib/tnrs.py: commented out the value of max_names that is not active, for clarity
lib/tnrs.py: sources: updated to the list/sort order in issue #917
fix: lib/PostgreSQL-MySQL.csv: also remove left-behind lines such as `$$);`
bugfix: lib/runscripts/util.run: $is_runscript: unexport it so it isn't passed to invoked scripts
bugfix: lib/runscripts/util.run: run_args_cmd(): don't prepend main to args if there are none, because for a non-runscript, all args will be passed to main(), causing `main` to be doubled
lib/tnrs.py: use the TNRS dev server (with private URL in tnrs.url) instead of the live server, since that contains datasources that we need
lib/streams.py: added file_get_contents()
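file_get_contents() is presumably named after the PHP builtin; a sketch assuming it operates on an already-open stream (it lives in streams.py):

```python
import io

def file_get_contents(stream):
    '''hypothetical sketch: read a stream to the end and return everything'''
    return stream.read()

print(file_get_contents(io.StringIO('//OK[...]')))  # '//OK[...]'
```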
lib/tnrs.py: configure the server separately from the base URL
lib/: svn:ignore tnrs.url so the TNRS dev server URL does not become public
lib/tnrs.py: retrieval_request_template: taxonomic_constraint, source_sorting: documented their meaning and why they need to be on/off
lib/runscripts/import.run: added install() target
lib/runscripts/in_datasrc_dir.run: use new local.run
added lib/runscripts/local.run
fix: lib/util.py: dict_subset(): raise an error if collections.OrderedDict isn't available, because some callers may depend on this. note that using dict instead of OrderedDict may be the cause of the joining on the wrong columns bug (issue #902).
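A sketch of a dict_subset() with the failure behavior described above (the actual lib/util.py signature may differ):

```python
import collections

def dict_subset(dict_, keys):
    '''hypothetical sketch: the subset of dict_ with the given keys, preserving
    the order of keys; fails loudly rather than losing the ordering, which
    callers such as the join-columns code may depend on (issue #902)'''
    OrderedDict = getattr(collections, 'OrderedDict', None)
    if OrderedDict is None:  # pre-2.7 Python without the backport
        raise ImportError('collections.OrderedDict is required by dict_subset()')
    return OrderedDict((k, dict_[k]) for k in keys if k in dict_)

print(list(dict_subset({'b': 2, 'a': 1, 'c': 3}, ['a', 'b'])))  # ['a', 'b']
```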
bugfix: lib/runscripts/validations.pg.sql.run: updated to reflect that validations.sql is now located inside a subdir, not the datasrc dir
fix: lib/runscripts/file.pg.sql.run: removed include of in_datasrc_dir.run, because this location does not apply to all .sql export scripts
lib/runscripts/validations.pg.sql.run: export_(): make the export idempotent for easier re-runnability
bugfix: lib/sh/db.sh: pg_dump(): need to call use_pg to import $pg_database before checking for the existence of $database
lib/sh/util.sh: import_vars: documented that it's idempotent
bugfix: lib/util.py: use OrderedDict from collections rather than ordereddict to work with Mac OS X 10.8 Mountain Lion (http://vegpath.org/links/#OrderedDict)
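The import fallback described above, roughly (the actual lib/util.py code may differ): collections.OrderedDict is in the stdlib since Python 2.7, so preferring it removes the dependency on the third-party backport that is missing on Mac OS X 10.8.

```python
try:
    from collections import OrderedDict  # stdlib, Python 2.7+/3.x
except ImportError:
    from ordereddict import OrderedDict  # pre-2.7 third-party backport
```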
bugfix: lib/sh/db.sh: benign_does_not_exist_error(): removed ignore_e=3, because this exit status is also used for other errors
fix: lib/sh/db.sh: benign_does_not_exist_error(): use benign_error=1, which is now supported properly by stderr_matches()
bugfix: lib/sh/util.sh: stderr_matches(): support $benign_error properly, by handling exit status logging in this func instead
bugfix: lib/sh/db.sh: pg_schema_exists(): also need to benignify the "does not exist" error if it returns false
bugfix: lib/sh/util.sh: stderr_matches(): need to separately display errors that were incorrectly suppressed due to $benign_error
bugfix: lib/sh/util.sh: is_err(): rethrow must be inverted (rethrow becomes false when there is an error)
lib/sh/util.sh: added is_err()
lib/sh/local.sh: public_schema_exists(): moved to lib/sh/db.sh since this no longer depends on BIEN-specific configurations
bugfix: lib/sh/db.sh: public_schema_exists(): don't hide the function call tree so it's clear which function is running the psql commands
fix: *{.sh,run}: stderr_matches() callers: added benign_error=1 where needed
fix: *{.sh,run}: stderr_matches() callers: usage: documented that they may require benign_error=1
fix: lib/sh/util.sh: stderr_matches(): usage: documented that this may require benign_error=1
bugfix: lib/sh/db.sh: pg_cmd(): updated for new echo_vars log_level
fix: lib/sh/db.sh: pg_schema_exists(): display the function name so it's clear which function is running the psql commands
fix: lib/sh/db.sh: pg_schema_exists(): don't use log++ because it hides the command that produces the benign error
fix: lib/sh/db.sh: psql(): removed debugging changes
bugfix: lib/sh/util.sh: highlight_log_msg(): when not can_highlight_log_msg, need to remove any surrounding formatting