lib/sql_io.py: cleanup_table(): debug-print null_strs
lib/sql_io.py: null_strs: made it customizable from an env var, since the same list of null_strs doesn't work for all datasources (see #957)
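A minimal sketch of how a null-string list could be made overridable from an env var, as the entry above describes. The env var name `null_strs`, the comma separator, and the default list are assumptions for illustration; the actual names and defaults in lib/sql_io.py are not shown in this log.

```python
import os

# Hypothetical defaults; the real list in lib/sql_io.py is not shown in this log.
DEFAULT_NULL_STRS = ['', 'NULL', 'N/A', 'NA']

def get_null_strs(env=os.environ, var='null_strs', sep=','):
    """Return the null strings to use for the current datasource.

    If the (assumed) env var is set, its separator-delimited value replaces
    the defaults entirely, so each datasource can supply its own list.
    """
    raw = env.get(var)
    if raw is None:
        return list(DEFAULT_NULL_STRS)  # copy so callers can't mutate the defaults
    return raw.split(sep)
```

Replacing the defaults outright, rather than appending to them, lets a datasource drop an entry such as `NA` when that string is real data for it.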
fix: *Makefile: changed line endings to \n so that `patch` can work with pasted input. use `svn di --extensions --ignore-eol-style` to verify no diff.
bugfix: lib/tnrs.py: encode_map: also need to encode + because TNRS removes it from the morphospecies (vegpath.org/wiki/CVS_validation#Bobs-revised-document > issue #4)
lib/sql_io.py: null_strs: added N/A and NA (this will remove a common abbr for North America, but we don't use the continent, so this is OK)
lib/runscripts/table.run: added check_headers()
bugfix: lib/runscripts/table.run: srcs: missing []
lib/runscripts/table.run: added header() and use it in header.txt()
lib/runscripts/table.run: 1st_src(): use a variable for this instead, to avoid needing to run this function each time it's used, and to make @srcs available
lib/sh/util.sh: wildcard.(): clarified that it only removes . .. when at the beginning of the list
lib/runscripts/table.run: added header.txt()
lib/runscripts/table.run: added 1st_src()
lib/runscripts/subdir.run: subdir_make(): use new $datasrc_dir
lib/runscripts/subdir.run: added $datasrc_dir
lib/sh/make.sh: make(): added support for $output_data mode which uses --silent
fix: lib/sh/util.sh: already_exists_msg(): changed calling convention to avoid it seeming like `return 0` is run if already_exists_msg() throws an error, when in fact already_exists_msg() is just a command that should be run before returning/errexiting
bugfix: lib/common.Makefile: $(wildcard/): need final pass with $(wildcard) to support inputs without wildcard chars
bugfix: lib/sh/local.sh: $sync_remote_url: jupiter user is always aaronmk, not the local user
*{.sh,run}: use standard WARNING syntax for warning labels
fix: lib/sh/util.sh: verbosity_compat(): documented that this should not be run until right before executing an external command, so that it doesn't mess up the logging mechanism
bugfix: lib/sh/util.sh: functions called by pst(): commented out/removed logging calls that would have caused infinite recursion when pst()'ing a logging function
lib/sh/util.sh: echo_func(): get call context before wrapper(s), which is more useful for debugging. this uses skip_stack_frames()'s lookahead=1 mode.
fix: lib/sh/util.sh: skip_stack_frames() callers: updated for new skip_stack_frames rather than get_stack_frame behavior
lib/sh/util.sh: skip_stack_frames(): added lookahead support, which looks at the entry after the current one to determine whether to skip the current one. this is useful for skipping wrappers, by looking at the calling function's name.
fix: lib/sh/util.sh: skip_stack_frames(): for new skip_stack_frames rather than get_stack_frame behavior, stack frames must be skipped in the caller to preserve the stack frame pointer
lib/sh/util.sh: get_stack_frame_after(): renamed to skip_stack_frames() for clarity
lib/sh/util.sh: added skip_stack_frame_in_caller, unskip_stack_frame_in_caller
lib/sh/util.sh: added prev_stack_frame
lib/sh/util.sh: echo_func(): use new format_stack_frame, which adds call context information to what was provided by func_loc
fix: lib/sh/util.sh: format_stack_frame(): need to hide canon_rel_path() info using log+
lib/sh/util.sh: added get_stack_frame_after()
lib/sh/util.sh: added matches()
lib/sh/util.sh: next_stack_frame: documented usage
fix: lib/runscripts/util.run: runscript template: all(): moved example commands to target(), where they would more likely be located
lib/sh/util.sh: format_stack_frame(): support including args
lib/sh/util.sh: debugging: added pst() (print_stack_trace)
lib/sh/util.sh: added stack_trace(), print_stack_trace()
lib/sh/util.sh: added format_stack_frame()
lib/sh/util.sh: added get_stack_frame() and helpers
lib/sh/util.sh: terminal: moved before errors so it can be used by it
lib/sh/util.sh: errors, debugging: moved after datatype sections so their functions can use these
bugfix: lib/sh/util.sh: canon_rel_path() stub: proper no-op requires passing through original path
lib/sh/util.sh: canon_rel_path(): fall back to original path if can't resolve, instead of errexiting
bugfix: lib/sh/util.sh: canon_rel_path(): don't re-localize $path because this clears it
lib/sh/util.sh: canon_rel_path(): import $1 to $path before function body, so that the function body can be moved to a nested function
lib/sh/util.sh: added canon_rel_path() stub for use by debugging functions
lib/sh/util.sh: moved func_loc() to before debugging section so it can be used by debugging functions
bugfix: lib/sh/util.sh: command__exec(): need to restore $verbosity before calling die_e
fix: lib/sh/local.sh: $sync_remote_url: need $USER so user can be overridden when running as root
lib/Firefox_bookmarks.reformat.csv: label page's self-description as such: also support quotations enclosed in '
lib/sh/util.sh: echo_vars(): merge repeated flags so there aren't flags in between the vars (which is also not valid declare syntax)
lib/sh/db.sh: pg_cmd(): log vars on same line to avoid clutter
lib/sh/util.sh: echo_vars(): put all the vars on the same line so they don't clutter up the call graph generated at the default verbosity
lib/tnrs.py: single_tnrs_request(), bin/tnrs_client: use_tnrs_export: default to False because this mode uses incorrect selected matches (vegpath.org/issues/943), and the JSON mode that fixes this is now available
bugfix: lib/csvs.py: JsonReader: need to pass col_order to row_dict_to_list_reader
bugfix: lib/tnrs.py: JSON output: need to stringify arrays so they match what is output in TSV-export mode
lib/csvs.py: JsonReader: added support for values that are arrays
lib/csvs.py: MultiFilter: inherit from WrapReader instead of Filter to avoid needing to define a no-op filter_() function
bugfix: lib/csvs.py: row_dict_to_list_reader: need to override next() directly instead of just using Filter, because Filter doesn't support returning multiple rows for one input row (in this case, prepending a header row). this caused the 1st data row to be missing.
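The fix above can be illustrated with a small reader that overrides `next()` directly so it can emit one extra row (the header) before any input row is consumed. This is a sketch in Python 3 iterator terms (`__next__`), not the code in lib/csvs.py; the class name and constructor arguments are hypothetical.

```python
class RowDictToListReader:
    """Wraps an iterable of dict rows; yields a header row, then list rows.

    A one-row-in, one-row-out filter cannot do this, because the header is
    an extra output row that corresponds to no input row.
    """

    def __init__(self, dict_rows, col_order):
        self._rows = iter(dict_rows)
        self._col_order = list(col_order)
        self._header_sent = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._header_sent:
            # Emit the header WITHOUT consuming an input row, so the
            # 1st data row is not lost.
            self._header_sent = True
            return list(self._col_order)
        row = next(self._rows)  # raises StopIteration at end of input
        return [row.get(col) for col in self._col_order]
```

For example, `list(RowDictToListReader([{'a': 1, 'b': 2}], ['b', 'a']))` gives the header `['b', 'a']` followed by the data row `[2, 1]`.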
lib/csvs.py: Filter: inherit from WrapReader, which separates out the CSV-reader API code
lib/csvs.py: added WrapReader
lib/csvs.py: added Reader
lib/csvs.py: JsonReader: factored out row-dict-to-list into new row_dict_to_list_reader so that JSON-specific preprocessing is kept separate from the row format translation
lib/csvs.py: added MultiFilter, which enables applying multiple filters by nesting
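Applying multiple filters by nesting, as in the entry above, might look roughly like the following. This is a sketch assuming each filter is a callable that takes a reader (any iterable of rows) and returns a new one; `multi_filter` and the two example filters are hypothetical names, not the MultiFilter API in lib/csvs.py.

```python
from functools import reduce

def multi_filter(reader, *filters):
    """Nest the filters around reader; the first filter is the innermost."""
    return reduce(lambda inner, make_filter: make_filter(inner), filters, reader)

def upper_filter(rows):
    """Example filter: uppercase every value in each row."""
    for row in rows:
        yield [str(value).upper() for value in row]

def row_num_filter(rows):
    """Example filter: prepend a 0-based row number to each row."""
    for i, row in enumerate(rows):
        yield [i] + row
```

Here `multi_filter(rows, upper_filter, row_num_filter)` is equivalent to `row_num_filter(upper_filter(rows))`.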
lib/tnrs.py: single_tnrs_request(): JSON mode: implemented output of JSON data
lib/tnrs.py: single_tnrs_request(): factored out wrapping in TnrsOutputStream, since this is done for both modes
fix: lib/tnrs.py: JSON mode: TSV export columns: need to translate these to JSON column names before they can be used with the JSON data
lib/csvs.py: added JsonReader, which reads parsed JSON data as row tuples
lib/csvs.py: added row_dict_to_list(), which translates a CSV dict-based row to a list-based one
lib/csvs.py: RowNumFilter: added support for filtering the header row as well
lib/csvs.py: ColInsertFilter: added support for filtering the header row as well
lib/csvs.py: InputRewriter: documented that this is also a stream (in addition to inheriting from StreamFilter)
bugfix: lib/csvs.py: InputRewriter: accept a reader, as would be expected, instead of a custom stream whose lines are tuples
fix: lib/sql_io.py: append_csv(): use new csvs.ProgressInputFilter instead of streams.ProgressInputStream(csvs.StreamFilter(__)), so that the input to csvs.InputRewriter is a reader, not a stream. this avoids the need for csvs.InputRewriter to accept a stream whose lines are tuples, instead of the expected reader.
lib/csvs.py: added ProgressInputFilter, analogous to streams.ProgressInputStream
lib/sql_io.py: added commented-out debug statement used to troubleshoot copy_expert() errors
lib/dicts.py: added pair_keys(), pair_values()
bugfix: lib/streams.py: CaptureStream: end_idx must also be > start_idx
lib/tnrs.py: single_tnrs_request(): use_tnrs_export=False: need to obtain export columns
lib/csvs.py: added header(stream)
fix: lib/tnrs.py: single_tnrs_request(): need to `assert name_ct >= 1`, because with no names, TNRS hangs indefinitely
bugfix: lib/sh/archives.sh: compress(): don't include dir prefix in zip archive
lib/sh/util.sh: cd(): use echo_run instead of a manual echo_cmd call
fix: lib/sh/util.sh: cd(): indent after running cd rather than before
lib/sh/util.sh: cd(): support rebasing path vars for the new dir
bugfix: lib/sh/archives.sh: compress(): need to use zip's path syntax to avoid the file in the archive being named "-"
lib/tnrs.py: added option to avoid using TNRS's TSV export feature, which currently returns incorrect selected matches (vegpath.org/issues/943). this has been implemented up through the GWT/JSON decoding.
lib/tnrs.py: added gwt_decode()
lib/strings.py: added unesc_quotes() and helper functions
lib/strings.py: added json_decode()
lib/runscripts/extract.run: export_(): also compress created file
lib/sh/archives.sh: added compress(), expand(), which handle compression of individual files
lib/tnrs.py: documentation about output of the retrieve step: added that this is also unusable because the array does not contain all the columns and contains no column names
fix: lib/tnrs.py: retrieval_request_template: source_sorting (Constrain by Source): corrected explanation to reflect that the behavior is actually the same in both modes, since only one match is ever marked as selected, and that match should always come first
bugfix: lib/sh/util.sh: str2varname(): need to lowercase str because on case-insensitive filesystems, paths sometimes canonicalize to a different capitalization than the original
lib/sh/util.sh: added lowercase()
bugfix: lib/sh/util.sh: die(): need stub since this is invoked before it's defined
bugfix: lib/sh/util.sh: setup_log_fd(): don't change $log_fd to stdlog until stdlog is set up, to avoid "Bad file descriptor" errors