schemas/vegbien.sql: Use new TNRS.tnrs_canon instead of tnrs+accepted to avoid creating additional taxonlabels for the parsed, matched, and accepted names and instead just use the most-canonicalized name of the names output by TNRS (the accepted name if available, or the matched name otherwise)
mappings/VegCore-VegBIEN.csv: "if has verbatim name" _if statements that filter something out for TNRS mappings: Also assume true if taxonIsCanonical is specified, because some TNRS tables (such as, eventually, public.unscrubbed_taxondetermination_view) do not specify a separate "verbatim" taxondetermination but do provide taxonIsCanonical as a flag to turn various mappings on and off
mappings/VegCore-VegBIEN.csv: Remapped matched*Fit_fraction to taxondetermination.taxonfit when a taxondetermination, not just a taxonlabel, is provided
bin/map: map_table(): Resolving prefixes: Fixed bug where a list, rather than a tuple, needs to be used for metadata value mappings
schemas/vegbien.sql: taxondetermination: Added CHECK constraint to allow only taxondeterminations with a minimum fit fraction of 80%, analogous to taxonlabel's taxonlabel_1_matched_label_min_fit() trigger
mappings/VegCore-VegBIEN.csv: Don't create a separate TNRS input taxonlabel if taxonIsCanonical exists
inputs/.TNRS/schema.sql: tnrs_canon: Fixed bug where Unmatched_terms always needs to be taken from tnrs rather than tnrs_accepted
inputs/.TNRS/schema.sql: Added tnrs_canon, which stores the most canonicalized name output by TNRS
schemas/vegbien.sql: analytical_stem_view: accepted_taxonverbatim: Fixed bug where the join needs to be only to the taxonverbatim whose morphospecies is NULL, to avoid joining to multiple taxonverbatims at once. This extra filter is now needed because there can be multiple taxonverbatims for a taxonlabel with different morphospecies.
mappings/VegCore-VegBIEN.csv: taxonlabel.taxonomicname: Prepend the family to the rest of the name using new _merge_prefix() instead of _join_words()/_nullIf(), so that any input taxonomic name that includes the family will not have the family duplicated in the combined taxonomic name. Previously, the duplication was removed only when the rest of the input name was equal to the family. This change fixes a bug in the new TNRS import where a pre-concatenated taxonomic name (Accepted_scientific_name) which includes the family is now used instead of Accepted_name, which only includes it when it's equal to the family.
xml_func.py: Simplifying functions: Merging: Added _merge_prefix() passthru
schemas/functions.sql: Added _merge_prefix()
inputs/.TNRS/schema.sql: tnrs_populate_accepted_scientific_name(): Fixed bug where Accepted_name_family shouldn't be prefixed to Accepted_name if Accepted_name is itself the family, to avoid duplicating the family in the Accepted_scientific_name
inputs/.TNRS/schema.sql: tnrs+accepted: Added new Accepted_scientific_name column and mapped it in public.unscrubbed_taxondetermination_view
schemas/vegbien.sql: tnrs_input_name: Fixed bug where tnrs+accepted rows with NULL Accepted_scientific_name need to be filtered out, because inputs to tnrs_db must be strings
schemas/vegbien.sql: tnrs_input_name: Prepend TNRS accepted names that have not yet been parsed. This allows parsing TNRS accepted names without first needing to import them into taxonlabels, which may not occur until the next import.
inputs/.TNRS/schema.sql: tnrs+accepted: Use new Accepted_scientific_name to join to tnrs_accepted.Name_submitted
inputs/.TNRS/schema.sql: tnrs: Added tnrs_populate_accepted_scientific_name() trigger
inputs/.TNRS/schema.sql: tnrs: Added Accepted_scientific_name field which will contain the joined-together accepted name that gets re-parsed by TNRS
inputs/.TNRS/: Changed tnrs+accepted to a view (defined in schema.sql) so accepted names would automatically be populated as they are parsed by TNRS, rather than needing to run `make inputs/.TNRS/tnrs+accepted/reinstall` to populate them
mappings/VegCore-VegBIEN.csv: Also map the morphospecies to the accepted taxonverbatim when an accepted name is provided
schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphospecies so that there can be multiple taxonverbatims for the same taxonlabel, each with different morphospecies suffixes
inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Mapped Accepted_name.*
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Use new tnrs+accepted instead of tnrs so that the accepted name can be imported at the same time
import_all: Reinstall tnrs+accepted, for eventual use by unscrubbed_taxondetermination_view
Added inputs/.TNRS/tnrs+accepted/, which self-joins the TNRS results to their parsed accepted names
import_all: Directly import just the TNRS tables that should be imported, because some TNRS tables are included in import_order.txt so that they are part of the automated testing, but should not be imported at the same time as tnrs_accepted/tnrs_other
inputs/import.stats.xls: Updated import times
with_all: $all mode: Fixed bug where a " " is needed before # for it to be interpreted as a comment (unlike in a Makefile, where the " " often needs to be left out to avoid it being treated as part of a variable value)
bin/map: Made $redo flag default to off, because redo mode is slow (all tables have to be truncated) and is only needed when running tests on a public schema with data in it, which would not be the case on a development machine where tests are usually run
import_all: Made temporary vars local, so they wouldn't affect the calling shell
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Sort by taxondetermination.taxonoccurrence_id instead of taxondetermination_id to allow scanning the taxondetermination_single_current_determination index, which includes only current determinations and avoids needing to scan past many non-current determinations. Note that using taxonoccurrence_id does not create sort order ambiguity between taxondeterminations with the same taxonoccurrence_id, because there is only one current determination per taxonoccurrence.
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to taxonverbatim and taxonlabel instead of LEFT JOINing, because only taxondeterminations with a taxonlabel can have accepted taxondeterminations (otherwise there would be no name to scrub)
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to tnrs instead of LEFT JOINing, because only taxondeterminations whose taxonlabels have already been scrubbed by TNRS should have accepted taxondeterminations added. Removed now-unneeded filter by tnrs.Name_submitted IS NOT NULL, which is replaced by the inner join.
sql_io.py: put_table(): ensure_cond(): Fixed bug where strings used in the tracked error message need to be wrapped in strings.ustr()
xml_dom.py: replace_with_text(): Fixed bug where scalar.is_nonnull_scalar() needs to be used instead of is_scalar() to avoid converting None values to the string 'None'
scalar.py: Added is_nonnull_scalar()
README.TXT: Data import: Fixed bug where `make inputs/upload` needs to be run on the local machine, not vegbiendev
sql.py: create_table(): Support creating a table like a view
sql.py: Added InvalidTypeException and parse it in parse_exception()
mappings/VegCore.csv: Regenerated from wiki
schemas/vegbien.sql: taxondetermination_set_iscurrent(): Fixed bug where scrubbed determinations need to be sorted first for scrub.make to work. (Otherwise, a datasource determination might remain iscurrent even after a scrubbed determination was added, causing scrub.make to repeatedly attempt to re-add it.)
inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set dateIdentified to _now()
inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Unset $n to avoid limiting the # rows/iteration
schemas/py_functions.sql: parse_date_range(): Don't parse strings containing a time, because - and ' ' don't have the same meaning as in a date range
xml_dom.py: replace_with_text(): Support any scalar type recognized by scalar.is_scalar()
scalar.py: is_scalar(): Added datetime.datetime
schemas/functions.sql: Added _now()
import_all: Make $dump_opts, $public_import local vars, so they will be automatically unset if the script is aborted
mappings/VegCore-VegBIEN.csv: identificationType: Fixed bug in mapping where extra *_id/ needed to be removed
inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set taxonOccurrenceID to dummy value 0 to enable the taxonoccurrence CHECK constraint to pass. This is needed because the constraint must pass before the pkey (which should already exist) is even checked.
inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set identificationType to computer
mappings/VegCore-VegBIEN.csv: Mapped identificationType
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Use `SELECT source_id FROM source WHERE shortname = ...` instead of source_by_shortname() so that the source table is updated to point to the same schema as the view rather than pointing to whichever version (usually public) is first in the search_path
schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where only those taxondeterminations that already have a match in TNRS.tnrs need to be included, to avoid adding empty TNRS taxondeterminations. As the concurrent tnrs daemon runs, these taxondeterminations will gradually acquire matches in tnrs and then be processed by scrub.
import_all: Make $import_source a local var, so it will be automatically unset if the script is aborted
vegbien_dest: Schema override for referring to a table in the $public schema: Only process the override when $!schemaVar and $!tableVar are non-*empty*, to allow setting $schema=""
schemas/Makefile: DDL generation: vegbien.sql: Unset $dump_opts so that pg_dump does not use env vars left after running import_all
schemas/Makefile: DDL generation: vegbien.sql: Unset $version so that pg_dump always uses the public schema, even after running import_all
README.TXT: Testing: Added commands to put in .profile on a development machine
import_all: Added command to add scrubbed taxondeterminations
import_all: Start tnrs-remake after starting the inputs, so that for subset imports (e.g. n=2), there will already be names to scrub when tnrs-remake starts up and it won't enter pause mode to wait for new rows (the pause is calibrated for full imports, and is too long for subset imports)
with_all: Also exclude .archive/ from the subdirs to forward commands to
inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Added option to wait for new rows, in the same way tnrs_db does
inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Factored the new-rows-added test out into a rowsAdded() function
Added inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make, which adds scrubbed taxondeterminations to VegBIEN
root Makefile: Removed $(subMake), which is now defined properly by lib/common.Makefile
lib/common.Makefile: $(subMake): Removed `--makefile=../input.Makefile`, which is specific just to inputs/Makefile
input.Makefile: Import to VegBIEN: $(import): Print the date at the beginning of the import, so successive imports to the same version can be distinguished
input.Makefile: Import to VegBIEN: $(import): Fixed bug where 2>&1 needs to come after >>$(log_) rather than before
inputs/.TNRS/tnrs/tnrs.make: Usage: Added tnrs_db's $wait flag
inputs/.TNRS/tnrs/tnrs.make: Fixed Usage message to use make, which is needed to set the PATH correctly
Makefiles: Changed "Usage: `make -s ...`" to "Run with `make -s` to avoid echoing make commands"
input.Makefile: Import to VegBIEN: Added %/log_file to view the import log file path
input.Makefile: Import to VegBIEN: $(import): Append to the log file instead of replacing it, to avoid overwriting the log for a previous import to the same versioned schema. This allows a datasource to be (re-)imported multiple times, and is needed by the new method for linking taxonoccurrences to scrubbed taxonomic names.
input.Makefile: Import to VegBIEN: $(import): Always output just to log file if $(log) is on, rather than also copying output to the terminal when $(n) is set. When $(log) is on, the output can still be viewed by tailing the log.
input.Makefile: Import to VegBIEN: $(import): Merged consecutive $(if $(n),...)
input.Makefile: Import to VegBIEN: $(import): Merged consecutive $(if $(log),...)
Added inputs/.TNRS/public.unscrubbed_taxondetermination_view/
mappings/VegCore-VegBIEN.csv: Mapped taxonOccurrencePkey
input.Makefile: Staging tables installation: Added %_view/install, to prevent trying to edit a view during installation
vegbien_dest: Added schema override support for referring to a table in the $public schema
input.Makefile: Staging tables installation: $(cleanup): Moved setting of $schema, $table before vegbien_dest is run, so it can modify them if needed
inputs/.TNRS/tnrs/tnrs.make: Removed unnecessary setting of $prefix, which now defaults to ""
schemas/vegbien.sql: Added unscrubbed_taxondetermination_view
inputs/import.stats.xls: Moved CTFS to Deleted section
make_analytical_db: ANALYZE each table after it's created so that queries use index scans instead of seq scans
schemas/vegbien.sql: sync_analytical_*_to_view(): Added datasource fkey to source.shortname so removing a datasource will also remove the corresponding rows in the analytical views
schemas/vegbien.sql: Regenerated analytical_stem using sync_analytical_stem_to_view()
input.Makefile: Editing import: rm: Time the command
schemas/vegbien.sql: Added covering indexes where needed, as described at <https://projects.nceas.ucsb.edu/nceas/issues/549>
schemas/vegbien.sql: Fixed fkey constraint names
schemas/vegbien.sql: fkeys to source: Added covering indexes where needed, as described at <https://projects.nceas.ucsb.edu/nceas/issues/549>
schemas/vegbien.sql: commconcept: Renamed source_id back to reference_id (it was previously renamed to source_id in a bulk rename)
schemas/vegbien.sql: taxondetermination: Added back reference_id, which is different than the scoping source_id (reference_id was previously renamed to source_id in a bulk rename)
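
Note on _merge_prefix() (see the taxonlabel.taxonomicname, xml_func.py, and schemas/functions.sql entries above): the log describes the function only by its effect, namely that the family is prepended to the rest of the name without being duplicated when the input name already includes it. A minimal Python sketch of that behavior follows. The name merge_prefix, the space separator, and the handling of empty inputs are assumptions made for illustration; this is not the actual definition in schemas/functions.sql.

```python
def merge_prefix(prefix, value, sep=" "):
    """Prepend prefix to value unless value already begins with it.

    Hypothetical illustration only: the separator and the treatment of
    empty/None inputs are assumptions, not taken from the real function.
    """
    if not prefix:
        return value
    if not value:
        return prefix
    # Already prefixed: value is the prefix itself, or starts with
    # the prefix followed by the separator
    if value == prefix or value.startswith(prefix + sep):
        return value
    return prefix + sep + value


# Family missing from the name: it is prepended
print(merge_prefix("Poaceae", "Poa annua"))
# Family already included (pre-concatenated name): left unchanged
print(merge_prefix("Poaceae", "Poaceae Poa annua"))
```

Checking for the prefix followed by the separator, rather than a bare startswith(), keeps a genus such as "Poa" from being mistaken for a name that already carries a hypothetical prefix "Po".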