Activity

From 12/19/2012 to 01/17/2013

01/17/2013

11:29 PM Revision 7270: import_times: Add blank line before "Postprocessing logs" to separate it from the input logs
Aaron Marcuse-Kubitza
11:28 PM Revision 7269: import_times: Separate out the postprocessing logs (e.g. public.unscrubbed_taxondetermination_view), as the import times in these logs are not aggregated together (each input has its own run of the postprocessing script)
Aaron Marcuse-Kubitza

01/16/2013

02:55 PM Revision 7268: root Makefile: Datasources: import: Use new import_scrub instead of import (input.Makefile)
Aaron Marcuse-Kubitza
02:51 PM Revision 7267: import_all: Use new import_scrub (input.Makefile) instead of import, which avoids needing to start background processes for tnrs-remake and scrub-remake
Aaron Marcuse-Kubitza
02:50 PM Revision 7266: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Fixed bug where need to use tnrs.make's lockfile instead of a separate one, because the import can't run while tnrs.make is scrubbing. tnrs.make leaves tnrs in an incomplete state while running because the accepted names are parsed *after* their matched names. Using a separate lockfile would cause some accepted names to be missing.
Aaron Marcuse-Kubitza
02:27 PM Revision 7265: input.Makefile: Import to VegBIEN: Added import_scrub, which runs `make scrub` after the import
Aaron Marcuse-Kubitza
02:26 PM Revision 7264: root Makefile: Datasources: Added scrub, which runs tnrs-remake and scrub-remake
Aaron Marcuse-Kubitza
02:18 PM Revision 7263: inputs/.TNRS/*/*.make: Only allow one instance of the script to be running at any time, by using new waitself
Aaron Marcuse-Kubitza
02:15 PM Revision 7262: waitpid, lockfile: Changed $interval default to 5s to work with smaller imports, where less waiting is needed
Aaron Marcuse-Kubitza
02:14 PM Revision 7261: Added waitself
Aaron Marcuse-Kubitza
02:11 PM Revision 7260: bin/lockfile: Include the PID in the lockfile to avoid the need to manually remove lockfiles. On Mac, this requires using shlock instead of lockfile.
Aaron Marcuse-Kubitza
01:35 PM Revision 7259: Added bin/lockfile
Aaron Marcuse-Kubitza
01:34 PM Revision 7258: Added pid2name
Aaron Marcuse-Kubitza
01:33 PM Revision 7257: Added name2pids
Aaron Marcuse-Kubitza
01:33 PM Revision 7256: waitpid: Use `ps` instead of /proc to also work on Mac
Aaron Marcuse-Kubitza
01:07 PM Revision 7255: inputs/.TNRS/tnrs/tnrs.make: Fixed bug where need special handling to support being run as a .make script
Aaron Marcuse-Kubitza
11:59 AM Revision 7254: inputs/.geoscrub/_src/README.TXT: Added dates for e-mails from Jim
Aaron Marcuse-Kubitza
11:57 AM Revision 7253: inputs/.geoscrub/_src/README.TXT: Added e-mail from Jim about repository with scripts to generate the geoscrub_output table
Aaron Marcuse-Kubitza
11:02 AM Revision 7252: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to use tnrs_accepted.Name_submitted IS NOT NULL rather than tnrs_accepted.* IS NOT NULL, because tnrs_accepted.* (which plain tnrs_accepted gets changed to by PostgreSQL) checks *each field* of the tnrs_accepted tuple rather than checking if the tuple itself is NULL
Aaron Marcuse-Kubitza
10:23 AM Revision 7251: inputs/.TNRS/schema.sql: Added back tnrs+accepted view, which is useful for debugging the import of the TNRS results
Aaron Marcuse-Kubitza
09:21 AM Revision 7250: inputs/REMIB/Specimen/postprocess.sql: Added back ARIZ, NY because some REMIB specimens for these datasources are not yet in the datasources themselves
Aaron Marcuse-Kubitza
08:43 AM Revision 7249: Added inputs/REMIB/Specimen/postprocess.sql to remove institutions that we have direct data for
Aaron Marcuse-Kubitza
08:43 AM Revision 7248: Placed inputs/REMIB/_archive/ under version control
Aaron Marcuse-Kubitza
08:23 AM Revision 7247: Added inputs/SpeciesLink/Specimen/postprocess.sql to remove institutions that we have direct data for
Aaron Marcuse-Kubitza
08:21 AM Revision 7246: Placed inputs/SpeciesLink/_archive/ under version control
Aaron Marcuse-Kubitza
07:56 AM Revision 7245: input.Makefile: $(import?): Renamed $public_import option to $full_import because it applies to any import of all datasources, not just a public import on vegbiendev
Aaron Marcuse-Kubitza
07:23 AM Revision 7244: schemas/vegbien.sql: analytical_stem_view: Changed `WHERE COALESCE(taxondetermination.iscurrent, true)` to a join condition to enable using the taxondetermination_single_current_determination index, which produces the filtered rows directly. Note that this index will not be used for full-database imports, because the query planner uses hash joins everywhere instead of nested loops.
Aaron Marcuse-Kubitza
06:47 AM Revision 7243: db_xml.py: put_table(): Fixed bug where for views, shouldn't advance start (OFFSET clause) after each chunk, because views are typically dynamic and will contain a new set of rows after the first set is imported
Aaron Marcuse-Kubitza
06:41 AM Revision 7242: sql.py: Added view_exists()
Aaron Marcuse-Kubitza
06:16 AM Revision 7241: inputs/.TNRS/schema.sql: Removed no longer used tnrs_canon. unscrubbed_taxondetermination_view uses its definition directly instead.
Aaron Marcuse-Kubitza
06:14 AM Revision 7240: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Added comment from tnrs_canon
Aaron Marcuse-Kubitza
06:12 AM Revision 7239: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Added comment from tnrs_canon
Aaron Marcuse-Kubitza
06:09 AM Revision 7238: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Do the tnrs_canon joins manually instead of using tnrs_canon, to allow PostgreSQL to use a nested loop join on just the needed tnrs rows instead of a hash self-join of all tnrs rows. The query planner is not yet advanced enough to automatically integrate the select on the view into the top-level joins list, which would make this change automatically.
Aaron Marcuse-Kubitza
05:52 AM Revision 7237: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: rowsAdded(): Look at last 100 rows instead of last 10, because rows are added to the log file each time the script waits and the Inserted # new rows message must be in the tailed rows
Aaron Marcuse-Kubitza
05:48 AM Revision 7236: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: rowsAdded(): Fixed bug where need to test if log file exists before using it in tail, because if tail fails and causes rowsAdded to return false, this error exit status will be indistinguishable from false for no rows added and the script will keep going
Aaron Marcuse-Kubitza
05:40 AM Revision 7235: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Fixed bug where need special handling to support being run as a .make script
Aaron Marcuse-Kubitza
03:35 AM Revision 7234: input.Makefile: Editing import: Added unscrub to remove TNRS taxondeterminations
Aaron Marcuse-Kubitza
03:34 AM Revision 7233: psql_script_vegbien: Added no_query_results option to hide results of calls to void functions
Aaron Marcuse-Kubitza
03:33 AM Revision 7232: schemas/vegbien.sql: Added delete_scrubbed_taxondeterminations()
Aaron Marcuse-Kubitza
01:43 AM Revision 7231: root Makefile: python-Darwin: Added instructions to install dateutil for Python 3 as well as Python 2, for use in PL/Python functions
Aaron Marcuse-Kubitza
01:42 AM Revision 7230: root Makefile: python-Darwin: Added note that Python 2 comes preinstalled
Aaron Marcuse-Kubitza
01:15 AM Revision 7229: Added inputs/GBIF/Specimen/postprocess.sql to remove institutions that we have direct data for
Aaron Marcuse-Kubitza
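
The OFFSET issue described in Revision 7243 above can be illustrated with a small simulation. This is a minimal sketch in Python 3, not the actual db_xml.py put_table() code: `import_in_chunks`, `simulate`, and the `advance_offset` flag are hypothetical names, and the "view" is modelled as the list of rows that have not been imported yet.

```python
def import_in_chunks(fetch_chunk, import_rows, chunk_size, advance_offset):
    """Import rows chunk by chunk until the source returns no more rows."""
    start = 0  # plays the role of the OFFSET clause
    while True:
        rows = fetch_chunk(start, chunk_size)
        if not rows:
            break
        import_rows(rows)
        if advance_offset:
            # Correct for a static table, wrong for a dynamic view
            start += len(rows)


def simulate(advance_offset):
    """Run a chunked import against a view of not-yet-imported rows."""
    all_rows = list(range(10))
    imported = []

    def fetch_chunk(start, limit):
        # The view is recomputed on every query, so rows that were
        # already imported no longer appear in it
        view = [r for r in all_rows if r not in imported]
        return view[start:start + limit]

    import_in_chunks(fetch_chunk, imported.extend, 3, advance_offset)
    return imported


skipping = simulate(advance_offset=True)   # offset advanced on a view
complete = simulate(advance_offset=False)  # offset kept at 0 for a view
```

In this model, advancing the offset skips the rows that slide into the positions already passed, while keeping the offset at 0 picks up every row. The sketch assumes each import removes its rows from the view; otherwise the loop would not terminate.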

01/15/2013

10:42 PM Revision 7228: import_all: Run disown_all after background processes have been created, so that they will not be aborted if the shell exits (e.g. due to a broken connection). Note that with_all processes are automatically disowned as they are created, but other processes, such as after_import, were not.
Aaron Marcuse-Kubitza

01/14/2013

05:21 PM Revision 7227: inputs/.TNRS/schema.sql: Removed no longer used array_to_string(). The IMMUTABLE wrapper is only needed for index conditions and other places that require an IMMUTABLE function.
Aaron Marcuse-Kubitza
05:14 PM Revision 7226: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
Aaron Marcuse-Kubitza
05:13 PM Revision 7225: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
Aaron Marcuse-Kubitza
05:12 PM Revision 7224: README.TXT: Data import: Checking free disk space: Updated import schema size to 110GB
Aaron Marcuse-Kubitza
04:37 PM Revision 7223: Added inputs/Madidi/_README.TXT
Aaron Marcuse-Kubitza
04:35 PM Revision 7222: new_terms.csv: Regenerated
Aaron Marcuse-Kubitza
04:34 PM Revision 7221: inputs/Madidi/new_terms.csv: Regenerated
Aaron Marcuse-Kubitza
04:19 PM Revision 7220: inputs/Madidi/_archive/2010-1-2/: Set svn:ignore
Aaron Marcuse-Kubitza
04:18 PM Revision 7219: inputs/Madidi/_README.TXT: Archived to _archive/2010-1-2/
Aaron Marcuse-Kubitza
03:43 PM Revision 7218: inputs/Madidi/: Refreshed. Note that new export has a completely new schema.
Aaron Marcuse-Kubitza
03:42 PM Revision 7217: inputs/Madidi/: Refreshed. Note that new export has a completely new schema.
Aaron Marcuse-Kubitza
01:53 PM Revision 7216: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
Aaron Marcuse-Kubitza
01:18 PM Revision 7215: mappings/VegCore-VegBIEN.csv: fieldNumber (authorEventCode): Fixed bug where locationevent.authorlocationcode should be authoreventcode
Aaron Marcuse-Kubitza
12:19 PM Revision 7214: Added inputs/Madidi/map.csv, created from new_terms.csv
Aaron Marcuse-Kubitza
12:16 PM Revision 7213: inputs/Madidi/_archive/: Set svn:ignore
Aaron Marcuse-Kubitza
12:15 PM Revision 7212: csvs.py: sniff(): TSVs: Don't turn off quoting, because some TSVs (such as Madidi.IndividualObservation) do quote fields
Aaron Marcuse-Kubitza
12:13 PM Revision 7211: csvs.py: TsvReader: Use csv.reader.next() when possible to support quoted fields, such as in Madidi.IndividualObservation
Aaron Marcuse-Kubitza
11:43 AM Revision 7210: input.Makefile: Configuration: $(exts): Added .dat, which the new Madidi files use
Aaron Marcuse-Kubitza
08:39 AM Revision 7209: mappings/Makefile: VegCore.tables.csv: Removed no longer needed removal of Namespaces table, which is now marked as just a section, not a table
Aaron Marcuse-Kubitza
08:37 AM Revision 7208: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
07:39 AM Revision 7207: Added to_do/timeline.2013.xls (from Brad, converted to .xls)
Aaron Marcuse-Kubitza
07:30 AM Revision 7206: to_do/timeline.doc: Renamed to timeline.2012.doc to allow for a separate 2013 timeline
Aaron Marcuse-Kubitza
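
The effect of leaving quoting on for TSVs (Revisions 7211 and 7212 above) can be seen with the standard csv module. This is a small Python 3 sketch, not the csvs.py code itself; the sample row is made up for illustration.

```python
import csv
import io

# Hypothetical TSV row whose middle field is quoted and contains a tab
sample = 'plot1\t"tag\t12"\tPoa annua\n'

# Default quoting: the quoted field is kept together and the quotes are removed
with_quoting = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# Quoting turned off: quote characters are ordinary data, so the embedded
# tab splits the field in two
without_quoting = list(
    csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
)
```

With quoting off, the row gains an extra column and the quote characters stay in the data, which would be the failure mode for files that quote their fields.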

01/11/2013

05:05 PM Revision 7205: README.TXT: Data import: Deleting imports before the last: Added instructions to keep a previous import instead of deleting it
Aaron Marcuse-Kubitza
04:22 PM Revision 7204: input.Makefile: Staging tables installation: $(logInstall): Always log the installation, regardless of the $log env var, because $log is set by default on development machines but an install log should still be created
Aaron Marcuse-Kubitza
01:03 PM Revision 7203: schemas/vegbien.ERD.mwb: Regenerated exports
Aaron Marcuse-Kubitza
10:19 AM Revision 7202: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to handle the case where (SELECT source.source_id FROM source WHERE source.shortname = 'TNRS') is NULL because no TNRS names have been imported yet
Aaron Marcuse-Kubitza
09:44 AM Revision 7201: **/new_terms.csv, **/unmapped_terms.csv: Regenerated using `make missing_mappings`
Aaron Marcuse-Kubitza
09:19 AM Revision 7200: mappings/VegCore-VegBIEN.csv: morphoname: Remapped to the original rather than current taxondetermination because this is the *original* name applied by the author
Aaron Marcuse-Kubitza
09:16 AM Revision 7199: inputs/SALVIAS*/Organism/map.csv: Remapped voucher_string/coll_number to recordNumber instead of catalogNumber, because this number is actually applied by the collector rather than by a herbarium
Aaron Marcuse-Kubitza
09:11 AM Revision 7198: mappings/VegCore-VegBIEN.csv: Mapped recordNumber to new specimenreplicate.collectionnumber
Aaron Marcuse-Kubitza
09:02 AM Revision 7197: mappings/VegCore-VegBIEN.csv: Also map recordNumber (collectionnumber) to the indirect voucher's specimenreplicate
Aaron Marcuse-Kubitza
08:48 AM Revision 7196: inputs/*/*/map.csv: Remapped recordNumber to new individualCode where applicable
Aaron Marcuse-Kubitza
08:44 AM Revision 7195: mappings/VegCore-VegBIEN.csv: Mapped individualCode. authortaxoncode: Prefer tag over recordNumber (collectionnumber), because this applies to the plant rather than the specimen.
Aaron Marcuse-Kubitza
08:17 AM Revision 7194: mappings/VegCore-VegBIEN.csv: Mapped morphoname
Aaron Marcuse-Kubitza
08:16 AM Revision 7193: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
08:14 AM Revision 7192: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
08:04 AM Revision 7191: schemas/vegbien.sql: taxonverbatim: Added morphoname (which is different from the morphospecies suffix)
Aaron Marcuse-Kubitza
07:33 AM Revision 7190: schemas/vegbien.sql: plantobservation: Renamed collectionnumber to authorplantcode since this number, which identifies the *plant*, is actually different from the collectionnumber that identifies the *specimen* collected from it. This distinction is meaningful for plots data, but generally not for specimens data.
Aaron Marcuse-Kubitza
07:28 AM Revision 7189: schemas/vegbien.sql: plantobservation: Renamed collectionnumber to authorplantcode since this number, which identifies the *plant*, is actually different from the collectionnumber that identifies the *specimen* collected from it. This distinction is meaningful for plots data, but generally not for specimens data.
Aaron Marcuse-Kubitza
07:23 AM Revision 7188: schemas/vegbien.sql: specimenreplicate: Added collectionnumber
Aaron Marcuse-Kubitza
07:17 AM Revision 7187: schemas/vegbien.sql: taxonlabel: Removed no longer used matched_label_fit_fraction. Use taxondetermination.taxonfit instead.
Aaron Marcuse-Kubitza
07:02 AM Revision 7186: inputs/*/*/test.xml.ref: Restored inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
Aaron Marcuse-Kubitza
06:55 AM Revision 7185: schemas/vegbien.ERD.mwb: Expanded analytical_stem to fit the width of all fields
Aaron Marcuse-Kubitza
06:53 AM Revision 7184: schemas/vegbien.sql: taxondetermination: taxondetermination_computer_min_fit CHECK constraint: Fixed bug where need to use CASE instead of OR when a branch of an OR shouldn't be evaluated, because PostgreSQL doesn't support short-circuit OR
Aaron Marcuse-Kubitza
06:38 AM Revision 7183: README.TXT: Debugging: Added instructions for "binary chop" debugging, which requires syncing the DB schema to the svn working copy
Aaron Marcuse-Kubitza
06:08 AM Revision 7182: mappings/VegCore-VegBIEN.csv: Removed no longer used mappings for verbatimScientificName in _if conditions
Aaron Marcuse-Kubitza
06:08 AM Revision 7181: inputs/.NCBI/nodes/test.xml.ref: Restored inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
Aaron Marcuse-Kubitza
06:06 AM Revision 7180: sql_io.py: put_table(): DuplicateKeyException: Uniquifying input table to avoid internal duplicate keys: Also filter out duplicate rows in the out_table, so that they don't create duplicate key errors and the resulting index holes
Aaron Marcuse-Kubitza
06:01 AM Revision 7179: sql.py: distinct_table(): Added support for custom joins used in creating the new table. This can then be used by sql_io.put_table() to filter out duplicate rows in the out_table, so that they don't create duplicate key errors and the resulting index holes.
Aaron Marcuse-Kubitza
05:53 AM Revision 7178: README.TXT: Documentation: Redmine-formatted list of steps for column-based import: Added step to reinstall public schema first, to reset the sequences so that they don't create a diff when the new steps.by_col.log.sql is committed
Aaron Marcuse-Kubitza
05:48 AM Revision 7177: Added inputs/ACAD/Specimen/logs/steps.by_col.log.sql
Aaron Marcuse-Kubitza
05:45 AM Revision 7176: sql_gen.py: Join: Added support for mapping values which are lists, for use in USING joins
Aaron Marcuse-Kubitza
05:40 AM Revision 7175: inputs/SALVIAS/*/test.xml.ref: Restored SALVIAS* inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
Aaron Marcuse-Kubitza
05:01 AM Revision 7174: schemas/vegbien.sql: analytical_stem: Added locationName (authorPlotCode), subplot, individualCode (authorPlantCode) for use in validation
Aaron Marcuse-Kubitza
04:57 AM Revision 7173: schemas/vegbien.sql: sync_analytical_stem_to_view(): Drop and re-create dependent objects to avoid errors that analytical_stem can't be dropped because of dependents
Aaron Marcuse-Kubitza
04:56 AM Revision 7172: schemas/vegbien.sql: sync_analytical_stem_to_view(): Changed to PL/pgSQL function to allow adding PL/pgSQL commands
Aaron Marcuse-Kubitza
03:26 AM Revision 7171: schemas/vegbien.ERD.mwb: Moved family_higher_plant_group to leave room for analytical_stem to expand
Aaron Marcuse-Kubitza
03:08 AM Revision 7170: mappings/VegCore-VegBIEN.csv: Removed no longer used mappings for verbatimScientificName in _if conditions
Aaron Marcuse-Kubitza
02:59 AM Revision 7169: mappings/VegCore-VegBIEN.csv: Removed taxonlabel for original taxondetermination, because the original taxondetermination is not scrubbed by scrub.make (only the most current taxondetermination gets scrubbed, because only a single scrubbed determination is added by scrub.make). This still leaves the original taxondetermination's taxonverbatim, which stores the taxonomic information for historical purposes.
Aaron Marcuse-Kubitza
02:44 AM Revision 7168: mappings/VegCore-VegBIEN.csv: Removed no longer used accepted and verbatim (parsed) taxonlabels, which have been replaced by a single accepted or matched taxondetermination created by scrub.make
Aaron Marcuse-Kubitza
02:34 AM Revision 7167: Removed no longer used inputs/.TNRS/tnrs_accepted, tnrs_other. Use the tnrs_canon view instead.
Aaron Marcuse-Kubitza
02:22 AM Revision 7166: Removed no longer used inputs/.TNRS/tnrs_accepted, tnrs_other. Use the tnrs_canon view instead.
Aaron Marcuse-Kubitza
02:18 AM Revision 7165: Added inputs/.TNRS/_archive/
Aaron Marcuse-Kubitza
02:18 AM Revision 7164: Added inputs/.TNRS/tnrs/cleanup.sql to prevent running the default cleanup operations, which don't work on tables which have views referencing them (as is the case for tnrs, which is referenced by tnrs_canon)
Aaron Marcuse-Kubitza
02:07 AM Revision 7163: import_all: Removed no longer needed TNRS import, which has been replaced by scrub.make (which adds TNRS taxondeterminations after the import instead of creating taxonlabel links before it)
Aaron Marcuse-Kubitza
02:03 AM Revision 7162: mappings/VegCore-VegBIEN.csv: Removed TNRS input taxonlabels meant to cross-link to taxonlabels added by the TNRS import, because TNRS taxondeterminations are now created instead
Aaron Marcuse-Kubitza
01:42 AM Revision 7161: schemas/vegbien.sql: analytical_stem_view: Use just the main taxonlabel created by scrub.make instead of all the additional taxonlabels created by the TNRS import
Aaron Marcuse-Kubitza
01:11 AM Revision 7160: mappings/VegCore-VegBIEN.csv: main taxonverbatim.morphospecies "if has verbatim name" condition: Fixed bug where need to remove the taxonIsCanonical flag, because the TNRS.public.unscrubbed_taxondetermination_view table (which uses this flag) *should* include this field (although not other places where the morphospecies is stored by other TNRS tables)
Aaron Marcuse-Kubitza
12:49 AM Revision 7159: schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent() trigger: Also run on delete, to mark another taxondetermination as the current one when a current taxondetermination is deleted
Aaron Marcuse-Kubitza
12:18 AM Revision 7158: inputs/.TNRS/schema.sql: tnrs_canon: Annotations: Always use value from the matched name, because the accepted name does not have this
Aaron Marcuse-Kubitza
12:05 AM Revision 7157: mappings/VegCore-VegBIEN.csv: primary taxonlabel's parent taxonlabel: Fixed bug where a taxonverbatim was incorrectly being created solely to store the taxonRank, even though it was already stored in the taxonlabel's rank field
Aaron Marcuse-Kubitza
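
The idea behind Revisions 7179 and 7180 above (filter out rows that would collide before inserting, rather than letting the insert fail) can be sketched as follows. This is an illustration only: it uses Python 3 with an in-memory SQLite database instead of PostgreSQL, and the table names, column names, and the `insert_without_duplicates` helper are hypothetical, not taken from sql.py or sql_io.py.

```python
import sqlite3


def insert_without_duplicates(conn):
    """Copy in_table into out_table, skipping codes that would collide."""
    conn.execute(
        """
        INSERT INTO out_table (code, value)
        SELECT i.code, MIN(i.value)      -- one row per code in the input
        FROM in_table AS i
        LEFT JOIN out_table AS o ON o.code = i.code
        WHERE o.code IS NULL             -- drop codes already in out_table
        GROUP BY i.code
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE in_table (code TEXT, value TEXT)")
conn.execute("CREATE TABLE out_table (code TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO in_table VALUES (?, ?)",
    [("a", "1"), ("a", "1"), ("b", "2"), ("c", "3")],
)
conn.execute("INSERT INTO out_table VALUES ('b', 'existing')")

insert_without_duplicates(conn)
rows = conn.execute("SELECT code, value FROM out_table ORDER BY code").fetchall()
```

Both the duplicate within the input (`a`) and the row already present in the output (`b`) are filtered before the insert, so no duplicate key error is raised.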

01/10/2013

11:52 PM Revision 7156: mappings/VegCore-VegBIEN.csv: Don't map morphospecies to the parsed taxonlabel's taxonepithet, because this causes an extra, parsed taxonlabel to be created for TNRS.public.unscrubbed_taxondetermination_view. It is not needed by the other TNRS tables.
Aaron Marcuse-Kubitza
11:45 PM Revision 7155: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Omit Infraspecific_rank to help avoid creating a separate, parsed taxonlabel. Don't map to taxonRank because Name_matched_rank is populated more often.
Aaron Marcuse-Kubitza
11:34 PM Revision 7154: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Reduced $maxPause to 4 hr, because new taxondeterminations are being added throughout the import, so it is unlikely that more than 4 hr would pass between successive imports of taxondeterminations (causing scrub.make to stop prematurely)
Aaron Marcuse-Kubitza
11:23 PM Revision 7153: inputs/.TNRS/schema.sql: Removed no longer used tnrs+accepted. Use tnrs_canon or a self-join of tnrs instead
Aaron Marcuse-Kubitza
11:22 PM Revision 7152: schemas/vegbien.sql: tnrs_input_name: Use TNRS.tnrs directly instead of the now-deprecated tnrs+accepted
Aaron Marcuse-Kubitza
11:12 PM Revision 7151: schemas/vegbien.sql: Use new TNRS.tnrs_canon instead of tnrs+accepted to avoid creating additional taxonlabels for the parsed, matched, and accepted names and instead just use the most-canonicalized name of the names output by TNRS (the accepted name if available, or the matched name otherwise)
Aaron Marcuse-Kubitza
10:50 PM Revision 7150: mappings/VegCore-VegBIEN.csv: "if has verbatim name" _if statements that filter something out for TNRS mappings: Also assume true if taxonIsCanonical is specified, because some TNRS tables (eventually such as public.unscrubbed_taxondetermination_view) do not specify a separate "verbatim" taxondetermination but do provide taxonIsCanonical as a flag to turn various mappings on and off
Aaron Marcuse-Kubitza
09:06 PM Revision 7149: mappings/VegCore-VegBIEN.csv: Remapped matched*Fit_fraction to taxondetermination.taxonfit when a taxondetermination, not just a taxonlabel, is provided
Aaron Marcuse-Kubitza
09:03 PM Revision 7148: bin/map: map_table(): Resolving prefixes: Fixed bug where need to use list instead of tuple for metadata value mappings
Aaron Marcuse-Kubitza
08:16 PM Revision 7147: schemas/vegbien.sql: taxondetermination: Added CHECK constraint to allow only taxondeterminations with a minimum fit fraction of 80%, analogous to taxonlabel's taxonlabel_1_matched_label_min_fit() trigger
Aaron Marcuse-Kubitza

01/09/2013

05:34 PM Revision 7146: mappings/VegCore-VegBIEN.csv: Don't create a separate TNRS input taxonlabel if taxonIsCanonical exists
Aaron Marcuse-Kubitza
05:24 PM Revision 7145: inputs/.TNRS/schema.sql: tnrs_canon: Fixed bug where need to always use Unmatched_terms from tnrs rather than tnrs_accepted
Aaron Marcuse-Kubitza
05:07 PM Revision 7144: inputs/.TNRS/schema.sql: Added tnrs_canon, which stores the most canonicalized name output by TNRS
Aaron Marcuse-Kubitza
04:17 PM Revision 7143: schemas/vegbien.sql: analytical_stem_view: accepted_taxonverbatim: Fixed bug where need to join only to the taxonverbatim whose morphospecies is NULL, to avoid joining to multiple taxonverbatims at once. This extra filter is now needed because there can be multiple taxonverbatims for a taxonlabel with different morphospecies.
Aaron Marcuse-Kubitza
03:59 PM Revision 7142: mappings/VegCore-VegBIEN.csv: taxonlabel.taxonomicname: Prepend the family to the rest of the name using new _merge_prefix() instead of _join_words()/_nullIf(), so that any input taxonomic name that includes the family will not have the family duplicated in the combined taxonomic name. Previously, the duplication was removed only when the rest of the input name was *equal to* the family. This change fixes a bug in the new TNRS import where a pre-concatenated taxonomic name (Accepted_scientific_name) which includes the family is now used instead of Accepted_name, which only includes it when it's equal to the family.
Aaron Marcuse-Kubitza
03:52 PM Revision 7141: xml_func.py: Simplifying functions: Merging: Added _merge_prefix() passthru
Aaron Marcuse-Kubitza
03:33 PM Revision 7140: schemas/functions.sql: Added _merge_prefix()
Aaron Marcuse-Kubitza
02:42 PM Revision 7139: inputs/.TNRS/schema.sql: tnrs_populate_accepted_scientific_name(): Fixed bug where Accepted_name_family shouldn't be prefixed to Accepted_name if Accepted_name is itself the family, to avoid duplicating the family in the Accepted_scientific_name
Aaron Marcuse-Kubitza
02:18 PM Revision 7138: inputs/.TNRS/schema.sql: tnrs+accepted: Added new Accepted_scientific_name column and mapped it in public.unscrubbed_taxondetermination_view
Aaron Marcuse-Kubitza
11:06 AM Revision 7137: schemas/vegbien.sql: tnrs_input_name: Fixed bug where need to filter out tnrs+accepted rows with NULL Accepted_scientific_name, because inputs to tnrs_db must be strings
Aaron Marcuse-Kubitza
10:53 AM Revision 7136: schemas/vegbien.sql: tnrs_input_name: Prepend TNRS accepted names that have not yet been parsed. This allows parsing TNRS accepted names without first needing to import them into taxonlabels, which may not occur until the next import.
Aaron Marcuse-Kubitza
10:09 AM Revision 7135: inputs/.TNRS/schema.sql: tnrs+accepted: Use new Accepted_scientific_name to join to tnrs_accepted.Name_submitted
Aaron Marcuse-Kubitza
10:05 AM Revision 7134: inputs/.TNRS/schema.sql: tnrs: Added tnrs_populate_accepted_scientific_name() trigger
Aaron Marcuse-Kubitza
09:57 AM Revision 7133: inputs/.TNRS/schema.sql: tnrs: Added Accepted_scientific_name field which will contain the joined-together accepted name that gets re-parsed by TNRS
Aaron Marcuse-Kubitza
09:13 AM Revision 7132: inputs/.TNRS/: Changed tnrs+accepted to a view (defined in schema.sql) so accepted names would automatically be populated as they are parsed by TNRS, rather than needing to run `make inputs/.TNRS/tnrs+accepted/reinstall` to populate them
Aaron Marcuse-Kubitza
08:16 AM Revision 7131: mappings/VegCore-VegBIEN.csv: Also map the morphospecies to the accepted taxonverbatim when an accepted name is provided
Aaron Marcuse-Kubitza
08:01 AM Revision 7130: schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphospecies so that there can be multiple taxonverbatims for the same taxonlabel, each with different morphospecies suffixes
Aaron Marcuse-Kubitza
04:17 AM Revision 7129: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Mapped Accepted_name.*
Aaron Marcuse-Kubitza
03:02 AM Revision 7128: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Use new tnrs+accepted instead of tnrs so that the accepted name can be imported at the same time
Aaron Marcuse-Kubitza
02:23 AM Revision 7127: import_all: Reinstall tnrs+accepted, for eventual use by unscrubbed_taxondetermination_view
Aaron Marcuse-Kubitza
02:20 AM Revision 7126: Added inputs/.TNRS/tnrs+accepted/, which self-joins the TNRS results to their parsed accepted names
Aaron Marcuse-Kubitza
02:02 AM Revision 7125: import_all: Directly import just the TNRS tables that should be imported, because some TNRS tables are included in import_order.txt so that they are part of the automated testing, but should not be imported at the same time as tnrs_accepted/tnrs_other
Aaron Marcuse-Kubitza
12:45 AM Revision 7124: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
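
The prefix-merging behaviour described in Revision 7142 above can be sketched in Python 3. The real _merge_prefix() lives in schemas/functions.sql and is not shown in this log, so the function below is an assumption about its behaviour: the name `merge_prefix`, the space separator, and the handling of None are all hypothetical.

```python
def merge_prefix(prefix, value, sep=" "):
    """Prepend prefix to value unless value already starts with it."""
    if prefix is None:
        return value
    if value is None:
        return prefix
    # The name already includes the family, either alone or as a leading word
    if value == prefix or value.startswith(prefix + sep):
        return value
    return prefix + sep + value


plain = merge_prefix("Poaceae", "Poa annua")
already_prefixed = merge_prefix("Poaceae", "Poaceae Poa annua")
family_only = merge_prefix("Poaceae", "Poaceae")
```

The second case is the one the earlier equality-only check missed: a pre-concatenated name that includes the family is returned unchanged instead of having the family added a second time.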

01/08/2013

11:24 PM Revision 7123: with_all: $all mode: Fixed bug where need " " before # for it to be interpreted as a comment (unlike in a Makefile, where the " " often needs to be left out to avoid it being treated as part of a variable value)
Aaron Marcuse-Kubitza
10:55 PM Revision 7122: bin/map: Made $redo flag default to off, because redo mode is slow (all tables have to be truncated) and is only needed when running tests on a public schema with data in it, which would not be the case on a development machine where tests are usually run
Aaron Marcuse-Kubitza
10:19 PM Revision 7121: import_all: Made temporary vars local, so they wouldn't affect the calling shell
Aaron Marcuse-Kubitza
09:45 PM Revision 7120: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Sort by taxondetermination.taxonoccurrence_id instead of taxondetermination_id to allow scanning the taxondetermination_single_current_determination index, which includes only current determinations and avoids needing to scan past many non-current determinations. Note that using taxonoccurrence_id does not create sort order ambiguity between taxondeterminations with the same taxonoccurrence_id, because there is only one current determination per taxonoccurrence.
Aaron Marcuse-Kubitza
09:32 PM Revision 7119: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to taxonverbatim and taxonlabel instead of LEFT JOINing, because only taxondeterminations with a taxonlabel can have accepted taxondeterminations (otherwise there would be no name to scrub)
Aaron Marcuse-Kubitza
09:30 PM Revision 7118: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to tnrs instead of LEFT JOINing, because only taxondeterminations whose taxonlabels have already been scrubbed by TNRS should have accepted taxondeterminations added. Removed now-unneeded filter by tnrs.Name_submitted IS NOT NULL, which is replaced by the inner join.
Aaron Marcuse-Kubitza
08:46 PM Revision 7117: sql_io.py: put_table(): ensure_cond(): Fixed bug where need to wrap strings used in the tracked error message in strings.ustr()
Aaron Marcuse-Kubitza
08:33 PM Revision 7116: xml_dom.py: replace_with_text(): Fixed bug where need to use scalar.is_nonnull_scalar() instead of is_scalar() to avoid converting None values to the string 'None'
Aaron Marcuse-Kubitza
08:32 PM Revision 7115: scalar.py: Added is_nonnull_scalar()
Aaron Marcuse-Kubitza
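
The None-to-'None' problem fixed in Revisions 7115 and 7116 above can be sketched in Python 3. This is not the actual scalar.py or xml_dom.py code: the list of scalar types is assumed (the log only confirms that datetime.datetime was added), it is assumed that is_scalar() accepts None, and `to_text` is a hypothetical stand-in for the conversion step in replace_with_text().

```python
import datetime

# Assumed set of scalar types; only datetime.datetime is confirmed by the log
SCALAR_TYPES = (str, int, float, bool, datetime.datetime)


def is_scalar(value):
    """True for None and for values of a scalar type (assumed semantics)."""
    return value is None or isinstance(value, SCALAR_TYPES)


def is_nonnull_scalar(value):
    """True only for scalars that are not None."""
    return value is not None and is_scalar(value)


def to_text(value):
    """Return the text for a non-null scalar, or None to leave the value as-is."""
    return str(value) if is_nonnull_scalar(value) else None


buggy = str(None) if is_scalar(None) else None  # yields the string 'None'
fixed = to_text(None)                           # stays None
```

Guarding on is_nonnull_scalar() keeps a missing value missing, rather than turning it into the four-character string 'None'.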

01/07/2013

08:17 PM Revision 7114: README.TXT: Data import: Fixed bug where `make inputs/upload` needs to be run on local machine, not vegbiendev
Aaron Marcuse-Kubitza
08:16 PM Revision 7113: sql.py: create_table(): Support creating a table like a view
Aaron Marcuse-Kubitza
08:04 PM Revision 7112: sql.py: Added InvalidTypeException and parse it in parse_exception()
Aaron Marcuse-Kubitza
07:39 PM Revision 7111: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
07:34 PM Revision 7110: schemas/vegbien.sql: taxondetermination_set_iscurrent(): Fixed bug where need to sort scrubbed determinations first for scrub.make to work. (Otherwise, a datasource determination might remain iscurrent even after a scrubbed determination was added, causing scrub.make to repeatedly attempt to re-add it.)
Aaron Marcuse-Kubitza
07:20 PM Revision 7109: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set dateIdentified to _now()
Aaron Marcuse-Kubitza
07:20 PM Revision 7108: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Unset $n to avoid limiting the # rows/iteration
Aaron Marcuse-Kubitza
07:15 PM Revision 7107: schemas/py_functions.sql: parse_date_range(): Don't parse strings containing a time, because - and ' ' don't have the same meaning as in a date range
Aaron Marcuse-Kubitza
07:03 PM Revision 7106: xml_dom.py: replace_with_text(): Support any scalar type recognized by scalar.is_scalar()
Aaron Marcuse-Kubitza
06:54 PM Revision 7105: scalar.py: is_scalar(): Added datetime.datetime
Aaron Marcuse-Kubitza
06:43 PM Revision 7104: schemas/functions.sql: Added _now()
Aaron Marcuse-Kubitza
06:39 PM Revision 7103: import_all: Make $dump_opts, $public_import local vars, so they will be automatically unset if the script is aborted
Aaron Marcuse-Kubitza
06:31 PM Revision 7102: mappings/VegCore-VegBIEN.csv: identificationType: Fixed bug in mapping where extra *_id/ needed to be removed
Aaron Marcuse-Kubitza
06:25 PM Revision 7101: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set taxonOccurrenceID to dummy value 0 to enable the taxonoccurrence CHECK constraint to pass. This is needed because the constraint must pass before the pkey (which should already exist) is even checked.
Aaron Marcuse-Kubitza
06:19 PM Revision 7100: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Set identificationType to computer
Aaron Marcuse-Kubitza
06:18 PM Revision 7099: mappings/VegCore-VegBIEN.csv: Mapped identificationType
Aaron Marcuse-Kubitza
06:15 PM Revision 7098: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
05:39 PM Revision 7097: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Use `SELECT source_id FROM source WHERE shortname = ...` instead of source_by_shortname() so that the source table is updated to point to the same schema as the view rather than pointing to whichever version (usually public) is first in the search_path
Aaron Marcuse-Kubitza
05:23 PM Revision 7096: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to include only those taxondeterminations that already have a match in TNRS.tnrs, to avoid adding empty TNRS taxondeterminations. As the concurrent tnrs daemon runs, these taxondeterminations will gradually acquire matches in tnrs and then be processed by scrub.
Aaron Marcuse-Kubitza
05:00 PM Revision 7095: import_all: Make $import_source a local var, so it will be automatically unset if the script is aborted
Aaron Marcuse-Kubitza
04:49 PM Revision 7094: vegbien_dest: Schema override for referring to a table in the $public schema: Only process the override when $!schemaVar and $!tableVar are non-*empty*, to allow setting $schema=""
Aaron Marcuse-Kubitza
04:47 PM Revision 7093: schemas/Makefile: DDL generation: vegbien.sql: Unset $dump_opts so that pg_dump does not use env vars left after running import_all
Aaron Marcuse-Kubitza
04:44 PM Revision 7092: schemas/Makefile: DDL generation: vegbien.sql: Unset $version so that pg_dump always uses the public schema, even after running import_all
Aaron Marcuse-Kubitza
04:13 PM Revision 7091: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
04:13 PM Revision 7090: README.TXT: Testing: Added commands to put in .profile on a development machine
Aaron Marcuse-Kubitza
04:10 PM Revision 7089: import_all: Added command to add scrubbed taxondeterminations
Aaron Marcuse-Kubitza
04:09 PM Revision 7088: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
04:08 PM Revision 7087: import_all: Start tnrs-remake *after* starting the inputs, so that for subset imports (e.g. n=2), there will already be names to scrub when tnrs-remake starts up and it won't enter pause mode to wait for new rows (the pause is calibrated for full imports, and is too long for subset imports)
Aaron Marcuse-Kubitza
04:01 PM Revision 7086: with_all: Also exclude .archive/ from the subdirs to forward commands to
Aaron Marcuse-Kubitza
03:40 PM Revision 7085: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Added option to wait for new rows, in the same way tnrs_db does
Aaron Marcuse-Kubitza
03:38 PM Revision 7084: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Factored new rows added test out into rowsAdded() function
Aaron Marcuse-Kubitza
03:09 PM Revision 7083: Added inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make, which adds scrubbed taxondeterminations to VegBIEN
Aaron Marcuse-Kubitza
02:00 PM Revision 7082: root Makefile: Removed $(subMake), which is now defined properly by lib/common.Makefile
Aaron Marcuse-Kubitza
01:59 PM Revision 7081: lib/common.Makefile: $(subMake): Removed `--makefile=../input.Makefile`, which is specific just to inputs/Makefile
Aaron Marcuse-Kubitza
01:43 PM Revision 7080: input.Makefile: Import to VegBIEN: $(import): Print the date at the beginning of the import, so successive imports to the same version can be distinguished
Aaron Marcuse-Kubitza
01:37 PM Revision 7079: input.Makefile: Import to VegBIEN: $(import): Fixed bug where 2>&1 needs to come after >>$(log_) rather than before
Aaron Marcuse-Kubitza
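The redirection-order issue in revision 7079 is general shell behavior and can be reproduced in isolation. The sketch below is an illustration only: it assumes a POSIX `sh` is available, and the helper name run_with_redirects() and the log filename are hypothetical, not taken from input.Makefile.

```python
import os
import shlex
import subprocess
import tempfile

def run_with_redirects(redirects):
    """Run a command that writes one line to stdout and one to stderr,
    applying the given redirections. Returns (log contents, captured stdout)."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "import.log")  # hypothetical log name
        cmd = "{ echo out; echo err >&2; } " + redirects.format(log=shlex.quote(log))
        result = subprocess.run(["sh", "-c", cmd], capture_output=True, text=True)
        with open(log) as f:
            return f.read(), result.stdout

# stdout goes to the log first, then stderr is pointed at the same place.
correct = run_with_redirects(">>{log} 2>&1")

# stderr is pointed at the *original* stdout before stdout is moved to the log.
wrong = run_with_redirects("2>&1 >>{log}")
```

With the first ordering both lines end up in the log; with the second, only the stdout line is logged and the stderr line escapes to wherever stdout originally pointed.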
01:35 PM Revision 7078: inputs/.TNRS/tnrs/tnrs.make: Usage: Added tnrs_db's $wait flag
Aaron Marcuse-Kubitza
01:34 PM Revision 7077: inputs/.TNRS/tnrs/tnrs.make: Fixed Usage message to use make, which is needed to set the PATH correctly
Aaron Marcuse-Kubitza
11:47 AM Revision 7076: Makefiles: Changed "Usage: `make -s ...`" to "Run with `make -s` to avoid echoing make commands"
Aaron Marcuse-Kubitza
11:44 AM Revision 7075: input.Makefile: Import to VegBIEN: Added %/log_file to view the import log file path
Aaron Marcuse-Kubitza
11:28 AM Revision 7074: input.Makefile: Import to VegBIEN: $(import): Append to the log file instead of replacing it, to avoid overwriting the log for a previous import to the same versioned schema. This allows a datasource to be (re-)imported multiple times, and is needed by the new method for linking taxonoccurrences to scrubbed taxonomic names.
Aaron Marcuse-Kubitza
11:22 AM Revision 7073: input.Makefile: Import to VegBIEN: $(import): Always output just to log file if $(log) is on, rather than also copying output to the terminal when $(n) is set. When $(log) is on, the output can still be viewed by tailing the log.
Aaron Marcuse-Kubitza
11:16 AM Revision 7072: input.Makefile: Import to VegBIEN: $(import): Merged consecutive $(if $(n),...)
Aaron Marcuse-Kubitza
11:14 AM Revision 7071: input.Makefile: Import to VegBIEN: $(import): Merged consecutive $(if $(log),...)
Aaron Marcuse-Kubitza
11:07 AM Revision 7070: Added inputs/.TNRS/public.unscrubbed_taxondetermination_view/
Aaron Marcuse-Kubitza
11:05 AM Revision 7069: mappings/VegCore-VegBIEN.csv: Mapped taxonOccurrencePkey
Aaron Marcuse-Kubitza
10:58 AM Revision 7068: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
10:41 AM Revision 7067: input.Makefile: Staging tables installation: Added %_view/install, to prevent trying to edit a view during installation
Aaron Marcuse-Kubitza
10:31 AM Revision 7066: vegbien_dest: Added schema override support for referring to a table in the $public schema
Aaron Marcuse-Kubitza
10:29 AM Revision 7065: input.Makefile: Staging tables installation: $(cleanup): Moved setting of $schema, $table before vegbien_dest is run, so it can modify them if needed
Aaron Marcuse-Kubitza
09:42 AM Revision 7064: inputs/.TNRS/tnrs/tnrs.make: Removed unnecessary setting of $prefix, which now defaults to ""
Aaron Marcuse-Kubitza
09:40 AM Revision 7063: schemas/vegbien.sql: Added unscrubbed_taxondetermination_view
Aaron Marcuse-Kubitza

01/04/2013

10:10 PM Revision 7062: inputs/import.stats.xls: Moved CTFS to Deleted section
Aaron Marcuse-Kubitza
10:03 PM Revision 7061: make_analytical_db: ANALYZE each table after it's created so that queries use index scans instead of seq scans
Aaron Marcuse-Kubitza
09:40 PM Revision 7060: schemas/vegbien.sql: sync_analytical_*_to_view(): Added datasource fkey to source.shortname so removing a datasource will also remove the corresponding rows in the analytical views
Aaron Marcuse-Kubitza
09:36 PM Revision 7059: schemas/vegbien.sql: Regenerated analytical_stem using sync_analytical_stem_to_view()
Aaron Marcuse-Kubitza
09:19 PM Revision 7058: input.Makefile: Editing import: rm: Time the command
Aaron Marcuse-Kubitza
09:15 PM Task #549 (Resolved): add covering indexes on fkeys
Aaron Marcuse-Kubitza
07:09 PM Task #549: add covering indexes on fkeys
Added for direct fkeys which are in use
Aaron Marcuse-Kubitza
05:56 PM Task #549 (Resolved): add covering indexes on fkeys
* use instructions for *[[Postgres queries#Adding covering indexes on foreign keys|adding covering indexes on foreign...
Aaron Marcuse-Kubitza
09:14 PM Revision 7057: schemas/vegbien.sql: Added covering indexes where needed, as described at <https://projects.nceas.ucsb.edu/nceas/issues/549>
Aaron Marcuse-Kubitza
09:11 PM Revision 7056: schemas/vegbien.sql: Fixed fkey constraint names
Aaron Marcuse-Kubitza
09:09 PM Revision 7055: schemas/vegbien.sql: Added covering indexes where needed, as described at <https://projects.nceas.ucsb.edu/nceas/issues/549>
Aaron Marcuse-Kubitza
06:59 PM Revision 7054: schemas/vegbien.sql: fkeys to source: Added covering indexes where needed, as described at <https://projects.nceas.ucsb.edu/nceas/issues/549>
Aaron Marcuse-Kubitza
06:22 PM Revision 7053: schemas/vegbien.sql: commconcept: Renamed source_id back to reference_id (it was previously renamed to source_id in a bulk rename)
Aaron Marcuse-Kubitza
06:20 PM Revision 7052: schemas/vegbien.sql: taxondetermination: Added back reference_id, which is different than the scoping source_id (reference_id was previously renamed to source_id in a bulk rename)
Aaron Marcuse-Kubitza
06:04 PM Revision 7051: schemas/vegbien.sql: Renamed taxonconcept_concept_source_id_fkey back to taxonconcept_concept_reference_id_fkey
Aaron Marcuse-Kubitza
06:02 PM Revision 7050: schemas/vegbien.sql: Renamed *_reference_id_fkey fkeys to *_source_id_fkey
Aaron Marcuse-Kubitza
05:32 PM Revision 7049: inputs/CTFS/_no_import: Temporarily remove CTFS from the public DB per Rick Condit's request (due to validation issues)
Aaron Marcuse-Kubitza
05:25 PM Revision 7048: import_all: Run import with $public_import set in order to exclude excluded datasources
Aaron Marcuse-Kubitza
05:23 PM Revision 7047: input.Makefile: Import to VegBIEN: %/import: Don't run the import if $public_import flag is set and the datasource contains a _no_import file. This allows just excluding a datasource from the public DB, without also removing it from automated testing.
Aaron Marcuse-Kubitza
05:17 PM Revision 7046: lib/common.Makefile: Added $(and), $(or), $(not)
Aaron Marcuse-Kubitza
04:30 PM Revision 7045: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
04:12 PM Revision 7044: schemas/vegbien.sql: taxondetermination: Added scoping source_id field to allow other datasources (e.g. TNRS) to make taxondeterminations. (Repurposed existing non-scoping source_id.)
Aaron Marcuse-Kubitza

01/03/2013

08:25 AM Revision 7043: make_analytical_db: Fixed bug where can't give public_ select access to all analytical_db views because this apparently adds access rather than passing through the underlying table's permissions
Aaron Marcuse-Kubitza
08:18 AM Revision 7042: make_analytical_db: Give public_ select access to analytical_db views. This causes the actual access to depend on the underlying table's permissions.
Aaron Marcuse-Kubitza
07:43 AM Revision 7041: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
07:39 AM Revision 7040: mappings/Makefile: VegCore.csv: Include only terms that start with a lowercase letter or are all caps. This also avoids the need to filter out VegCore.tables.csv.
Aaron Marcuse-Kubitza
07:31 AM Revision 7039: mappings/VegCore.csv: Changed line endings to \n in preparation for not running filter_out_cs on the file (which changes line endings to \r\n)
Aaron Marcuse-Kubitza
02:31 AM Revision 7038: import_all: `make backups/vegbien.$version.backup/test`: Documented that this uses $dump_opts. $dump_opts must be manually set when running this command outside of import_all.
Aaron Marcuse-Kubitza
02:21 AM Revision 7037: backups/Makefile: Synchronization: %/download: Download the .md5 file first, so that the user is prompted right away for their password rather than after the main file has finished downloading, by which time the password prompt has timed out
Aaron Marcuse-Kubitza
12:02 AM Revision 7036: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza

01/02/2013

11:34 PM Revision 7035: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
11:32 PM Revision 7034: mappings/Makefile: VegCore.csv: Fixed bug where need to filter out VegCore.tables.csv case-sensitively so that field names which are the same as a table name don't get filtered out
Aaron Marcuse-Kubitza
11:23 PM Revision 7033: Added filter_out_cs
Aaron Marcuse-Kubitza
09:21 PM Revision 7032: README.TXT: Data import: Added step to ensure there are no local modifications using `svn st`
Aaron Marcuse-Kubitza
08:56 PM Revision 7031: make_analytical_db: Also grant USAGE on the analytical_db schema itself to bien_read, public_
Aaron Marcuse-Kubitza
07:14 PM Revision 7030: README.TXT: Data import: After import: Also check that the provider_count table contains entries for all inputs
Aaron Marcuse-Kubitza
07:03 PM Revision 7029: Added inputs/.geoscrub/_src/geovalidity-table.txt, which was attached to Jim's geovalidation e-mail (provided in README.TXT)
Aaron Marcuse-Kubitza
06:43 PM Revision 7028: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
06:39 PM Revision 7027: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Updated column group header to "By column"
Aaron Marcuse-Kubitza
06:36 PM Revision 7026: backups/Makefile: Removed no longer used $(psqlVerbose)
Aaron Marcuse-Kubitza
06:36 PM Revision 7025: backups/Makefile: Removed %.backup/rm_indexes, which is no longer needed because archived imports are now backed up instead of being stored without indexes in the live DB
Aaron Marcuse-Kubitza
06:31 PM Revision 7024: backups/Makefile: %.backup/remove: Fixed bug where need to use $no_search_path option to psql_script_vegbien
Aaron Marcuse-Kubitza

12/21/2012

03:34 PM Revision 7023: import_all: Allow caller to override $dump_opts
Aaron Marcuse-Kubitza
03:33 PM Revision 7022: pg_dump_vegbien: Renamed $opts env var to $dump_opts to avoid conflicting with other commands' vars of the same name
Aaron Marcuse-Kubitza
03:22 PM Revision 7021: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
03:20 PM Revision 7020: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
02:14 PM Revision 7019: schemas/vegbien.sql: location: Removed location_unique_within_parent_by_sourceaccessioncode, which duplicates location_unique_within_creator_by_sourceaccessioncode because the sourceaccessioncode is globally unique
Aaron Marcuse-Kubitza
02:10 PM Revision 7018: schemas/vegbien.sql: analytical_stem_view: projectID: Use project.projectname if project.sourceaccessioncode isn't provided
Aaron Marcuse-Kubitza
02:02 PM Revision 7017: schemas/vegbien.sql: location: location_unique_within_parent: Split into *_by_sourceaccessioncode and *_by_authorlocationcode_position, with each ID being matched separately. This way, if the initial import of a subplot's location provides both fields, but fkey references use only one field, the fkey references will still match the existing location because only one of the fields needs to match.
Aaron Marcuse-Kubitza
01:26 PM Revision 7016: schemas/vegbien.sql: analytical_stem_view: elevationInMeters: Use parent_location.elevation_m when location.elevation_m not provided
Aaron Marcuse-Kubitza
01:17 PM Revision 7015: schemas/vegbien.sql: analytical_stem_view: scientificName: Fixed bug where need to use accepted_taxon*label*.taxonomicname instead of accepted_taxonverbatim.taxonomicname, because taxonverbatim's name component fields aren't populated if the name doesn't match a scrubbed name. The datasource's own taxonverbatim can't be used for this because the canon_label_id refers to the concatenated taxonomic name owned by the TNRS datasource.
Aaron Marcuse-Kubitza
01:00 PM Revision 7014: inputs/NVS/Plot/map.csv: Corrected Plot ID mapping to go to subplotID instead of locationID, because each subplot gets its own ID in this field
Aaron Marcuse-Kubitza
12:50 PM Revision 7013: schemas/vegbien.sql: location: location_unique_within_parent: Also apply this constraint when sourceaccessioncode is provided, because it may be a concatenated value populated for use by the analytical DB but which is not used as an fkey by the datasource itself
Aaron Marcuse-Kubitza
12:30 PM Revision 7012: schemas/vegbien.sql: analytical_stem_view: locationID: Concatenate parent location's and subplot's IDs using '; ' instead of ' '
Aaron Marcuse-Kubitza
12:22 PM Revision 7011: schemas/vegbien.sql: analytical_*: Renamed locationName to locationID because it's now globally unique (within the datasource) and can be used as a sourceaccessioncode
Aaron Marcuse-Kubitza
12:19 PM Revision 7010: schemas/vegbien.sql: analytical_stem_view: locationName: For subplots without their own sourceaccessioncode (globally unique ID), prepend the parent location's unique ID so that locationName is globally unique
Aaron Marcuse-Kubitza
12:07 PM Revision 7009: mappings/VegCore-VegBIEN.csv: locationID/locationName + subplot -> location.sourceaccessioncode mapping: Fixed bug where subplot was incorrectly being mapped to this field even when there was no location*. (This field can only be populated if both location* *and* subplot are specified.) Also only map locationID for this, to avoid inconsistencies where one table supplies locationID+subplot, while another table supplies locationName+subplot, but they both get mapped to the same field, preventing plots from being matched up with their observations when creating the analytical_stem.
Aaron Marcuse-Kubitza
11:31 AM Revision 7008: xml_func.py: Simplifying functions: Logic: _and(), _or(): Evaluate an expression of only constant values
Aaron Marcuse-Kubitza
11:30 AM Revision 7007: lists.py: Added and_(), or_()
Aaron Marcuse-Kubitza
11:28 AM Revision 7006: xml_func.py: is_scalar(): Fixed bug where need to check if value is a string before calling is_var_name()
Aaron Marcuse-Kubitza
10:15 AM Revision 7005: inputs/NVS/StemObservation/map.csv: Remapped Verbatim Code to authorTaxonCode, because as it's used this is actually an identifier for the taxon, not the stem, despite Nick Spencer's revised mapping
Aaron Marcuse-Kubitza

12/20/2012

05:21 PM Revision 7004: schemas/vegbien.ERD.mwb: Regenerated exports
Aaron Marcuse-Kubitza
05:21 PM Revision 7003: README.TXT: Schema changes: Update graphical ERD exports: Added step to commit changes
Aaron Marcuse-Kubitza
05:02 PM Revision 7002: inputs/NVS/*/map.csv: Remapped with Nick Spencer's suggested changes
Aaron Marcuse-Kubitza
04:41 PM Revision 7001: xml_func.py: _first(): Fixed bug where need to choose the first *non-empty* param, by first pruning empty child nodes
Aaron Marcuse-Kubitza
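The "first non-empty param" behavior described for _first() in revision 7001 can be sketched as below. This is a simplified Python 3 illustration over plain values rather than XML nodes; first_nonempty() is a hypothetical name, and treating None and "" as the only empty values is an assumption.

```python
def first_nonempty(params):
    """Return the first param that is not empty, or None if all are empty.

    Empty params are pruned first, so a leading empty value no longer
    hides a later non-empty one.
    """
    pruned = [p for p in params if p is not None and p != ""]
    return pruned[0] if pruned else None
```

Taking params[0] without pruning would return the empty leading value whenever the first param is blank.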
04:38 PM Revision 7000: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only using authorTaxonCode if there is no plant ID: Added individualID, stemID to the terms that cause authorTaxonCode not to be mapped to VegBIEN authortaxoncode
Aaron Marcuse-Kubitza
04:03 PM Revision 6999: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only using authorTaxonCode if there is no plant ID: Added individualID, stemID to the terms that cause authorTaxonCode not to be mapped to VegBIEN authortaxoncode
Aaron Marcuse-Kubitza
03:59 PM Revision 6998: schemas/vegbien.sql: analytical_*: Renamed individualID to individualObservationID because this actually corresponds to plantobservation.sourceaccessioncode, which is an observation *of* an individual
Aaron Marcuse-Kubitza
03:56 PM Revision 6997: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
03:53 PM Revision 6996: README.TXT: Data import: Recording the import times: Changed <version> back to $version because these commands are actually run on vegbiendev, where $version is set. (Modifications to import.stats.xls would be made on your local machine.)
Aaron Marcuse-Kubitza
03:50 PM Revision 6995: README.TXT: Data import: Added step to unset $version before starting the import, to avoid importing on top of the last import's data
Aaron Marcuse-Kubitza
02:47 PM Revision 6994: README.TXT: Data import: Replaced $version with <version> where it needs to be manually filled in
Aaron Marcuse-Kubitza
02:40 PM Revision 6993: README.TXT: Data import: On nimoy: Added command to set $version
Aaron Marcuse-Kubitza
02:36 PM Task #548 (New): remove benign errors from the data provider feedback tables
h3. Examples...
Aaron Marcuse-Kubitza
02:26 PM Revision 6992: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only use authorTaxonCode if there is no plant ID, because an individual plant gets its own taxonoccurrence and thus needs the taxonoccurrence's IDs to be unique to the plant, regardless of what the author designates as the taxonoccurrence code
Aaron Marcuse-Kubitza
01:47 PM Revision 6991: Generated inputs/NVS/new_terms.csv
Aaron Marcuse-Kubitza
01:47 PM Revision 6990: input.Makefile: SVN: $(svnFilesGlob): Also match *terms.csv in top-level dir
Aaron Marcuse-Kubitza
01:23 PM Revision 6989: mappings/VegCore-VegBIEN.csv: Mapped authorTaxonCode
Aaron Marcuse-Kubitza
01:12 PM Revision 6988: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
01:12 PM Revision 6987: README.TXT: Maintenance: VegCore data dictionary: Added step to commit updated mappings/VegCore.csv
Aaron Marcuse-Kubitza
12:13 PM Revision 6986: schemas/Makefile: %/publish: Fixed bug where commands were not being run transactionally, because --single-transaction requires `--file -` to work properly
Aaron Marcuse-Kubitza
11:36 AM Revision 6985: input.Makefile: Editing import: Removed rotate because appending the current svn revision doesn't make sense, since this is not related to the revision used to import the datasource
Aaron Marcuse-Kubitza
11:34 AM Revision 6984: input.Makefile: Editing import: Added rename/% and use it in rotate
Aaron Marcuse-Kubitza
11:21 AM Revision 6983: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
11:21 AM Revision 6982: schemas/Makefile: Use $* instead of $(@D) for clarity. $(@D) is only needed when the dir part of the target includes a prefix in addition to the % stem.
Aaron Marcuse-Kubitza
10:45 AM Revision 6981: make_analytical_db: Automatically call export_analytical_db when finished
Aaron Marcuse-Kubitza
10:35 AM Revision 6980: schemas/vegbien.sql: make_family_higher_plant_group(): Added `taxonepithet IS NOT NULL` filter, to allow make_analytical_db to proceed even when the NCBI import fails (leaving some nodes with rank = 'family' but no associated taxonepithet). The most recent NCBI import failed due to the search_path/DuplicateException bug resulting from the import schema and public being in the search_path together.
Aaron Marcuse-Kubitza
10:14 AM Revision 6979: schemas/Makefile: Fixed bug where need `SHELL := /bin/bash` for $(confirmRmPublicSchema) to work correctly
Aaron Marcuse-Kubitza
10:12 AM Revision 6978: lib/common.Makefile: $(confirm): Added comment that this requires `SHELL := /bin/bash` to work correctly
Aaron Marcuse-Kubitza
10:09 AM Revision 6977: import_all: after_import(): Added `make backups/vegbien.$version.backup/test`
Aaron Marcuse-Kubitza
10:05 AM Revision 6976: sql.py: DbConn._db(): search_path: Don't append the existing search_path, because it usually includes the public schema, which is now different from the schema being imported into. This fixes a bug where sql.function_exists() would find public-schema functions in both the public schema and the import's schema because both were in the search_path, causing a DuplicateException "more than one function named ...". Note that the elements of the existing search_path are no longer needed now that vegbien_dest's $schemas includes $public. Also note that if an instance of DbConn does not specify the schemas param, the existing search_path will be left as-is rather than overwritten with an empty list.
Aaron Marcuse-Kubitza
09:54 AM Revision 6975: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Added step to determine the import date using import_date
Aaron Marcuse-Kubitza
09:52 AM Revision 6974: import_date: Added note that Mac and Linux differ in the order they sort the logs in
Aaron Marcuse-Kubitza
09:50 AM Revision 6973: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Updated pattern for new log filename format
Aaron Marcuse-Kubitza
09:47 AM Revision 6972: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Removed extra ./ before bin/import_times
Aaron Marcuse-Kubitza
09:46 AM Revision 6971: import_date: Added note that the time this outputs is the time the first special input *finished* importing. The import itself generally starts a few minutes before that, and the exact time is in that import's public schema comment.
Aaron Marcuse-Kubitza
09:41 AM Revision 6970: import_date: Removed duplicate Usage message at top of file, which is repeated in the Usage message provided when the program is run with no arguments
Aaron Marcuse-Kubitza
09:40 AM Revision 6969: Added import_date
Aaron Marcuse-Kubitza
09:38 AM Revision 6968: Added mtime
Aaron Marcuse-Kubitza
09:29 AM Revision 6967: lib/common.Makefile: System: Added $(mtime)
Aaron Marcuse-Kubitza
09:27 AM Revision 6966: lib/common.Makefile: $(date): Factored date format out into $(dateFmt)
Aaron Marcuse-Kubitza
09:25 AM Revision 6965: backups/Makefile: Factored $(isMac) out into lib/common.Makefile
Aaron Marcuse-Kubitza
08:51 AM Task #446: fix deadlock in INSERT IGNORE replacement
Added additional bug occurrence
Aaron Marcuse-Kubitza
08:30 AM Revision 6964: README.TXT: Data import: tailing logs: Updated pattern for new log filename format
Aaron Marcuse-Kubitza

12/19/2012

02:02 PM Revision 6963: schemas/Makefile: Installation: %/publish: Fixed bug where need quotes around source schema name
Aaron Marcuse-Kubitza
01:57 PM Revision 6962: README.TXT: Data import: Moved deletion of previous imports before the import, so that full DB backup can be automated
Aaron Marcuse-Kubitza
01:55 PM Revision 6961: README.TXT: Data import: `make backups/vegbien.$version.backup/test`: Added --exclude-schema=public to leave out the previous (now published) import so it doesn't bloat the backup. Note that public is included in the vegbien.$version.backup for the previous import, named according to its version.
Aaron Marcuse-Kubitza
01:49 PM Revision 6960: import_all: after_import(): Added `make backups/TNRS.backup-remake`
Aaron Marcuse-Kubitza
01:46 PM Revision 6959: README.TXT: Data import: Added step to publish the import to the public schema
Aaron Marcuse-Kubitza
01:42 PM Revision 6958: import_all: after_import(): Added export_analytical_db
Aaron Marcuse-Kubitza
01:36 PM Revision 6957: README.TXT: Data import: bin/export_analytical_db: Removed `env public=$version` because export_analytical_db now uses $version as $public when provided
Aaron Marcuse-Kubitza
01:35 PM Revision 6956: README.TXT: Data import: To remake analytical DB: Removed `env public=...` because $version (which replaces $public) is now set automatically by import_all
Aaron Marcuse-Kubitza
01:32 PM Revision 6955: schemas/Makefile: Installation: py_functions/install: Removed `env public=`, which is not needed since $(psqlAsAdminVegbien) does not use psql_script_vegbien (which uses $public)
Aaron Marcuse-Kubitza
01:28 PM Revision 6954: export_analytical_db: Use vegbien_dest to set the default value for $public
Aaron Marcuse-Kubitza
01:21 PM Revision 6953: README.TXT: Data import: If many inputs have errors: Updated command to `make schemas/$version/uninstall` because the current import's schema is now named $version
Aaron Marcuse-Kubitza
01:15 PM Revision 6952: schemas/Makefile: Installation: $(schemas), $(schemasReversed) (used e.g. by `make schemas/reinstall`): Removed public so that when `make schemas/reinstall` is run before an import, it will not remove any active (published) import which resides in the public schema
Aaron Marcuse-Kubitza
01:10 PM Revision 6951: README.TXT: Schema changes: Reinstall public separately from the other schemas so that it will still be reinstalled when schemas/reinstall excludes the public schema to avoid removing any active (published) import
Aaron Marcuse-Kubitza
01:01 PM Revision 6950: vegbien_dest callers: Removed no longer needed explicit setting of $prefix to "", because this is now the default value
Aaron Marcuse-Kubitza
01:00 PM Revision 6949: vegbien_dest: Changed default $prefix to "", so that the majority of callers don't need to manually set $prefix to "" to avoid it defaulting to out_
Aaron Marcuse-Kubitza
12:45 PM Revision 6948: README.TXT: Data import: Use env var $version, which is now set by import_all, instead of manually inserting the version for <version>
Aaron Marcuse-Kubitza
12:40 PM Revision 6947: vegbien_dest: Also export $version
Aaron Marcuse-Kubitza
12:30 PM Revision 6946: import_all: Run the import directly into a new, already-versioned public schema. This removes the need to manually rename the schema after import, and allows the backup commands to use the stored $version shell variable to refer to the last import.
Aaron Marcuse-Kubitza
12:25 PM Revision 6945: schemas/Makefile: %/publish: Added instruction to run `unset version` after the command, to clear the $version shell variable which will be set by import_all
Aaron Marcuse-Kubitza
12:12 PM Revision 6944: README.TXT: Data import: Replaced <import_name> with <version> because the import name is now just the version
Aaron Marcuse-Kubitza
12:10 PM Revision 6943: README.TXT: Data import: Replaced r<revision> with <version> because the version string is now equal to r<revision>
Aaron Marcuse-Kubitza
12:09 PM Revision 6942: README.TXT: Backups: Replaced <date> with <version> because the date is no longer included in the version string
Aaron Marcuse-Kubitza
12:08 PM Revision 6941: README.TXT: Name archived imports without the "public." prefix so that their backups will work with the new `make backups/%.backup/remove` command, which does not add back the prefix
Aaron Marcuse-Kubitza
11:56 AM Revision 6940: backups/Makefile: $(public*): Don't add a "public." prefix to get the name of the public schema
Aaron Marcuse-Kubitza
11:40 AM Revision 6939: backups/Makefile: Removed no longer used $(rmSchema)
Aaron Marcuse-Kubitza
11:39 AM Revision 6938: backups/Makefile: Use $(rmSchemaCmd) from lib/common.Makefile instead of $(rmSchema)
Aaron Marcuse-Kubitza
11:20 AM Revision 6937: vegbien_dest: Use $version as $public when $public not provided. When neither is provided, continue to use "public" and also set $version to that.
Aaron Marcuse-Kubitza
11:11 AM Revision 6936: schemas/Makefile: Installation: rotate: Use just the version, without the "public." prefix
Aaron Marcuse-Kubitza
11:04 AM Revision 6935: schemas/Makefile: Installation: `public/install public%/install`: Generalized to %/install to allow public schema versions with any name. This requires moving `%/install: %.sql` before it to override it.
Aaron Marcuse-Kubitza
11:00 AM Revision 6934: schemas/Makefile: Installation: Merged public/install and public%/install
Aaron Marcuse-Kubitza
10:54 AM Revision 6933: schemas/Makefile: Installation: Moved %/uninstall to beginning of section because it applies to all schemas
Aaron Marcuse-Kubitza
10:52 AM Revision 6932: schemas/Makefile: Installation: public: Generalized public%/publish to %/publish so that public schema versions don't have to start with public_
Aaron Marcuse-Kubitza
10:34 AM Revision 6931: schemas/Makefile: Installation: %/uninstall: Also display schema delete confirmation for schemas whose name is just the version suffix (r<revision #>)
Aaron Marcuse-Kubitza
10:32 AM Revision 6930: schemas/Makefile: Merged public%/uninstall and %/uninstall
Aaron Marcuse-Kubitza
09:49 AM Revision 6929: lib/common.Makefile: Added version target, which prints the current $(version) value
Aaron Marcuse-Kubitza
09:36 AM Revision 6928: schemas/Makefile: Installation: public: public%/uninstall: Fixed bug where need to remove the *specified* version of the public schema, not public itself. Generalized $(confirmRmPublicSchema) so it could also be used for named versions of the public schema. Inlined $(rmPublicSchema) since it's now only used in one place.
Aaron Marcuse-Kubitza
09:26 AM Revision 6927: lib/common.Makefile: Revisions: $(version): Use just the revision # to avoid cluttering the schema and log file names with long datetime strings
Aaron Marcuse-Kubitza
09:25 AM Revision 6926: schemas/Makefile: public%/install: schema comment: Include current date/time after version
Aaron Marcuse-Kubitza
09:20 AM Revision 6925: lib/common.Makefile: Replaced no longer used $(date) with function to generate human-readable text date (rather than date to put in filename). Removed leading zeros from date and hour. Added timezone.
Aaron Marcuse-Kubitza
09:07 AM Revision 6924: backups/Makefile: Removed no longer used $(dateFmt), $(mtime)
Aaron Marcuse-Kubitza
08:59 AM Revision 6923: backups/Makefile: Removed %.backup/rotate, because this incorrectly causes the current time rather than the version to be used in the backup filename. The version should instead be specified in the backup filename when it's created.
Aaron Marcuse-Kubitza
08:51 AM Revision 6922: schemas/Makefile: Installation: public: Added public%/publish to replace the current public schema with the given version
Aaron Marcuse-Kubitza
08:37 AM Revision 6921: schemas/Makefile: Installation: public: public/uninstall: Added public%/uninstall as a target to allow uninstalling versions of the public schema
Aaron Marcuse-Kubitza
08:30 AM Revision 6920: schemas/Makefile: Installation: public: public%/install: Add a comment on the schema containing the versioned schema name, so that if the schema is later renamed to just public (i.e. "published" as the current version), it will still be possible to tell which version the public schema came from
Aaron Marcuse-Kubitza
08:22 AM Revision 6919: schemas/Makefile: Installation: public: Added public%/install, to install a version of the public schema
Aaron Marcuse-Kubitza
07:59 AM Revision 6918: schemas/Makefile: Removed unused $(os)
Aaron Marcuse-Kubitza
07:58 AM Revision 6917: schemas/Makefile: Removed unused $(SED)
Aaron Marcuse-Kubitza
06:22 AM Revision 6916: Moved schemas-related commands from root Makefile to schemas/Makefile
Aaron Marcuse-Kubitza
06:15 AM Revision 6915: Makefiles: Factored out common vars/functions into lib/common.Makefile
Aaron Marcuse-Kubitza
05:59 AM Revision 6914: root Makefile: $(psqlNoSearchPath): Merged $(psqlAsBien) into it because it's the only place $(psqlAsBien) is used
Aaron Marcuse-Kubitza
05:56 AM Revision 6913: root Makefile: $(psqlAsBien): Use psql_script_vegbien instead of psql_vegbien, which adds $(psqlOpts) itself
Aaron Marcuse-Kubitza
05:50 AM Revision 6912: schemas/Makefile: Include lib/common.Makefile
Aaron Marcuse-Kubitza
05:23 AM Revision 6911: inputs/import.stats.xls: Reformatted so the first by column import and the comparison by row import will fit on the same page when printed on portrait-mode letter paper
Aaron Marcuse-Kubitza
05:10 AM Revision 6910: inputs/import.stats.xls: Changed import type labels to By row/By column so they would fit into one field, leaving the extra field free to contain the revision #
Aaron Marcuse-Kubitza
05:02 AM Revision 6909: lib/common.Makefile: Revisions: Allow $(version) to be overridden in the environment, so that the public schema and all log files share the same, pregenerated version
Aaron Marcuse-Kubitza
04:16 AM Revision 6908: schemas/vegbien.sql: Merged provider_view, provider_count, and owner_count into provider_count, using the combining query for Brad's data providers page at <http://bien.nceas.ucsb.edu/bien/people/data-providers/>
Aaron Marcuse-Kubitza
01:23 AM Revision 6907: schemas/vegbien.sql: sync_taxon_trait_to_view(): Changed pkey to index because there can be multiple values of the same taxon's trait from different observations
Aaron Marcuse-Kubitza
01:16 AM Revision 6906: mappings/Makefile: VegCore.csv: Filter out the VegCore tables so they are not matched as terms. This is necessary because some terms have the same name as a table, but the term should be the match rather than the table.
Aaron Marcuse-Kubitza
12:29 AM Revision 6905: sql.py: DbConn.col_info(): raising sql_gen.NoUnderlyingTableException: Fixed bug where also need to catch DoesNotExistException, which is thrown by ::regclass
Aaron Marcuse-Kubitza
12:26 AM Revision 6904: sql.py: DbConn.col_info(): Fixed bug where need to run run_query() recoverably, because this query throws an exception if the column's table does not exist (the information_schema query just returned no rows)
Aaron Marcuse-Kubitza
12:22 AM Revision 6903: sql.py: DbConn.col_info(): Fixed bug where need to use pg_get_expr() on pg_attrdef.adbin instead of shortcut field adsrc, because adsrc does not include schema qualifiers on table names (including strings passed to `nextval('..._seq'::regclass)`)
Aaron Marcuse-Kubitza
 
