# Date Author Comment
7341 01/22/2013 07:43 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem data but with stem mappings

7340 01/22/2013 07:36 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem data but with stem mappings

7339 01/22/2013 07:34 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem data but with stem mappings

7338 01/22/2013 07:16 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphoname for cases when there is just a morphoname, and to distinguish taxonverbatims with the same taxonlabel but different morphonames

7337 01/22/2013 07:11 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonverbatim: Allow taxonlabel_id to be NULL when morphoname is provided

7336 01/22/2013 07:09 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonverbatim: Allow taxonlabel_id to be NULL when morphoname is provided

7335 01/22/2013 07:04 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonverbatim: Added source_id to allow creating taxonverbatims without a (scoping) taxonlabel

7334 01/22/2013 05:34 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: Removed speciesBinomialWithMorphospecies now that it's duplicated by scientificNameWithMorphospecies

7333 01/22/2013 05:28 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Create it using the speciesBinomialWithMorphospecies formula, per Brad's request at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#2013-1-18>

7332 01/22/2013 05:05 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: Added coordinateSource to indicate whether coordinates are from county_centroids (georeferencing) or the source data

7331 01/22/2013 05:00 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Added coordinatesource enum

7330 01/22/2013 04:50 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Regenerated from wiki

7329 01/22/2013 04:34 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: coordinates: Also use the county_centroids coordinates when the datasource coordinates are not geovalid. (Note that canon_place.geovalid will be NULL, i.e. not true, when the datasource coordinates are NULL.)

7328 01/22/2013 04:28 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: scientificName: Set to taxonverbatim.taxonname instead, per Brad's changes at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#2013-1-18>. Renamed to taxonName since this now doesn't include the author, which is part of DwC's scientificName field.

7327 01/22/2013 03:55 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sync_analytical_stem_to_view(): Support running the function when dependent views do not exist. This allows using the sync function when changing column names of the analytical_stem_view, which sometimes requires manually dropping and re-creating the analytical_aggregate_view.

7326 01/22/2013 02:49 PM Aaron Marcuse-Kubitza

backups/Makefile: %.md5/test: Added comment to run with `make -s` to avoid echoing make commands

7325 01/22/2013 02:42 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: Added steps to scrub unscrubbed taxondeterminations (if they are not scrubbed automatically)

7324 01/22/2013 02:06 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/_src/README.TXT: Added e-mails from Jim about how the county_centroids data was generated

7323 01/22/2013 01:18 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: coordinates: Use new county_centroids coordinates and uncertainty when the datasource's coordinates are not available

7322 01/22/2013 01:10 PM Aaron Marcuse-Kubitza

Added inputs/.geoscrub/county_centroids/ from Jim

7321 01/22/2013 01:09 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/import_order.txt: Added geoscrub_output

7320 01/22/2013 12:24 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7319 01/22/2013 12:19 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: In PostgreSQL: Added step to check that there are TNRS taxondeterminations

7318 01/22/2013 12:17 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: In PostgreSQL: Added step to check that unscrubbed_taxondetermination_view returns no rows

7317 01/18/2013 02:37 PM Aaron Marcuse-Kubitza

Added inputs/newWorld/newWorldCountries/_no_import

7316 01/18/2013 02:33 PM Aaron Marcuse-Kubitza

to_do/timeline.2013.xls: Updated with Brad's modifications

7315 01/18/2013 02:18 PM Aaron Marcuse-Kubitza

Added inputs/FIA/_src/FIA_summary.b-e.00079.pdf from Bob

7314 01/18/2013 02:07 PM Aaron Marcuse-Kubitza

Added inputs/.herbaria/_archive/

7313 01/18/2013 01:02 PM Aaron Marcuse-Kubitza

inputs/.herbaria/: Removed no longer needed geoscrub.*.sql, which has been replaced with bien3_adb.*.sql

7312 01/18/2013 01:00 PM Aaron Marcuse-Kubitza

inputs/.herbaria/: Removed no longer needed herbaria/. Use ih/ instead.

7311 01/18/2013 12:58 PM Aaron Marcuse-Kubitza

Added inputs/.herbaria/ih/ and corresponding bien3_adb MySQL export

7310 01/18/2013 12:43 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Don't create NCBI crosslinks for the matched taxonomic name. These crosslinks are no longer needed now that TNRS provides a separate accepted name on which crosslinks can be made.

7309 01/18/2013 12:32 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: unscrubbed_taxondetermination_view: Include the accepted name's row next to the matched name's row instead of merging the two together into one TNRS row, to allow including separate taxondeterminations for the matched and accepted names. Added Max_score from TNRS.tnrs.

7308 01/18/2013 12:25 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination_set_iscurrent(): Added new determinationtype accepted to sort order

7307 01/18/2013 12:01 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped accepted* taxonomic name, now to separate accepted taxondetermination

7306 01/18/2013 11:35 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Regenerated from wiki

7305 01/18/2013 11:20 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination_set_iscurrent(): Changed TNRS determinationtype from computer to matched, to allow for a separate accepted determinationtype

7304 01/18/2013 10:57 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonlabel: Removed creationdate, which duplicates taxondetermination.determinationdate

7303 01/18/2013 10:08 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: isNewWorld: Removed no longer needed COALESCE to false, because newWorldCountries now uses false where applicable instead of NULL. This also ensures that isNewWorld will be NULL if there is no country name to test, which was not the case in the previous workaround.

7302 01/18/2013 10:02 AM Aaron Marcuse-Kubitza

Added inputs/newWorld/newWorldCountries/ with postprocess.sql that sets isNewWorld to false wherever it's NULL. (The input table only marks New World countries as true, but doesn't mark non-New World countries as false.)

7301 01/18/2013 09:50 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: isNewWorld: Fixed bug where need to COALESCE "newWorldCountries"."isNewWorld" to false, because it is only set to a boolean for countries that are New World

7300 01/18/2013 09:19 AM Aaron Marcuse-Kubitza

README.TXT: Full database import: freeing disk space: Updated import schema size, which is smaller due to the removed CTFS staging tables, removed duplicate rows, and possibly fewer index holes

7299 01/18/2013 08:56 AM Aaron Marcuse-Kubitza

README.TXT: Full database import: After running `make schemas/$version/publish`, added `unset version` to make sure future version-dependent commands use the public schema

7298 01/18/2013 08:50 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxon_trait_view: Fixed bug where measurementUnit needed to be set to trait.units, not name

7297 01/18/2013 08:42 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: provider_count_view: Don't set default values for sourcetype/observationtype, because the appropriate values are now set for all top-level inputs and these defaults are not applicable for data owners not in geoscrub.herbaria

7296 01/18/2013 08:41 AM Aaron Marcuse-Kubitza

inputs/bien2_traits/Source/map.csv: Mapped observationType

7295 01/18/2013 08:27 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination: Removed taxondetermination_computer_min_fit CHECK constraint, whose functionality is now duplicated by unscrubbed_taxondetermination_view's Max_score filter condition. The score threshold value should only be maintained in one place, namely unscrubbed_taxondetermination_view.

7294 01/18/2013 08:23 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to filter out any names that will be rejected by taxondetermination's constraints, because otherwise, these names will stay in unscrubbed_taxondetermination_view and be repeatedly reimported

7293 01/18/2013 07:38 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: Added Max_score column for use in filtering out names that will be rejected by taxondetermination's constraints

7292 01/18/2013 07:22 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: Renamed tnrs_populate_accepted_scientific_name() trigger to tnrs_populate_derived_fields() to accommodate additional derived fields

7291 01/18/2013 07:14 AM Aaron Marcuse-Kubitza

tnrs_db: Support multiple appended columns in the tnrs table

7290 01/18/2013 07:13 AM Aaron Marcuse-Kubitza

csvs.py: ColInsertFilter: Support adding multiple, consecutive columns

7289 01/18/2013 06:30 AM Aaron Marcuse-Kubitza

schemas/functions.sql: _max(), _min(): Put $n params all on one line to match other aggregating functions

7288 01/18/2013 06:28 AM Aaron Marcuse-Kubitza

schemas/functions.sql: _max(), _min(): Use PostgreSQL built-in functions GREATEST, LEAST instead of a query with aggregating functions

7287 01/18/2013 06:02 AM Aaron Marcuse-Kubitza

README.TXT: Added Single datasource import section with commands to import/reimport/scrub just a datasource rather than the full DB

7286 01/18/2013 05:54 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent_on_delete() trigger: Fixed bug where need to suppress any foreign key exception, which occurs during a cascading delete because the associated taxonoccurrence has already been deleted, preventing any other taxondeterminations of that taxonoccurrence from being updated

7285 01/18/2013 05:35 AM Aaron Marcuse-Kubitza

input.Makefile: Taxonomic scrubbing: Added reimport_scrub

7284 01/18/2013 05:34 AM Aaron Marcuse-Kubitza

input.Makefile: Import to VegBIEN: Added reimport

7283 01/18/2013 05:28 AM Aaron Marcuse-Kubitza

input.Makefile: Taxonomic scrubbing: Added rescrub

7282 01/18/2013 05:21 AM Aaron Marcuse-Kubitza

input.Makefile: Taxonomic scrubbing: Added scrub target and use it in import_scrub

7281 01/18/2013 05:18 AM Aaron Marcuse-Kubitza

input.Makefile: Import to VegBIEN: Moved import, rm to top of section since they are top-level targets and don't depend on the variables defined for %/import

7280 01/18/2013 05:17 AM Aaron Marcuse-Kubitza

input.Makefile: Moved rm to Import to VegBIEN section

7279 01/18/2013 05:16 AM Aaron Marcuse-Kubitza

input.Makefile: Moved taxonomic scrubbing targets to separate Taxonomic scrubbing section

7278 01/18/2013 04:43 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7277 01/18/2013 03:34 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: provider_count_view: Include only sources with at least one row. Currently (as of r7023), all entries in BIEN2's geoscrub.herbaria are also in VegBIEN, so the filter is not yet necessary, but switching to bien3_adb.ih could create source entries without data rows which should be excluded from the providers list.

7276 01/18/2013 03:25 AM Aaron Marcuse-Kubitza

import_all: Output the PIDs of the import_scrub and after_import processes, so those processes can be managed without shell job control. This is useful if the connection is lost to the remote shell running the import, which prevents using job control on the import processes.

7275 01/18/2013 01:23 AM Aaron Marcuse-Kubitza

input.Makefile: Import to VegBIEN: import_scrub: Run `make scrub` in the background, to allow the import to continue with the next table rather than having to wait for the current table to be scrubbed

7274 01/18/2013 12:53 AM Aaron Marcuse-Kubitza

inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Moved waitself call to top of script

7273 01/18/2013 12:52 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7272 01/18/2013 12:24 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Added Postprocessing section for use with the next import

7271 01/18/2013 12:05 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. Total does not yet include postprocessing.

7270 01/17/2013 11:29 PM Aaron Marcuse-Kubitza

import_times: Add blank line before "Postprocessing logs" to separate it from the input logs

7269 01/17/2013 11:28 PM Aaron Marcuse-Kubitza

import_times: Separate out the postprocessing logs (e.g. public.unscrubbed_taxondetermination_view), as the import times in these logs are not aggregated together (each input has its own run of the postprocessing script)

7268 01/16/2013 02:55 PM Aaron Marcuse-Kubitza

root Makefile: Datasources: import: Use new import_scrub instead of import (input.Makefile)

7267 01/16/2013 02:51 PM Aaron Marcuse-Kubitza

import_all: Use new import_scrub (input.Makefile) instead of import, which avoids needing to start background processes for tnrs-remake and scrub-remake

7266 01/16/2013 02:50 PM Aaron Marcuse-Kubitza

inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Fixed bug where need to use tnrs.make's lockfile instead because can't be importing while tnrs.make is scrubbing. tnrs.make leaves tnrs in an incomplete state while running because the accepted names are parsed after their matched names. Using a separate lockfile would cause some accepted names to be missing.

7265 01/16/2013 02:27 PM Aaron Marcuse-Kubitza

input.Makefile: Import to VegBIEN: Added import_scrub, which runs `make scrub` after the import

7264 01/16/2013 02:26 PM Aaron Marcuse-Kubitza

root Makefile: Datasources: Added scrub, which runs tnrs-remake and scrub-remake

7263 01/16/2013 02:18 PM Aaron Marcuse-Kubitza

inputs/.TNRS/*/*.make: Only allow one instance of the script to be running at any time, by using new waitself

7262 01/16/2013 02:15 PM Aaron Marcuse-Kubitza

waitpid, lockfile: Changed $interval default to 5s to work with smaller imports, where less waiting is needed

7261 01/16/2013 02:14 PM Aaron Marcuse-Kubitza

Added waitself

7260 01/16/2013 02:11 PM Aaron Marcuse-Kubitza

bin/lockfile: Include the PID in the lockfile to avoid the need to manually remove lockfiles. On Mac, this requires using shlock instead of lockfile.
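The idea of storing the PID in the lockfile can be sketched as follows. This is a minimal illustration in Python, not the contents of bin/lockfile (which wraps `lockfile` or `shlock`); the function names `pid_alive`, `acquire`, and `release` are hypothetical. It assumes a POSIX system, and the stale-lock removal step is not race-free.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire(path: str) -> bool:
    """Try to take the lock; reclaim it if the recorded owner has exited."""
    for _ in range(2):  # the second pass runs only after removing a stale lock
        try:
            # O_EXCL makes creation fail if the lockfile already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip())
            except (OSError, ValueError):
                return False  # unreadable or half-written: treat as held
            if pid_alive(owner):
                return False  # a live process holds the lock
            try:
                os.unlink(path)  # stale lock: the owner is gone
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return True
    return False


def release(path: str) -> None:
    """Remove the lockfile."""
    os.unlink(path)
```

Because the owner's PID is recorded, a lock left behind by a crashed process can be detected and reclaimed instead of having to be removed by hand.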

7259 01/16/2013 01:35 PM Aaron Marcuse-Kubitza

Added bin/lockfile

7258 01/16/2013 01:34 PM Aaron Marcuse-Kubitza

Added pid2name

7257 01/16/2013 01:33 PM Aaron Marcuse-Kubitza

Added name2pids

7256 01/16/2013 01:33 PM Aaron Marcuse-Kubitza

waitpid: Use `ps` instead of /proc to also work on Mac

7255 01/16/2013 01:07 PM Aaron Marcuse-Kubitza

inputs/.TNRS/tnrs/tnrs.make: Fixed bug where need special handling to support being run as a .make script

7254 01/16/2013 11:59 AM Aaron Marcuse-Kubitza

inputs/.geoscrub/_src/README.TXT: Added dates for e-mails from Jim

7253 01/16/2013 11:57 AM Aaron Marcuse-Kubitza

inputs/.geoscrub/_src/README.TXT: Added e-mail from Jim about repository with scripts to generate the geoscrub_output table

7252 01/16/2013 11:02 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to use tnrs_accepted.Name_submitted IS NOT NULL rather than tnrs_accepted.* IS NOT NULL, because tnrs_accepted.* (which plain tnrs_accepted gets changed to by PostgreSQL) checks each field of the tnrs_accepted tuple rather than checking if the tuple itself is NULL
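The row-level NULL semantics behind this fix can be modeled in a few lines. This is only a Python model of how PostgreSQL evaluates `IS NULL` and `IS NOT NULL` on a composite value, with a row represented as a tuple and SQL NULL as `None`; the function names and the sample row are hypothetical.

```python
from typing import Optional, Tuple


def row_is_null(row: Optional[Tuple]) -> bool:
    # row IS NULL: true when the row itself is NULL or every field is NULL.
    return row is None or all(value is None for value in row)


def row_is_not_null(row: Optional[Tuple]) -> bool:
    # row IS NOT NULL: true only when every field is non-NULL.
    return row is not None and all(value is not None for value in row)


# A joined row that matched but has one NULL field (made-up values).
matched_with_gap = ("Poa annua", None)
# The all-NULL row produced for the outer side of a join with no match.
unmatched = (None, None)

# The partially filled row is neither "NULL" nor "NOT NULL" at the row level,
# so a row-level IS NOT NULL test would wrongly treat it as unmatched.
assert not row_is_null(matched_with_gap)
assert not row_is_not_null(matched_with_gap)
assert row_is_null(unmatched)

# Testing a single column that is always set on matched rows avoids this.
assert matched_with_gap[0] is not None
```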

7251 01/16/2013 10:23 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: Added back tnrs+accepted view, which is useful for debugging the import of the TNRS results

7250 01/16/2013 09:21 AM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: Added back ARIZ, NY because some REMIB specimens for these datasources are not yet in the datasources themselves

7249 01/16/2013 08:43 AM Aaron Marcuse-Kubitza

Added inputs/REMIB/Specimen/postprocess.sql to remove institutions that we have direct data for

7248 01/16/2013 08:43 AM Aaron Marcuse-Kubitza

Placed inputs/REMIB/_archive/ under version control

7247 01/16/2013 08:23 AM Aaron Marcuse-Kubitza

Added inputs/SpeciesLink/Specimen/postprocess.sql to remove institutions that we have direct data for

7246 01/16/2013 08:21 AM Aaron Marcuse-Kubitza

Placed inputs/SpeciesLink/_archive/ under version control

7245 01/16/2013 07:56 AM Aaron Marcuse-Kubitza

input.Makefile: $(import?): Renamed $public_import option to $full_import because it applies to any import of all datasources, not just a public import on vegbiendev

7244 01/16/2013 07:23 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_stem_view: Changed `WHERE COALESCE` to a join condition to enable using the taxondetermination_single_current_determination index, which produces the filtered rows directly. Note that this index will not be used for full-database imports, because the query planner uses hash joins everywhere instead of nested loops.

7243 01/16/2013 06:47 AM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Fixed bug where for views, shouldn't advance start (OFFSET clause) after each chunk, because views are typically dynamic and will contain a new set of rows after the first set is imported
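The effect of advancing the offset on a dynamic view can be sketched as follows. This is an illustration under stated assumptions, not the code of put_table(): `PendingView` and `import_in_chunks` are hypothetical names, and the view is assumed to list only rows that have not been imported yet, so each imported row drops out of it.

```python
from typing import List


class PendingView:
    """Hypothetical stand-in for a dynamic view of not-yet-imported rows."""

    def __init__(self, rows: List[int]) -> None:
        self.pending = list(rows)

    def fetch(self, offset: int, limit: int) -> List[int]:
        # Plays the role of SELECT ... LIMIT limit OFFSET offset on the view.
        return self.pending[offset:offset + limit]

    def put(self, rows: List[int]) -> None:
        # Importing a row removes it from the view's result set.
        for row in rows:
            self.pending.remove(row)


def import_in_chunks(source: PendingView, chunk_size: int,
                     advance_offset: bool) -> int:
    """Import rows chunk by chunk and return how many were imported."""
    offset = 0
    imported = 0
    while True:
        rows = source.fetch(offset, chunk_size)
        if not rows:
            break
        source.put(rows)
        imported += len(rows)
        if advance_offset:
            # Correct for a static table, but skips rows on a dynamic view,
            # because the remaining rows have shifted down to offset 0.
            offset += len(rows)
    return imported


fixed = PendingView(list(range(5)))
print(import_in_chunks(fixed, 2, advance_offset=False))  # all 5 rows imported

buggy = PendingView(list(range(5)))
print(import_in_chunks(buggy, 2, advance_offset=True))   # only 3 rows imported
print(buggy.pending)                                     # rows 2 and 3 were skipped
```

Keeping the offset at zero terminates only if imported rows really do leave the view; a row that stays in the view would be fetched again on every pass.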

7242 01/16/2013 06:41 AM Aaron Marcuse-Kubitza

sql.py: Added view_exists()