Project

# Date Author Comment
12038 02/04/2014 10:01 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/project/: added _no_import since this should not also be imported separately from taxon_observation.**

12018 02/02/2014 12:49 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: add!: verify/: also svn:ignore *.tsv, *.txt

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11955 01/15/2014 09:23 AM Aaron Marcuse-Kubitza

inputs/VegBank/vegbank.~.clean_up.sql, inputs/CVS/cvs.~.clean_up.sql: Prevent "column name specified more than once" errors when tables are joined: put tables in alphabetical order for consistency

11931 12/20/2013 02:56 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql, map.csv: added new project columns

11930 12/20/2013 02:44 PM Aaron Marcuse-Kubitza

inputs/CVS/taxon_observation.**/postprocess.sql: added the project table

11929 12/20/2013 02:42 PM Aaron Marcuse-Kubitza

inputs/CVS/project/map.csv: mapped stopDate->projectEndDate

11928 12/20/2013 02:35 PM Aaron Marcuse-Kubitza

mapped inputs/CVS/project/, which includes the projectName for attribution

11917 12/16/2013 07:05 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: added Mike Lee's additional plots used to validate confidentiality-related fields (wiki.vegpath.org/CVS_validation#plots-to-include)

11916 12/16/2013 06:00 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/^taxon_observation.**.sample/create.sql: include taxonName in the subset of columns that's imported for the validation, because it is _alt-ed with scientificName for forming the TNRS input name. this is unique to CVS, which is why it was not part of the validation subset copied from the VegBank subset.

11905 12/11/2013 10:54 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/^taxon_observation.**.sample/: added _no_import because this table duplicates part of what's imported from taxon_observation.**

11881 12/09/2013 07:24 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: reran test, which added yearCollected/monthCollected/dayCollected

11880 12/09/2013 07:23 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/create.sql: documented runtime (3 min)

11842 12/05/2013 12:27 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: remove plot.realLatitude/realLongitude, since this is private data that should not be publicly visible

11841 12/05/2013 12:19 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: remove plot.realLatitude/realLongitude, since this is private data that should not be publicly visible

11820 12/04/2013 04:57 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: uncommented identifiedBy since this is now part of taxonObservation_

11819 12/04/2013 04:08 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/observation_community/create.sql: communityName: populate from commConcept.commName instead, because commInterpretation.commname is not always populated. this requires left-joining to commConcept.

11818 12/04/2013 03:58 PM Aaron Marcuse-Kubitza

inputs/CVS/observation_community/map.csv: updated output column names to new input column names, to avoid later output column collisions

11817 12/04/2013 03:42 PM Aaron Marcuse-Kubitza

inputs/CVS/observation_community/header.csv, map.csv: updated input column names for cvs.~.clean_up.sql renamings

11815 12/04/2013 04:19 AM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: commClass, commConcept fields: prepend table name to avoid inter-table collisions upon join

11814 12/04/2013 03:43 AM Aaron Marcuse-Kubitza

added inputs/CVS/observation_community/, as for VegBank

11813 12/04/2013 03:32 AM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: commClass.dba_src_ID: prepend table name to avoid inter-table collisions upon join

11812 12/03/2013 04:32 PM Aaron Marcuse-Kubitza

added inputs/CVS/observationContributor_/, which adds the people collecting the plot

11811 12/03/2013 04:02 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: observationContributor.dba_src_ID: prepended table name to avoid collision when left-joining to party

11809 12/03/2013 02:57 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/create.sql: mapped identifiedBy, which involves joining to party

11808 12/03/2013 02:35 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: don't rename taxonInterpretation.PARTY_ID, so that this can be USING-joined to party in inputs/CVS/taxonObservation_/create.sql

11805 12/03/2013 08:25 AM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/map.csv: synced output columns to input columns (which removes the extra *s)

11804 12/03/2013 08:00 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/postprocess.sql: locality: include the site name (authorLocation), because this is part of the unique specification of the place that was sampled, and Bob wants this to be included in VegBIEN

11803 12/03/2013 07:58 AM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: removed parentLocationID, since this is unused in CVS

11799 12/03/2013 05:19 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxon_observation.**/map.csv: omit authorPlantName because it is not specific to the taxonInterpretation row (this is in a separate taxonInterpretation for the original determination instead)

11796 12/02/2013 02:46 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/map.csv: PARENT_ID: remapped to UNUSED, to clarify that subplots are not implemented through this field

11788 11/26/2013 11:11 PM Aaron Marcuse-Kubitza

**/new_terms.csv, unmapped_terms.csv updated (using `make missing_mappings`)

11776 11/26/2013 01:55 PM Aaron Marcuse-Kubitza

added inputs/CVS/^taxon_observation.**.sample/, used for the extract. note that the column list is slightly different than for VegBank.

11775 11/26/2013 01:42 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: removed taxonObservation_-- prefix from terms that do not need to be table-specific (like for VegBank)

11774 11/26/2013 01:32 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/map.csv: plantConcept_ columns: synced input and output column names to their names in plantConcept_

11773 11/26/2013 01:30 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/map.csv: plantConcept_ columns: synced input and output column names to their names in plantConcept_

11772 11/26/2013 01:26 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/map.csv: removed plantConcept_-- prefix from terms that do not need to be table-specific (like for VegBank)

11761 11/26/2013 05:56 AM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/import_order.txt: added taxon_observation.**

11760 11/26/2013 05:54 AM Aaron Marcuse-Kubitza

inputs/CVS/: don't import joined tables, because they are now imported in the taxon_observation.** left-join instead

11759 11/26/2013 05:53 AM Aaron Marcuse-Kubitza

inputs/CVS/: added taxon_observation.** left-join of the tables, using the steps at http://wiki.vegpath.org/Left-joining_a_datasource. this involves renaming taxonOccurrenceID->taxonOccurrenceID__overall_plot so that it can then be joined together with aggregateOrganismObservationID to create the full taxonOccurrenceID (as in VegBank).

11758 11/26/2013 05:46 AM Aaron Marcuse-Kubitza

inputs/CVS/stemCount_/map.csv: remapped stratum_ID->*STRATUM_ID so it would match up with stratum.*STRATUM_ID

11757 11/25/2013 10:14 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: mapped TAXONINTERPRETATION_ID to identificationID

11756 11/25/2013 10:03 PM Aaron Marcuse-Kubitza

added inputs/CVS/stratum/

11755 11/25/2013 10:02 PM Aaron Marcuse-Kubitza

added inputs/CVS/stratumType/

11754 11/25/2013 09:43 PM Aaron Marcuse-Kubitza

inputs/CVS/: prepended the table name to each column name to prevent column collisions, using the steps at http://wiki.vegpath.org/Left-joining_a_datasource
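
The renaming described above can be pictured with a minimal sketch. It is not the project's clean_up SQL: the helper `prefix_columns`, the `_` separator, and the sample rows are all assumptions made for illustration. It shows why prefixing keeps same-named columns apart after a join, and why a join key is left unprefixed so that it can still be matched.

```python
def prefix_columns(table, row, sep="_", keep=()):
    """Prepend `table` to every column name in `row`, except those in `keep`."""
    return {
        (name if name in keep else f"{table}{sep}{name}"): value
        for name, value in row.items()
    }


# Hypothetical rows from two tables that share a join key and a colliding column.
comm_class = {"COMMCLASS_ID": 7, "dba_src_ID": "class-src"}
comm_interp = {"COMMCLASS_ID": 7, "dba_src_ID": "interp-src"}

join_key = ("COMMCLASS_ID",)
joined = {
    **prefix_columns("commClass", comm_class, keep=join_key),
    **prefix_columns("commInterpretation", comm_interp, keep=join_key),
}
print(sorted(joined))
```

Without the prefix, the second `dba_src_ID` would silently overwrite the first when the rows are merged.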

11753 11/25/2013 08:07 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/plantConcept_/map.csv: PLANTCONCEPT_ID: remapped without * prefix so that the USING join in inputs/CVS/taxonObservation_/create.sql would continue to work

11752 11/25/2013 08:05 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/header.csv, map.csv: updated to use plantConcept_ renamed columns

11751 11/25/2013 08:03 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/plantConcept_/map.csv: PLANTCONCEPT_ID: remapped without * prefix so that the USING join in inputs/CVS/taxonObservation_/create.sql would continue to work

11749 11/25/2013 07:52 PM Aaron Marcuse-Kubitza

inputs/CVS/: switched to new-style import, using the steps at http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource

11748 11/25/2013 07:32 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: updated for CVS refresh

11747 11/25/2013 07:17 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: updated input column names to plantConcept_ renamings

11746 11/25/2013 07:06 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/header.csv, map.csv: updated for CVS refresh

11745 11/25/2013 06:51 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/map.csv: removed filter-less collisions. note that the name county_ is assigned in plot_/create.sql, not cvs.~.clean_up.sql as one might expect, because this is a generated column.

11744 11/25/2013 06:42 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/map.csv: removed filter-less collisions

11743 11/25/2013 06:41 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/map.csv: removed filter-less collisions

11742 11/25/2013 05:32 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/map.csv: moved inherited derived columns to right after the other columns, because for this table, these are actually real input columns rather than appended derived columns. the column order must match header.csv to avoid mis-renamings.

11741 11/25/2013 04:51 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: removed filter functions, which are now performed in plantConcept_

11740 11/25/2013 04:43 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/postprocess.sql: added _parent index to facilitate joins

11739 11/25/2013 04:24 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/header.csv, map.csv: updated for CVS refresh and addition of plantConcept_ derived columns

11738 11/25/2013 03:22 PM Aaron Marcuse-Kubitza

inputs/CVS/stemCount_/: translated filters to postprocessing derived columns, using the steps at http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource#1-Translate-filters-to-postprocessing-derived-columns. note that the inserted row count changes, because there is now a primary key (which the table is auto-sorted by) where previously there was none.

11729 11/21/2013 05:20 PM Aaron Marcuse-Kubitza

inputs/CVS/plot_/: translated column filters to postprocessing derived columns, using the steps at http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource#1-Translate-filters-to-postprocessing-derived-columns

11727 11/21/2013 04:27 PM Aaron Marcuse-Kubitza

inputs/CVS/plot_/postprocess.sql: added pkey from the primary joined table

11726 11/21/2013 04:11 PM Aaron Marcuse-Kubitza

inputs/CVS/plot_/map.csv: documented assumptions about the units of fields

11725 11/21/2013 03:52 PM Aaron Marcuse-Kubitza

inputs/CVS/plot_/map.csv: documented assumptions about the units and meaning of numeric codes for fields

11724 11/21/2013 03:01 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/: translated multi-column filters to postprocessing derived columns, using the steps at http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource#1-Translate-filters-to-postprocessing-derived-columns

11723 11/21/2013 02:54 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/: translated multi-column filters to postprocessing derived columns, using the steps at http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource#1-Translate-filters-to-postprocessing-derived-columns

11721 11/21/2013 01:58 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/postprocess.sql: added pkey from the primary joined table

11720 11/21/2013 01:11 PM Aaron Marcuse-Kubitza

inputs/CVS/observation_/postprocess.sql: added pkey from the primary joined table. added _parent index to facilitate joins.

11718 11/21/2013 01:01 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/observation_/create.sql: only include one soilObs for each observation (using DISTINCT ON), rather than just left-joining them
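
As a rough illustration of the effect of this fix, the sketch below keeps only the first soilObs row per observation. It is a plain-Python stand-in for PostgreSQL's `DISTINCT ON`, not the actual create.sql; the function `first_per_key` and the column names are hypothetical, and input order plays the role that `ORDER BY` would play in SQL.

```python
def first_per_key(rows, key):
    """Keep the first row seen for each value of `key`, preserving input order."""
    first = {}
    for row in rows:
        # setdefault only stores the row if this key value has not been seen yet.
        first.setdefault(row[key], row)
    return list(first.values())


# Hypothetical soilObs rows: observation 1 has two soil records.
soil_obs = [
    {"OBSERVATION_ID": 1, "soilHorizon": "A"},
    {"OBSERVATION_ID": 1, "soilHorizon": "B"},
    {"OBSERVATION_ID": 2, "soilHorizon": "A"},
]
one_per_observation = first_per_key(soil_obs, "OBSERVATION_ID")
print(len(soil_obs), "->", len(one_per_observation))
```

A plain left join would instead emit one output row per soilObs, duplicating observation 1.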

11707 11/21/2013 08:26 AM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/stemCount_/map.csv: ensure the aggregateoccurrence.sourceaccessioncode is always populated, because this is a required field when using sourceaccessioncodes. without it, the import will exclude rows which lack a value in this field because it cannot deduplicate on it for these rows, leading to the dropping of large numbers of occurrences. this shows up when comparing provider_count to the input table's row count, and produces the following error in the .errors table:...
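
The row loss described above can be modeled in a few lines. This is only a sketch of the behavior as the message describes it, not the import code: `dedup_on` and the sample rows are hypothetical, and the assumption is that rows with no value in the deduplication field are excluded outright.

```python
def dedup_on(rows, key):
    """Deduplicate rows on `key`; rows with no value for `key` cannot be
    deduplicated and are returned separately as dropped."""
    kept, dropped = {}, []
    for row in rows:
        value = row.get(key)
        if value is None:
            dropped.append(row)
        else:
            kept.setdefault(value, row)
    return list(kept.values()), dropped


# Hypothetical input: two of three rows lack the required field.
input_rows = [
    {"sourceaccessioncode": "s1", "count": 3},
    {"sourceaccessioncode": None, "count": 5},
    {"sourceaccessioncode": None, "count": 2},
]
imported, dropped = dedup_on(input_rows, "sourceaccessioncode")
missing = len(input_rows) - len(imported)
print(f"imported {len(imported)} of {len(input_rows)} rows; {missing} missing")
```

Comparing the imported count with the input row count, as in the last two lines, is the kind of check that exposes the problem.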

11705 11/21/2013 12:24 AM Aaron Marcuse-Kubitza

copyright scrub: inputs/: removed data provider-owned schema and documentation files, which are not BIEN copyright and should not be part of what is submitted for open-sourcing. these files will remain accessible via the web interface (fs.vegpath.org), but will not be in the repository.

11688 11/18/2013 06:23 AM Aaron Marcuse-Kubitza

inputs/CVS/run: `make .../reinstall`: documented vegbiendev runtime (45 min)

11687 11/18/2013 05:35 AM Aaron Marcuse-Kubitza

removed inputs/CVS/cvs-archive-2012-12-04.schema.sql, which has been replaced by cvs-eep-archive-2013-10-22-VegBIEN.schema.sql

11682 11/18/2013 04:54 AM Aaron Marcuse-Kubitza

added inputs/CVS/_src/cvs-eep-archive-2013-10-22-VegBIEN.zip.url

11681 11/18/2013 04:54 AM Aaron Marcuse-Kubitza

added inputs/CVS/cvs-eep-archive-2013-10-22-VegBIEN.schema.sql

11680 11/18/2013 04:52 AM Aaron Marcuse-Kubitza

inputs/CVS/run: documented `make .../reinstall` runtime (25 min)

11678 11/18/2013 04:26 AM Aaron Marcuse-Kubitza

added inputs/CVS/_src/cvs-eep-archive-2013-10-22-VegBIEN.schema.sql

11677 11/18/2013 04:23 AM Aaron Marcuse-Kubitza

added inputs/CVS/_src/cvs-eep-archive-2013-10-22-VegBIEN.schema.sql.run, which makes the SQL suitable for PostgreSQL

11513 10/30/2013 09:49 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: mapped taxon_determination__is_current, taxon_determination__is_original

11512 10/30/2013 09:46 PM Aaron Marcuse-Kubitza

bugfix: mappings/VegCore-VegBIEN.csv: main taxondetermination: use [!isoriginal=true] instead of [!isoriginal] so that adding a manual isoriginal field does not prevent this selector from matching

11510 10/30/2013 09:06 PM Aaron Marcuse-Kubitza

mappings/VegCore.htm: regenerated from wiki. added taxon_determination__is_current, taxon_determination__is_original.

11397 10/22/2013 09:39 AM Aaron Marcuse-Kubitza

inputs/CVS/_src/: added refresh from Mike Lee

11396 10/21/2013 07:14 PM Aaron Marcuse-Kubitza

fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error

11278 10/17/2013 12:19 AM Aaron Marcuse-Kubitza

inputs/CVS/plot_/map.csv: realLatitude, realLongitude: remapped to UNUSED because these columns are actually empty

11277 10/13/2013 07:32 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: collector_ID: remapped it to UNUSED and removed the join to party via it, like in VegBank

11276 10/13/2013 07:03 PM Aaron Marcuse-Kubitza

inputs/CVS/: deleted stemLocation_, because the CVS stemLocation table is empty (unlike VegBank)

11275 10/13/2013 05:26 PM Aaron Marcuse-Kubitza

inputs/CVS/import_order.txt: added plantConcept_/ so it would get automapped after switching to new-style import

11274 10/13/2013 04:24 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: denorm_{tri,quad}*: mapped to infraspecificRank*, infraspecificEpithet*

11273 10/13/2013 04:02 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: infraspecific ranks: remapped to EQUIV#to:species (which is the speciesBinomial), because these actually contain the full taxonomic name at that rank, like VegBank

11272 10/13/2013 03:50 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: genus: documented that unlike VegBank, does not include genus author

11271 10/13/2013 03:47 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: denorm_* terms _alt-ed with normalized terms: use DUPLICATE#of instead where possible. documented where and why _alt was necessary (this applies to a few rows for division, genus).

11270 10/13/2013 03:42 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/taxonObservation_/map.csv: species: remapped to speciesBinomial, not specificEpithet (like for VegBank). however, note that denorm_species is in fact the epithet, unlike VegBank.

11269 10/13/2013 03:19 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/postprocess.sql: removed {} around denorm_genus to match the normalized genus

11268 10/13/2013 02:28 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: removed unnecessary alts for terms that don't have a duplicate denorm* or hierarchical field

11267 10/13/2013 02:06 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/postprocess.sql: fix 1 row that has denorm_kingdom != Kingdom (i.e. both NOT NULL but not the same)

11264 10/13/2013 12:10 AM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/plot_/create.sql: like for VegBank, need to compare place.*PLOT_ID*, not PLOTPLACE_ID, with plot.PLOT_ID

11107 09/29/2013 08:58 PM Aaron Marcuse-Kubitza

bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this significant change to the mappings did not change the semantics of the import of taxonoccurrences.

11105 09/28/2013 10:40 PM Aaron Marcuse-Kubitza

bugfix: mappings/VegCore-VegBIEN.csv: stratum's locationevent: link this to the parent locationevent, so that the parent locationevent's information (such as locationeventcontributors) is accessible to the stratum's locationevent

10866 09/04/2013 11:06 PM Aaron Marcuse-Kubitza

inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix

10591 08/04/2013 01:23 AM Aaron Marcuse-Kubitza

inputs/CVS/stemLocation_/test.xml.ref: set inserted row count back. it had changed because $version was still set in the environment, and this was causing a non-empty public schema to be used as the testing schema.