  • svn:ignore: .~*

# Date Author Comment
11980 01/20/2014 10:07 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/run_: refresh(): `datasrc_make reinstall`: updated runtime. documented that runtimes are from starscream.

11979 01/20/2014 08:09 PM Aaron Marcuse-Kubitza

added inputs/SALVIAS/run_, which includes a refresh() target

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
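
A minimal dry-run sketch of the `svn switch` step described above. The repository root and the `inputs` working-copy path are placeholders, not this project's real URL; take the real values from `svn info` in your working copy. The script only builds and prints the command, so nothing is changed until you run the printed line yourself.

```shell
#!/usr/bin/env bash
# Dry-run sketch: builds and prints the `svn switch` command instead of running it.
# Assumption: repo_root is a placeholder, not the project's real repository URL.
repo_root="https://svn.example.org/repo"
# Assumption: the working copy was checked out from inputs/ before the move.
old_url="$repo_root/inputs"

# After the move, the same directory lives under /trunk/, so insert /trunk
# between the repository root and the path within the repository.
new_url="$repo_root/trunk${old_url#"$repo_root"}"

# `svn switch URL PATH` repoints the existing working copy at the new URL.
switch_cmd="svn switch $new_url ."
echo "$switch_cmd"
```

Printing the command first makes it easy to check the target URL before touching the working copy.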

11965 01/16/2014 01:22 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: scrubbed_family: Name_matched_accepted_family was missing from the TNRS results at one point, so we are now using Family_matched as a workaround to populate this. the workaround is for accepted names only, as "no opinion" names do not have an Accepted_name_family to prepend to the scrubbed name to parse.

11964 01/16/2014 01:19 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: reexported from live DB, which changes the element order

11961 01/15/2014 10:18 AM Aaron Marcuse-Kubitza

inputs/VegBank/import_order.txt: added projectcontributor_

11960 01/15/2014 10:11 AM Aaron Marcuse-Kubitza

inputs/VegBank/projectcontributor_/map.csv, postprocess.sql: added project_participant

11957 01/15/2014 09:41 AM Aaron Marcuse-Kubitza

added inputs/VegBank/projectcontributor_/

11956 01/15/2014 09:29 AM Aaron Marcuse-Kubitza

inputs/VegBank/vegbank.~.clean_up.sql: projectcontributor.surname: prepend table name to avoid join collisions

11955 01/15/2014 09:23 AM Aaron Marcuse-Kubitza

inputs/VegBank/vegbank.~.clean_up.sql, inputs/CVS/cvs.~.clean_up.sql: Prevent "column name specified more than once" errors when tables are joined: put tables in alphabetical order for consistency

11943 01/14/2014 08:34 PM Aaron Marcuse-Kubitza

inputs/publishable datasources.xlsx: updated

11942 01/14/2014 08:31 PM Aaron Marcuse-Kubitza

inputs/datasource_release_status.xlsx: renamed to `publishable datasources.xlsx` to match the spreadsheet title

11934 12/20/2013 04:41 PM Aaron Marcuse-Kubitza

inputs/VegBank/^taxon_observation.**.sample/create.sql, map.csv: added new project columns

11933 12/20/2013 04:31 PM Aaron Marcuse-Kubitza

inputs/VegBank/taxon_observation.**/postprocess.sql: added the project table

11932 12/20/2013 04:25 PM Aaron Marcuse-Kubitza

mapped inputs/VegBank/project/, which includes the projectName for attribution

11931 12/20/2013 02:56 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql, map.csv: added new project columns

11930 12/20/2013 02:44 PM Aaron Marcuse-Kubitza

inputs/CVS/taxon_observation.**/postprocess.sql: added the project table

11929 12/20/2013 02:42 PM Aaron Marcuse-Kubitza

inputs/CVS/project/map.csv: mapped stopDate->projectEndDate

11928 12/20/2013 02:35 PM Aaron Marcuse-Kubitza

mapped inputs/CVS/project/, which includes the projectName for attribution

11927 12/20/2013 01:25 AM Aaron Marcuse-Kubitza

inputs/VegBIEN/Redmine/svn/.htaccess: updated to use much faster direct repository URL rather than Redmine web interface, now that the repository itself is publicly accessible in addition to the Redmine view of it

11924 12/20/2013 12:28 AM Aaron Marcuse-Kubitza

fix: inputs/TEX/Specimen*/map.csv, postprocess.sql: habitat: also placed in occurrenceRemarks so that this field gets parsed for growth form information, as requested by Brad (wiki.vegpath.org/TEX_validation#2013-2-26)

11923 12/19/2013 11:49 PM Aaron Marcuse-Kubitza

fix: inputs/TEX/Specimen*/map.csv: mapped constant values for specimenHolderInstitutions, country. these have to be added with `rm=1 ./inputs/TEX/Specimen.../run postprocess`.

11922 12/19/2013 11:42 PM Aaron Marcuse-Kubitza

bugfix: inputs/TEX/Specimen2/map.csv: mapped BARCODE to accessionNumber so that we have a unique ID for each row

11920 12/17/2013 08:06 AM Aaron Marcuse-Kubitza

inputs/datasource_release_status.xlsx: updated

11917 12/16/2013 07:05 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: added Mike Lee's additional plots used to validate confidentiality-related fields (wiki.vegpath.org/CVS_validation#plots-to-include)

11916 12/16/2013 06:00 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/^taxon_observation.**.sample/create.sql: include taxonName in the subset of columns that's imported for the validation, because it is _alt-ed with scientificName for forming the TNRS input name. this is unique to CVS, which is why it was not part of the validation subset copied from the VegBank subset.

11912 12/16/2013 01:43 PM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: granted bien_read SELECT access to derived views as well as the core tnrs table

11911 12/15/2013 05:30 PM Aaron Marcuse-Kubitza

updated inputs/datasource_release_status.xlsx

11910 12/15/2013 05:27 PM Aaron Marcuse-Kubitza

added inputs/datasource_release_status.xlsx, export of Google spreadsheet at https://docs.google.com/spreadsheet/ccc?key=0ArZXrTAXd-TYdDRRb2RxYi11TWZrQVh5bVdKOURCeFE

11905 12/11/2013 10:54 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/^taxon_observation.**.sample/: added _no_import because this table duplicates part of what's imported from taxon_observation.**

11904 12/11/2013 10:42 PM Aaron Marcuse-Kubitza

bugfix: inputs/VegBank/plot/: added _no_import because this table is left-joined and should not be imported separately

11903 12/11/2013 10:40 PM Aaron Marcuse-Kubitza

bugfix: inputs/{.NCBI,CTFS}/*.src/: added _no_import because these tables are left-joined and should not be imported separately

11902 12/11/2013 09:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: removed table names from datasources where only one table is imported

11901 12/11/2013 09:52 PM Aaron Marcuse-Kubitza

fix: inputs/import.stats.xls: removed deleted tables from current import

11900 12/11/2013 09:51 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

11888 12/10/2013 06:35 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/map.csv: row_num: remapped to plain *row_num, like the other datasources that have this field

11887 12/10/2013 06:31 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: rerun time: noted that this is only fast after manual vacuuming of the table (to remove the deleted rows from the index). autovacuum apparently does not run, although it should.

11881 12/09/2013 07:24 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: reran test, which added yearCollected/monthCollected/dayCollected

11880 12/09/2013 07:23 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/create.sql: documented runtime (3 min)

11879 12/09/2013 06:59 PM Aaron Marcuse-Kubitza

inputs/CTFS/*.src/: added test.xml.ref

11878 12/09/2013 06:58 PM Aaron Marcuse-Kubitza

inputs/CTFS/*.src/: added VegBIEN.csv

11877 12/09/2013 06:56 PM Aaron Marcuse-Kubitza

bugfix: inputs/CTFS/TaxonOccurrence*/map.csv: things mapped to taxonObservationID: remapped to taxonOccurrenceID since taxonObservationID is not mapped to anything in VegBIEN (denormalized VegCore doesn't distinguish between taxon occurrences and taxon observations of them)

11876 12/09/2013 05:46 PM Aaron Marcuse-Kubitza

bugfix: inputs/ARIZ/~.clean_up.sql: prevent "column already exists" errors when there is an input column of the same name as an output column

11873 12/09/2013 04:16 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/import_order.txt: added county_centroids so that it would be installed by new-style import

11871 12/09/2013 03:37 PM Aaron Marcuse-Kubitza

inputs/FIA/TREE/run: documented import() runtime (1.5 h), which includes table cleanup runtime (1 h)

11869 12/09/2013 02:43 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/run: updated import() runtime (same), documented table cleanup runtime (1.5 h)

11868 12/09/2013 02:38 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: CREATE INDEX ... specimenHolderInstitutions: documented runtime (45 min)

11867 12/09/2013 02:28 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented runtime (3.5 min)

11865 12/06/2013 07:46 AM Aaron Marcuse-Kubitza

bugfix: inputs/CTFS/import_order.txt: added *.src so that these would be installed under new-style import as well. this means that their columns will now be automapped, requiring the names to be renamed to VegCore names in */create.sql. note that VegCore taxonOccurrenceID has been renamed to taxonObservationID since this was last run.

11864 12/06/2013 06:56 AM Aaron Marcuse-Kubitza

inputs/.geoscrub/run: documented import() runtime (20 min)

11863 12/06/2013 06:12 AM Aaron Marcuse-Kubitza

bugfix: inputs/.NCBI/import_order.txt: added nodes.src, names.src so that these would be installed under new-style import as well. this means that their columns will now be automapped, requiring the names to be renamed to VegCore names in nodes/create.sql.

11849 12/06/2013 02:44 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: install: for new-style datasources, use the associated runscript instead (the old-style install target will not do everything that's needed for a new-style datasource)

11847 12/06/2013 12:51 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: install: for new-style datasources, use the associated runscript instead (the old-style install target will not do everything that's needed for a new-style datasource)

11843 12/05/2013 11:38 PM Aaron Marcuse-Kubitza

inputs/FIA/COND/postprocess.sql: filtering formula: documented that this was created by Brad, and provided the URL to it on nimoy

11842 12/05/2013 12:27 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: remove plot.realLatitude/realLongitude, since this is private data that should not be publicly visible

11841 12/05/2013 12:19 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: remove plot.realLatitude/realLongitude, since this is private data that should not be publicly visible

11820 12/04/2013 04:57 PM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: uncommented identifiedBy since this is now part of taxonObservation_

11819 12/04/2013 04:08 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/observation_community/create.sql: communityName: populate from commConcept.commName instead, because commInterpretation.commname is not always populated. this requires left-joining to commConcept.

11818 12/04/2013 03:58 PM Aaron Marcuse-Kubitza

inputs/CVS/observation_community/map.csv: updated output column names to new input column names, to avoid later output column collisions

11817 12/04/2013 03:42 PM Aaron Marcuse-Kubitza

inputs/CVS/observation_community/header.csv, map.csv: updated input column names for cvs.~.clean_up.sql renamings

11815 12/04/2013 04:19 AM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: commClass, commConcept fields: prepend table name to avoid inter-table collisions upon join

11814 12/04/2013 03:43 AM Aaron Marcuse-Kubitza

added inputs/CVS/observation_community/, as for VegBank

11813 12/04/2013 03:32 AM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: commClass.dba_src_ID: prepend table name to avoid inter-table collisions upon join

11812 12/03/2013 04:32 PM Aaron Marcuse-Kubitza

added inputs/CVS/observationContributor_/, which adds the people collecting the plot

11811 12/03/2013 04:02 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: observationContributor.dba_src_ID: prepended table name to avoid collision when left-joining to party

11810 12/03/2013 03:44 PM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: %/header.csv: errexit the command so that errors won't scroll by, which in this case requires `set -o pipefail`
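
The reason errexit needs `set -o pipefail` here can be sketched in bash as follows. This is an illustration only, not the actual recipe from inputs/input.Makefile: `false | cat` is a stand-in for a failing command piped into a succeeding one. Without pipefail, the pipeline reports the status of its last command, so the failure is hidden and errexit would never trigger.

```shell
#!/usr/bin/env bash
# Illustration only: `false | cat` stands in for a failing command whose
# output is piped into a command that succeeds.

# Default behavior: the pipeline's status is that of the last command (cat).
set +o pipefail
if false | cat; then status_without=0; else status_without=$?; fi

# With pipefail: the pipeline fails if any command in it fails.
set -o pipefail
if false | cat; then status_with=0; else status_with=$?; fi

echo "without pipefail: $status_without"
echo "with pipefail: $status_with"
```

The pipelines are wrapped in `if` so that the script itself keeps running and can report both statuses, even when errexit is enabled.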

11809 12/03/2013 02:57 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/create.sql: mapped identifiedBy, which involves joining to party

11808 12/03/2013 02:35 PM Aaron Marcuse-Kubitza

inputs/CVS/cvs.~.clean_up.sql: don't rename taxonInterpretation.PARTY_ID, so that this can be USING-joined to party in inputs/CVS/taxonObservation_/create.sql

11805 12/03/2013 08:25 AM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/map.csv: synced output columns to input columns (which removes the extra *s)

11804 12/03/2013 08:00 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/postprocess.sql: locality: include the site name (authorLocation), because this is part of the unique specification of the place that was sampled, and Bob wants this to be included in VegBIEN

11803 12/03/2013 07:58 AM Aaron Marcuse-Kubitza

inputs/CVS/^taxon_observation.**.sample/create.sql: removed parentLocationID, since this is unused in CVS

11802 12/03/2013 07:45 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: `%/install: %/create.sql`: errexit the command so that errors won't scroll by, which in this case requires `set -o pipefail`

11801 12/03/2013 06:51 AM Aaron Marcuse-Kubitza

inputs/VegBank/plot/postprocess.sql: locality: include the site name (authorlocation), because this is part of the unique specification of the place that was sampled

11799 12/03/2013 05:19 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxon_observation.**/map.csv: omit authorPlantName because it is not specific to the taxonInterpretation row (this is in a separate taxonInterpretation for the original determination instead)

11796 12/02/2013 02:46 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/plot_/map.csv: PARENT_ID: remapped to UNUSED, to clarify that subplots are not implemented through this field

11794 11/27/2013 11:04 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: scrub: clarified that using & (background process) also ignores TNRS errors (the primary purpose of &, of course, is to run asynchronously)

11792 11/27/2013 09:24 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/run: import() runtime: added starscream runtime (20 min)

11790 11/27/2013 08:33 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/run: documented import() runtime (15 min)

11789 11/26/2013 11:18 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/Source/map.csv: source__modified_date: updated for current run

11788 11/26/2013 11:11 PM Aaron Marcuse-Kubitza

**/new_terms.csv, unmapped_terms.csv updated (using `make missing_mappings`)

11786 11/26/2013 11:07 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/geoscrub.csv.run: updated upload time (30 s)

11785 11/26/2013 11:00 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/geoscrub.csv.run: export_(): updated runtime (25 s)

11782 11/26/2013 09:57 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/geoscrub.csv.run: make(): derived/biengeo/geoscrub.sh: documented runtime (2.5 h)

11781 11/26/2013 09:45 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/geoscrub_output/geoscrub.csv.run: don't connect to DB as the root user, because this is not needed now that the geoscrub schema is owned by the bien user. this avoids a sudo password prompt at the end of the geoscrubbing run.

11777 11/26/2013 02:23 PM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: $(import): except in a full-database import, errexit so that the import will stop on an error and not let it scroll by

11776 11/26/2013 01:55 PM Aaron Marcuse-Kubitza

added inputs/CVS/^taxon_observation.**.sample/, used for the extract. note that the column list is slightly different than for VegBank.

11775 11/26/2013 01:42 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: removed taxonObservation_-- prefix from terms that do not need to be table-specific (like for VegBank)

11774 11/26/2013 01:32 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/map.csv: plantConcept_ columns: synced input and output column names to their names in plantConcept_

11773 11/26/2013 01:30 PM Aaron Marcuse-Kubitza

fix: inputs/CVS/taxonObservation_/map.csv: plantConcept_ columns: synced input and output column names to their names in plantConcept_

11772 11/26/2013 01:26 PM Aaron Marcuse-Kubitza

inputs/CVS/plantConcept_/map.csv: removed plantConcept_-- prefix from terms that do not need to be table-specific (like for VegBank)

11761 11/26/2013 05:56 AM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/import_order.txt: added taxon_observation.**

11760 11/26/2013 05:54 AM Aaron Marcuse-Kubitza

inputs/CVS/: don't import joined tables, because they are now imported in the taxon_observation.** left-join instead

11759 11/26/2013 05:53 AM Aaron Marcuse-Kubitza

inputs/CVS/: added taxon_observation.** left-join of the tables, using the steps at http://wiki.vegpath.org/Left-joining_a_datasource. this involves renaming taxonOccurrenceID->taxonOccurrenceID__overall_plot so that it can then be joined together with aggregateOrganismObservationID to create the full taxonOccurrenceID (as in VegBank).

11758 11/26/2013 05:46 AM Aaron Marcuse-Kubitza

inputs/CVS/stemCount_/map.csv: remapped stratum_ID->*STRATUM_ID so it would match up with stratum.*STRATUM_ID

11757 11/25/2013 10:14 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: mapped TAXONINTERPRETATION_ID to identificationID

11756 11/25/2013 10:03 PM Aaron Marcuse-Kubitza

added inputs/CVS/stratum/

11755 11/25/2013 10:02 PM Aaron Marcuse-Kubitza

added inputs/CVS/stratumType/

11754 11/25/2013 09:43 PM Aaron Marcuse-Kubitza

inputs/CVS/: prepended the table name to each column name to prevent column collisions, using the steps at http://wiki.vegpath.org/Left-joining_a_datasource

11753 11/25/2013 08:07 PM Aaron Marcuse-Kubitza

bugfix: inputs/CVS/plantConcept_/map.csv: PLANTCONCEPT_ID: remapped without * prefix so that the USING join in inputs/CVS/taxonObservation_/create.sql would continue to work

11752 11/25/2013 08:05 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/header.csv, map.csv: updated to use plantConcept_ renamed columns