  • svn:ignore: .~*

# Date Author Comment
12993 03/30/2014 06:12 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: validate: redirect the output to the log, as for other import-related operations

12992 03/30/2014 06:08 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: import: validate at the end of the import

12991 03/30/2014 06:02 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: added new-style aggregating validations (`validate` target)

12988 03/30/2014 05:41 PM Aaron Marcuse-Kubitza

added inputs/GBIF/_src/0001000-131106143450413.zip.header.txt, which is useful to see what fields will be available when we switch to the new GBIF export format

12985 03/30/2014 05:11 PM Aaron Marcuse-Kubitza

added inputs/GBIF/_src/0001000-131106143450413.zip.header.txt.run

12968 03/29/2014 04:06 AM Aaron Marcuse-Kubitza

*{.sh,run}: runscript targets: use begin_target instead of echo_func so the target name is properly echoed. note that this requires using with_rm so that $rm is properly propagated to applicable invoked targets. (previously, $rm was propagated to all invoked targets. note that with_rm only works inside a runscript target that starts with begin_target.)

12967 03/29/2014 03:58 AM Aaron Marcuse-Kubitza

lib/sh/make.sh: self_make(): renamed to with_rm() for clarity, since this is used only to propagate $rm, and does not also invoke a command with the same name as the current function, as the name might suggest

12963 03/28/2014 02:39 AM Aaron Marcuse-Kubitza

fix: inputs/*/*/map.csv: remapped occurrenceID-mapped fields to dataProviderRecordID when these were not globally unique DwC occurrenceIDs (http://rs.tdwg.org/dwc/terms/#occurrenceID)

12962 03/28/2014 02:34 AM Aaron Marcuse-Kubitza

fix: inputs/CTFS/AggregateObservation/map.csv: field mapped to occurrenceID: remapped to aggregateOrganismObservationID because these are not specimen occurrences

12961 03/28/2014 02:32 AM Aaron Marcuse-Kubitza

fix: mappings/VegCore-VegBIEN.csv: taxonoccurrence.sourceaccessioncode: need to populate from aggregateOrganismObservationID when only that is available

12960 03/28/2014 02:03 AM Aaron Marcuse-Kubitza

bugfix: inputs/NY/Ecatalog_all/map.csv: can't use CatalogNumber as pkey because it's not unique and not always populated. this fixes the NY NULL accessionNumbers bug (wiki.vegpath.org/Aggregating_validations_status#bugs).

12958 03/28/2014 01:29 AM Aaron Marcuse-Kubitza

inputs/XAL/Specimen/header.csv: updated

12922 03/27/2014 03:36 AM Aaron Marcuse-Kubitza

added inputs/NY/validations*.sql*

12920 03/27/2014 03:31 AM Aaron Marcuse-Kubitza

bugfix: lib/common.Makefile: $(add*): need to wrap w/ $(wildcard) to prevent "targets don't exist" error, because svn 1.7 does not suppress this error even with --force
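
A shell analogue of the `$(wildcard)` wrapping above might look like the following sketch. It is not the Makefile change itself: `existing_paths` is a hypothetical helper name, and the idea shown is only that nonexistent paths are dropped before they ever reach `svn add`, so the "targets don't exist" error cannot be triggered.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): print only those
# arguments that exist on disk, one per line. This mirrors what
# wrapping a file list in $(wildcard ...) does in a Makefile.
existing_paths() {
  local p
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
    fi
  done
}

# Assumed usage: pass only the surviving paths on to svn, e.g.
#   existing_paths map.csv header.csv | xargs svn add --force
# (file names containing newlines are not handled by this sketch)
```

Filtering first, rather than ignoring the error afterwards, keeps genuine `svn add` failures visible.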

12919 03/27/2014 03:27 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: add!: add* of $(svnFiles): need to ignore errors because svn 1.7 does not suppress the "targets don't exist" error even with --force

12891 03/25/2014 04:18 AM Aaron Marcuse-Kubitza

inputs/run: postprocess(): documented runtime on vegbiendev (1 h)

12886 03/24/2014 05:35 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: specimenreplicate.institution_id: renamed to duplicate_institutions_sourcelist_id, as decided in the conference calls (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2)

12885 03/24/2014 05:32 PM Aaron Marcuse-Kubitza

inputs/run: postprocess(): updated runtime (25 min)

12882 03/24/2014 05:02 PM Aaron Marcuse-Kubitza

inputs/run: postprocess(): updated runtime (20 min)

12879 03/24/2014 01:49 AM Aaron Marcuse-Kubitza

mappings/VegCore.htm: regenerated from wiki: rename specimenHolderInstitutions to specimen_duplicate_institutions, as decided in the 2014-03-13 conference call (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2). note that most schema changes (such as this one) involve mappings changes, which are handled automatically by `inputs/run postprocess; yes|make inputs/{NVS,SALVIAS,TEAM}/test`.

12873 03/23/2014 11:43 PM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/table.run: switched to using lib/runscripts/table.run instead of mysql.table.run because some subdirs (Source/) need the regular table.run to work properly. mysql.table.run should instead be used directly by subdirs that use the MySQL install.

12869 03/22/2014 05:56 AM Aaron Marcuse-Kubitza

inputs/XAL/Specimen/test.xml.ref: updated for sample data.csv, which contains the columns as a CSV. this fixes a bug where a map.csv must be used on a table that contains the same set of columns (i.e. not one with no columns if there are any mappings).

12867 03/22/2014 05:06 AM Aaron Marcuse-Kubitza

fix: inputs/input.Makefile: don't treat *.xml as data files since these are not currently supported

12795 03/21/2014 02:16 AM Aaron Marcuse-Kubitza

fix: inputs/input.Makefile: removed no longer used special handling of XML inputs, support for which was never added to the Makefile. (bin/map, however, does support importing an XML file into a database.) this fixes a bug in XAL, which used to abort with an error but now just imports an empty table.

12794 03/21/2014 12:34 AM Aaron Marcuse-Kubitza

fix: inputs/input.Makefile: %/install: don't ignore errors if table does not exist, to ensure a proper errexit. this is now possible because every dir that this target is being run on should be a data dir. (Source/ used to be a metadata-only dir.)

12793 03/21/2014 12:31 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: $(cleanup): need `set -o pipefail`
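
Why `set -o pipefail` matters can be sketched in bash as follows. The pipeline here is illustrative only and is not the actual `$(cleanup)` recipe; `failing_step` is a hypothetical stand-in for a pipeline stage that fails.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a pipeline stage that fails with status 3.
failing_step() { return 3; }

# Without pipefail, a pipeline's exit status is that of its last
# command, so the failure is masked by the successful `cat`.
set +o pipefail
status_without=0
failing_step | cat || status_without=$?

# With pipefail, the pipeline's exit status is that of the rightmost
# command that exited non-zero, so the failure is reported.
set -o pipefail
status_with=0
failing_step | cat || status_with=$?
set +o pipefail

echo "without pipefail: $status_without"   # prints "without pipefail: 0"
echo "with pipefail: $status_with"         # prints "with pipefail: 3"
```

In a Makefile recipe that pipes a command's output through a filter, the same masking would let a failed command look like a success unless pipefail is set.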

12792 03/21/2014 12:02 AM Aaron Marcuse-Kubitza

inputs/VegBank/run: `rm=1 import()`: updated runtime (1 h)

12791 03/20/2014 11:54 PM Aaron Marcuse-Kubitza

inputs/VegBank/taxon_observation.**/test.xml.ref: updated inserted row count

12790 03/20/2014 11:54 PM Aaron Marcuse-Kubitza

inputs/VegBank/projectcontributor_/test.xml.ref: updated inserted row count

12788 03/20/2014 10:44 PM Aaron Marcuse-Kubitza

bugfix: inputs/VegBank/import_order.txt: added missing project, needed to trigger the staging table renaming for the project table

12787 03/20/2014 10:42 PM Aaron Marcuse-Kubitza

inputs/VegBank/run: documented `rm=1 import()` runtime (>1.5 h)

12786 03/20/2014 10:40 PM Aaron Marcuse-Kubitza

inputs/VegBank/run: documented `datasrc_make sql/install` runtime (25 min)

12785 03/20/2014 08:27 PM Aaron Marcuse-Kubitza

inputs/MO/Specimen/test.xml.ref: updated, which adds dateCollected mappings

12784 03/20/2014 08:20 PM Aaron Marcuse-Kubitza

inputs/WIN/Specimen/test.xml.ref: updated to map.csv, which has eventDate->dateCollected

12783 03/20/2014 08:13 PM Aaron Marcuse-Kubitza

inputs/VegBank/plantconcept_/create.sql: updated runtime (25 min, ~same)

12779 03/20/2014 07:58 PM Aaron Marcuse-Kubitza

*{.sh,run}: use new begin_target instead of `echo_func; set_make_vars`

12776 03/20/2014 07:47 PM Aaron Marcuse-Kubitza

inputs/VegBank/plot/postprocess.sql: remove institutions that we have direct data for: CVS: updated runtime (same)

12758 03/18/2014 05:47 PM Aaron Marcuse-Kubitza

bugfix: inputs/VegBank/plot/postprocess.sql: use CVS.plot_ instead because that has the renamed staging table columns, and is compatible with auto-renaming of the SQL script columns

12757 03/18/2014 05:41 PM Aaron Marcuse-Kubitza

inputs/CVS/plot_/postprocess.sql: add unique constraint on locationName (analogous to the unique constraint in plot), for use by inputs/VegBank/plot/postprocess.sql in removing inter-datasource duplication

12753 03/18/2014 05:10 PM Aaron Marcuse-Kubitza

inputs/VegBank/taxon_observation.**/test.xml.ref: updated inserted row count

12752 03/18/2014 05:34 AM Aaron Marcuse-Kubitza

inputs/run: postprocess(): documented runtime (30 min)

12751 03/18/2014 05:16 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: %/postprocess.sql: don't perform replacements using map.csv, because map.csv is not idempotent. this functionality was only there to facilitate switching to new-style import, which is now largely done. (the remaining datasources NVS, SALVIAS, TEAM contain only 1 postprocess.sql: inputs/SALVIAS/projects/postprocess.sql (`st inputs/{NVS,SALVIAS,TEAM}/*/postprocess.sql`).)

12747 03/18/2014 04:33 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/postprocess.sql: always run this, not just if the associated map spreadsheets change, to avoid needing to `touch` them to cause %/postprocess.sql to run

12745 03/18/2014 04:24 AM Aaron Marcuse-Kubitza

fix: inputs/*/*/postprocess.sql: un-doubled *

12744 03/18/2014 04:06 AM Aaron Marcuse-Kubitza

bugfix: inputs/input.Makefile: %/postprocess.sql: also need to apply renames from mappings/VegCore.thesaurus.csv, as these have been applied to map.csv

12714 03/14/2014 07:35 PM Aaron Marcuse-Kubitza

added inputs/run, which runs all the inputs' runscripts using the new auto-forwarding

12703 03/14/2014 05:25 PM Aaron Marcuse-Kubitza

removed unused inputs/table.run. inputs/*/table.run include lib/runscripts/table.run directly.

12679 03/13/2014 05:03 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/validations.sql: implemented _plots_19_count_of_censuses_per_plot_in_each_project

12638 03/11/2014 09:56 PM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/validations.sql: plots_07_list_of_plots_with_counts_of_individuals_per_species: renamed to _plots_07_list_of_plots*which_use*_... because this query is not intended to include the actual counts, just to say which plots have them (the correct "which use" wording is also used in queries #8, 9)

12635 03/07/2014 10:49 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql, inputs/SALVIAS/validations.sql: added _plots_06a_list_of_stems, for use in figuring out the diff in _plots_06_list_of_plots_with_stem_measurements

12605 03/06/2014 08:52 AM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/validations.sql: _plots_18_list_of_subplots_codes_for_each_plot_for_each_project: changed columns to match output query

12603 03/06/2014 08:29 AM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/validations.sql: _plots_15_pct_cover_of_each_verb_taxon_in_each_plot_in_each_pro: changed types to match output query

12602 03/06/2014 08:14 AM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/validations.sql: _plots_15_pct_cover_of_each_verb_taxon_in_each_plot_in_each_pro: changed summarizing column from mean_cover->totalpercentcover to match output query

12601 03/06/2014 08:12 AM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/validations.sql: _plots_10a_aggregate_observation_individual_counts: changed individual_id type to match output query

12596 03/06/2014 12:07 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql, inputs/SALVIAS/validations.sql: added _plots_10a_aggregate_observation_individual_counts, for use in debugging diffs in _plots_10_count_of_individuals_per_plot_in_each_proj

12538 02/27/2014 07:56 PM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/validations.sql: renamed SiteCode to plot_code to match output queries

12526 02/27/2014 06:58 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/validations.sql: use plot_code instead of plotcode for easier readability

12516 02/27/2014 01:27 PM Aaron Marcuse-Kubitza

bugfix: *.sql: public.source_by_shortname(): need to wrap it in a nested SELECT because Postgres incorrectly does not constant-fold (inline) it, leading to a slowdown when it is therefore run many times. this is done using the steps at wiki.vegpath.org/Postgres_queries#wrap-function-call-in-nested-SELECT .

12508 02/26/2014 11:58 PM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/validations.sql: plotMetadata.SiteCode: need to match types with the output query column

12417 02/24/2014 10:51 PM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/validations.sql: _plots_02_list_of_project_names: altered column aliases to match output query

12407 02/24/2014 08:58 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/validations.sql: added Brad's comments from validation/aggregating/plots/SALVIAS/bien3_validations_salvias_db_original.VegCore.sql

12406 02/24/2014 08:53 AM Aaron Marcuse-Kubitza

added inputs/SALVIAS/validations*.sql

12367 02/23/2014 12:13 PM Aaron Marcuse-Kubitza

fix: schemas/vegbien.sql: _traits_08_taxonname_trait_and_value_for_first_5000_records: renamed to _traits_08_taxonname_trait_and_value because this actually includes all the records, not just the first 5000. this uses the new public_validations.rename_query_view() to rename all associated tables and views, including handling truncated names.

12286 02/17/2014 01:58 PM Aaron Marcuse-Kubitza

bugfix: inputs/bien2_traits/validations.sql: _traits_01_count_records: changed column names to match public_validations._traits_01_count_records

12246 02/16/2014 04:22 PM Aaron Marcuse-Kubitza

bugfix: inputs/bien2_traits/validations.sql: use a wrapper function for util.ifnull() so that the views don't get dropped when the util schema is reinstalled

12224 02/14/2014 03:09 PM Aaron Marcuse-Kubitza

validation/aggregating/*/*.sql, schemas/vegbien.sql, lib/runscripts/validations.pg.sql.run, inputs/bien2_traits/validations.sql: added _ to beginning of each view name so the validation views would sort at the top in the datasource's tables list. this will also make the validation result sets easily distinguishable from the data tables.

12221 02/14/2014 12:20 PM Aaron Marcuse-Kubitza

added inputs/bien2_traits/validations.sql, from validation/aggregating/traits/BIEN2_traits/bien3_validations_traits_original_mysql.VegCore.sql

12220 02/14/2014 12:20 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: $(svnFilesGlob): added validations.sql

12213 02/14/2014 11:00 AM Aaron Marcuse-Kubitza

added inputs/bien2_traits/validations.sql.run

12158 02/13/2014 07:26 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

12157 02/13/2014 06:53 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

12112 02/07/2014 05:37 AM Aaron Marcuse-Kubitza

fix: inputs/VegBIEN/Redmine/wiki/.htaccess: redirect to new main page when accessed without trailing /

12060 02/06/2014 01:14 PM Aaron Marcuse-Kubitza

inputs/bien2_traits/TraitObservation/postprocess.sql: remove rows with no taxon name, which are invalid, and which helps simplify the aggregating validations queries

12053 02/06/2014 03:05 AM Aaron Marcuse-Kubitza

fix: inputs/VegBIEN/Redmine/svn/.htaccess: updated repository URL to point to trunk/

12040 02/04/2014 10:42 AM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/verify/plots.out.sql: fixed ' quoting syntax to use '' instead of \' to escape '
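
The `''` escaping rule above can be sketched as a small bash helper. `sql_quote` is a hypothetical name used only for illustration and does not appear in the repository; the sample value is likewise made up.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): quote a value as a
# standard SQL string literal by doubling each embedded single quote,
# i.e. using '' rather than \' to escape '.
sql_quote() {
  local value=$1
  local q="'"
  printf '%s\n' "$q${value//$q/$q$q}$q"
}

sql_quote "it's"   # prints 'it''s'
```

Doubling the quote is the standard SQL form. Backslash escaping is not portable: in PostgreSQL with `standard_conforming_strings` on, a backslash inside an ordinary string literal is taken literally, so `\'` ends the string early.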

12039 02/04/2014 10:32 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: verify/%.out: use a *.sql file in the verify/ directory itself to generate *.out, so that each datasource can have its own set of output queries. for datasources that should share the same set of queries, they can instead be symlinked to the same file.

12038 02/04/2014 10:01 AM Aaron Marcuse-Kubitza

fix: inputs/CVS/project/: added _no_import since this should not also be imported separately from taxon_observation.**

12030 02/02/2014 11:18 PM Aaron Marcuse-Kubitza

added inputs/XAL/Specimen/_no_import, since this is a demo-only datasource and there isn't a staging table for it

12029 02/02/2014 11:10 PM Aaron Marcuse-Kubitza

inputs/.geoscrub/county_centroids/test.xml.ref, inputs/.NCBI/{names.src,nodes.src}/test.xml.ref: accepted test outputs (generated now that these tables are in import_order.txt)

12028 02/02/2014 10:31 PM Aaron Marcuse-Kubitza

inputs/FIA/taxon_observation.**/header.csv: updated for new REF_RESEARCH_STATION.country metadata value col

12018 02/02/2014 12:49 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: add!: verify/: also svn:ignore *.tsv, *.txt

12005 01/23/2014 01:44 AM Aaron Marcuse-Kubitza

inputs/publishable datasources.xlsx: updated

12004 01/23/2014 01:43 AM Aaron Marcuse-Kubitza

inputs/publishable datasources.xlsx: updated

12003 01/23/2014 01:35 AM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/projects/postprocess.sql: remove private data that should not be publicly visible: remove projects that do not have "There are no specific use conditions attached to this dataset"

12002 01/23/2014 01:22 AM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Remove private data that should not be publicly visible: also need to remove metadata-only plots

12001 01/23/2014 01:11 AM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/plotMetadata_/map.csv: things mapped to project_participant: remapped to event__participant because these actually relate to the event, not the project, even though they seem like project-related fields

11999 01/23/2014 12:54 AM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/plotMetadata_/map.csv, inputs/Madidi/LocationObservation/map.csv: things mapped to communityID: remapped to communityName, which is what's used in analytical_stem (communityID is for numeric IDs)

11997 01/22/2014 11:01 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/plotMetadata_/create.sql, map.csv: expanded plot_administrator:party_code_party_ and mapped plot_administrator_name to a 2nd project_participant

11996 01/22/2014 10:59 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: project_participant: use [!...] negative lookahead assertion so that multiple project_participant columns will properly map to separate projectcontributor rows

11994 01/22/2014 09:16 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/plotMetadata_/map.csv: mapped PrimOwnerID_name->project_participant

11993 01/22/2014 09:07 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/plotMetadata_/create.sql: added join to PrimOwnerID:party_code_party_

11992 01/22/2014 01:06 PM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/import_order.txt: added party_code_party_

11991 01/22/2014 12:50 PM Aaron Marcuse-Kubitza

bugfix: inputs/SALVIAS/party_code_party_/create.sql: need to remove duplicate entries in party_code_party

11988 01/22/2014 11:10 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/party_code_party_/map.csv: mapped fullname->event_participant_name for use by other tables

11987 01/22/2014 10:34 AM Aaron Marcuse-Kubitza

mapped inputs/SALVIAS/party_code_party_/

11983 01/20/2014 10:12 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/_MySQL/salvias_plots.*.sql: refreshed. this adds the party and party_code_party tables Brad provided for mapping the plot contributors.

11982 01/20/2014 10:10 PM Aaron Marcuse-Kubitza

fix: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Delete rows that do not satisfy foreign key constraints: also need to do this for plotObservations, since the refreshed data contains dangling rows for that as well

11981 01/20/2014 10:08 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/run_: documented *.sql install runtime (3 min), as separate from the full `datasrc_make reinstall` runtime (3.5 min)

11980 01/20/2014 10:07 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/run_: refresh(): `datasrc_make reinstall`: updated runtime. documented that runtimes are from starscream.

11979 01/20/2014 08:09 PM Aaron Marcuse-Kubitza

added inputs/SALVIAS/run_, which includes a refresh() target