# Date Author Comment
6809 12/12/2012 06:39 PM Aaron Marcuse-Kubitza

inputs/Madidi/Organism/map.csv: Remapped Habit from verbatimGrowthForm to growthForm, which points to the same place

6808 12/12/2012 06:27 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: Use denorm_* denormalized taxonomic ranks in place of the normalized ranks when both are provided

6807 12/12/2012 06:25 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: %/new_terms.csv: Fixed bug where unmapped_terms.csv's terms need to be filtered out of the output column, not the input column, because the output column is what the unmapped terms are generated from. Usually these columns are the same for unmapped terms, but sometimes an output term is changed from the original column's name but still doesn't match a VegCore term in mappings/VegCore-VegBIEN.csv.

6806 12/12/2012 06:08 PM Aaron Marcuse-Kubitza

input.Makefile: SVN: add: Added comment with instructions to update all inputs with these settings, using `make inputs/add`

6805 12/12/2012 06:07 PM Aaron Marcuse-Kubitza

input.Makefile: SVN: add: verify: Also ignore *.xlsx

6804 12/12/2012 06:00 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Creating enough disk space: Added instructions for removing archived backups to free up space

6803 12/12/2012 05:15 PM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: Fixed bug where taxonLevel, not taxonRank, needs to be mapped to taxonRank, because CVS's taxonRank is actually a number, while taxonLevel contains the corresponding text string

6802 12/12/2012 05:12 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Before import, added step to make sure there is at least 100GB of disk space
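
The free-space check described in this step could be scripted roughly as follows. This is a minimal sketch, not part of README.TXT: the function name `has_enough_space` is hypothetical, and reading "100GB" as 100 * 10^9 bytes is an assumption.

```python
import shutil

# Assumption: "100GB" is read as 100 * 10^9 bytes
REQUIRED_BYTES = 100 * 10**9

def has_enough_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required

if __name__ == "__main__":
    if not has_enough_space():
        print("Less than 100GB free; free up space before importing")
```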

6801 12/12/2012 04:41 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Fixed bug where the pkeys table's test pkey constraint needs to be added after the data is added rather than when the empty table is created, to avoid adding a pkey constraint that will later be violated by data which returns multiple output rows for an input row (such as calls to _split())

6800 12/12/2012 04:36 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): insert_into_pkeys(): Allow callers to override run_query_into()'s add_pkey_ param in case the initial version of the pkeys table should not yet have the test pkey constraint (e.g. because data is added after the table is created)

6799 12/12/2012 04:24 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Checking for errors: Search for "Command exited with non-zero status" to find errors, which is faster than checking that each input's log ends in "Encountered 0 error(s)"
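
A search like the one described here might look as follows in Python. This is only a sketch: the function name `logs_with_errors` is hypothetical, and how the log paths are collected is left to the caller.

```python
MARKER = "Command exited with non-zero status"

def logs_with_errors(log_paths, marker=MARKER):
    """Return the subset of log_paths whose contents contain the marker string."""
    failed = []
    for path in log_paths:
        # errors="replace" keeps the scan going if a log holds undecodable bytes
        with open(path, errors="replace") as f:
            if any(marker in line for line in f):
                failed.append(path)
    return failed
```

This needs a single substring match per log, rather than confirming that each log ends in "Encountered 0 error(s)".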

6798 12/12/2012 04:13 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6797 12/12/2012 03:50 PM Aaron Marcuse-Kubitza

README.TXT: Data import: import_all: Corrected text of note about time until control is returned to the shell

6796 12/12/2012 03:42 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Moved download of logs to right after the import is done, because this is a quick step that doesn't depend on the backup- and export-creation steps

6795 12/11/2012 11:41 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: institutionCode: Removed mapping to sourcename.matched_source_id, which is now autopopulated. Split any list of institutionCodes apart using new _split().
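
The effect of splitting a list-valued field into one output row per value can be illustrated as below. This is not the `_split()` from schemas/functions.sql, whose signature is not shown in this log; `split_rows`, the `;` separator, and the sample codes are all assumptions made for illustration.

```python
def split_rows(rows, column, separator):
    """Yield one copy of each row per separator-delimited value in `column`."""
    for row in rows:
        for part in row[column].split(separator):
            # Copy the row, replacing the list with a single trimmed value
            yield {**row, column: part.strip()}

# Hypothetical input: one specimen held by two institutions
rows = [{"id": 1, "institutionCode": "NY; MO"}]
for out in split_rows(rows, "institutionCode", ";"):
    print(out)
```

One input row producing several output rows is the situation mentioned in r6801, where a pkey constraint added before the data would be violated.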

6794 12/11/2012 11:28 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sourcename: Added sourcename_set_matched_source_id() trigger

6793 12/11/2012 11:22 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _split()

6792 12/11/2012 11:13 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sourcelist_unique: Removed COALESCE around name because it's NOT NULL

6791 12/11/2012 11:11 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Allow multiple institutionCodes for each specimenreplicate by linking new sourcelist table many-to-many to source via sourcename (which is now a linking table)

6790 12/11/2012 10:50 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sourcename: Removed system, which has been replaced by source_id as the scoping field

6789 12/11/2012 10:42 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: party: Added sourceaccessioncode and uniquify on it instead when provided. vegbien.ERD.mwb: Rearranged party-related tables to allow the tables to be fully expanded.

6788 12/11/2012 10:40 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term

6787 12/11/2012 10:09 AM Aaron Marcuse-Kubitza

Added inputs/SALVIAS/salvias_users.~.clean_up.sql

6786 12/11/2012 10:01 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/: Added salvias_users tables

6785 12/11/2012 10:00 AM Aaron Marcuse-Kubitza

my2pg: Translate blob to bytea

6784 12/11/2012 09:55 AM Aaron Marcuse-Kubitza

my2pg: Also remove UNIQUE and FULLTEXT inline indexes

6783 12/11/2012 08:55 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: UNUSED: Comments: Added Redmine formatting

6782 12/11/2012 08:55 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: OMIT: Changed "is omitted" to "should be omitted", because the mappings specify suggestions rather than requirements as to how a field should be used

6781 12/11/2012 08:51 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Removed no longer used subInstitutionCode. Use datasource, institutionCode instead.

6780 12/11/2012 08:51 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_*: Renamed subInstitutionCode to institutionCode because this is the institution storing the specimen, as defined by DwC

6779 12/11/2012 08:45 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_*: Renamed institutionCode to datasource because this is actually the top-level datasource providing the record, not the institution storing the specimen

6778 12/11/2012 08:38 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Added datasource

6777 12/11/2012 08:32 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term

6776 12/11/2012 08:16 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: referenceType: Added closed list values

6775 12/11/2012 08:08 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: observationMeasure: Re-sourced to SALVIAS:observation_type, since SALVIAS comes before VegBIEN in the source precedence

6774 12/11/2012 08:01 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Renamed sampleType to observationType to match the SALVIAS term it's derived from

6773 12/11/2012 07:57 AM Aaron Marcuse-Kubitza

inputs/SALVIAS-CSV/Plot/map.csv: Mapped observation_type

6772 12/11/2012 07:37 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Added individualCount_*cm_or_more used by analytical_aggregate

6771 12/11/2012 07:10 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: subplotX/Y: Added definition

6770 12/11/2012 07:07 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: subplot*: Sources: Put SALVIAS:subplot last, because the specific field is closer in meaning to the term than the category

6769 12/11/2012 07:02 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: project*Date: Added definition

6768 12/11/2012 06:58 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: parentPlotName: Added definition

6767 12/11/2012 06:56 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: parent*: Sources: Put VegBank:PARENT_ID last, because the specific field is closer in meaning to the term than the category

6766 12/11/2012 06:40 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: locationName: Added definition

6765 12/11/2012 06:33 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: eventDate/startDate/endDate: Added definition

6764 12/11/2012 06:27 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: locationName: Sources: Put VegX:plotName first because it is closest in meaning to the term

6763 12/11/2012 06:22 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: recordedBy*: Added definition

6762 12/11/2012 06:15 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: recordedBy.middleName: Added source to DwC:recordedBy

6761 12/11/2012 06:01 AM Aaron Marcuse-Kubitza

inputs/CVS/: Joined together stemCount and stemLocation tables to create stemLocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)

6760 12/11/2012 05:32 AM Aaron Marcuse-Kubitza

inputs/CVS/: Joined together taxonImportance and stemCount tables to create stemCount_, because stemCount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename), and is thus an AggregateOccurrence-related table along with taxonImportance

6759 12/11/2012 05:30 AM Aaron Marcuse-Kubitza

inputs/CVS/: Joined together taxonImportance and stemCount tables to create stemCount_, because stemCount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename), and is thus an AggregateOccurrence-related table along with taxonImportance

6758 12/11/2012 04:41 AM Aaron Marcuse-Kubitza

inputs/CVS/taxonObservation_/map.csv: Fixed bug where the data needs to be marked as plots data to prevent the specimenreplicate ID from being forwarded to the location ID

6757 12/11/2012 04:38 AM Aaron Marcuse-Kubitza

inputs/VegBank/taxonobservation_/map.csv: Fixed bug where the data needs to be marked as plots data to prevent the specimenreplicate ID from being forwarded to the location ID

6756 12/11/2012 04:37 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Don't forward specimenreplicate IDs to location for plots data (where the specimenreplicate IDs apply only to the specimen)

6755 12/11/2012 04:31 AM Aaron Marcuse-Kubitza

xml_func.py: Simplifying functions: Added _eq()

6754 12/11/2012 04:31 AM Aaron Marcuse-Kubitza

xml_func.py: Added is_scalar()

6753 12/11/2012 04:30 AM Aaron Marcuse-Kubitza

xml_func.py: process(): row-based mode: preserving complex funcs: Fixed bug where functions with no params would crash reduce() because it requires at least one value when no initial value is specified
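
The underlying `reduce()` behavior can be reproduced in isolation. The sketch below is not the xml_func.py code; the helper names are hypothetical and `operator.add` stands in for whatever combining function is actually used.

```python
from functools import reduce
import operator

def fold_without_initial(values):
    """Raises TypeError when `values` is empty, since there is nothing to start from."""
    return reduce(operator.add, values)

def fold_with_initial(values, initial=0):
    """Returns `initial` when `values` is empty, so zero-parameter calls are safe."""
    return reduce(operator.add, values, initial)

print(fold_with_initial([]))  # 0
try:
    fold_without_initial([])
except TypeError as e:
    print("crashed:", e)
```

Supplying an initial value is one way to avoid the crash; handling the empty-params case before calling `reduce()` is another.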

6752 12/11/2012 04:28 AM Aaron Marcuse-Kubitza

Added scalar.py

6751 12/11/2012 03:39 AM Aaron Marcuse-Kubitza

Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data

6750 12/11/2012 03:33 AM Aaron Marcuse-Kubitza

inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)

6749 12/11/2012 03:29 AM Aaron Marcuse-Kubitza

inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)

6748 12/11/2012 03:18 AM Aaron Marcuse-Kubitza

inputs/VegBank/stemlocation/map.csv: Also mapped stemlocation_id to individualID to create one plantobservation for each stemobservation

6747 12/11/2012 03:15 AM Aaron Marcuse-Kubitza

inputs/VegBank/stemlocation/map.csv: Remapped stemcount_id to aggregateOccurrenceID to match stemcount_id's mapping in stemcount_

6746 12/11/2012 02:59 AM Aaron Marcuse-Kubitza

inputs/VegBank/: Joined together taxonimportance and stemcount tables to create stemcount_, because stemcount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename)

6745 12/11/2012 02:53 AM Aaron Marcuse-Kubitza

Added inputs/VegBank/_archive

6744 12/11/2012 02:50 AM Aaron Marcuse-Kubitza

input.Makefile: Testing: Added `%/test: %/test.xml` to allow testing just a subdir

6743 12/11/2012 02:42 AM Aaron Marcuse-Kubitza

input.Makefile: General targets: Added `%/: %/map.csv` to allow remaking just a subdirectory

6742 12/11/2012 01:53 AM Aaron Marcuse-Kubitza

inputs/CVS/: Refreshed data with new export from Bob

6741 12/11/2012 01:52 AM Aaron Marcuse-Kubitza

inputs/CVS/cvs-archive-2012-12-04.schema.sql: Fixed types using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>

6740 12/11/2012 01:48 AM Aaron Marcuse-Kubitza

bin/map: Removed column names simplification, which was causing columns with the same alphanumeric characters but different punctuation to be simplified to the same name. Name simplification is now performed by the mapping mechanism itself, and can be overridden in the mappings.
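
The collision described here is easy to reproduce. The sketch below assumes a simplifier that lowercases names and drops every non-alphanumeric character; the rule bin/map actually used is not shown in this log, and both function names are hypothetical.

```python
import re

def simplify(name):
    """Hypothetical simplifier: lowercase, then keep only letters and digits."""
    return re.sub(r"[^0-9a-z]", "", name.lower())

def find_collisions(columns):
    """Map each simplified name to the original columns that share it (2 or more)."""
    by_simple = {}
    for col in columns:
        by_simple.setdefault(simplify(col), []).append(col)
    return {k: v for k, v in by_simple.items() if len(v) > 1}

print(find_collisions(["plot_id", "plot.id", "elevation"]))
# {'plotid': ['plot_id', 'plot.id']}
```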

6739 12/11/2012 01:24 AM Aaron Marcuse-Kubitza

Regenerated inputs/VegBank/new_terms.csv

6738 12/11/2012 12:08 AM Aaron Marcuse-Kubitza

Added inputs/NCU/_src/NCU_specimens_public_2012-12-10.zip.url

6737 12/11/2012 12:04 AM Aaron Marcuse-Kubitza

inputs/NCU/: Refreshed data with new export from Bob

6736 12/10/2012 09:33 PM Aaron Marcuse-Kubitza

Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data

6735 12/10/2012 09:31 PM Aaron Marcuse-Kubitza

Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data

6734 12/10/2012 09:21 PM Aaron Marcuse-Kubitza

Added inputs/NCU-NCSC/_archive

6733 12/10/2012 09:21 PM Aaron Marcuse-Kubitza

input.Makefile: SVN: add: Also add _archive/ subdir

6732 12/10/2012 08:23 PM Aaron Marcuse-Kubitza

publish_analytical_db: Time the import of the data

6731 12/10/2012 08:17 PM Aaron Marcuse-Kubitza

export_analytical_db: Also create a .md5 for the export
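
Checking an export against its .md5 file could be done along these lines. This is a sketch under assumptions: the .md5 file is taken to follow the common `md5sum` layout (hex digest first, then the filename), and the function names are hypothetical rather than taken from export_analytical_db.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_md5(path, md5_path):
    """Compare the file's digest with the first token of the .md5 file."""
    with open(md5_path) as f:
        expected = f.read().split()[0].lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat, which matters for exports too large to load at once.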

6730 12/10/2012 08:16 PM Aaron Marcuse-Kubitza

export_analytical_db: Run commands in the root svn dir

6729 12/10/2012 08:05 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: soil composition terms: Removed ppm units from the definition, since units are actually fraction or percent

6728 12/10/2012 08:03 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Moved On local machine steps after On nimoy steps, because the On nimoy steps are more important

6727 12/10/2012 07:59 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Comments: Added quotes around quotations from other sources

6726 12/10/2012 07:56 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Definitions: Added quotes around quotations from other sources

6725 12/10/2012 07:52 PM Aaron Marcuse-Kubitza

Added backups/fix_perms

6724 12/10/2012 07:45 PM Aaron Marcuse-Kubitza

backups/Makefile: Synchronization: %/download: Also download any .md5 file for the file

6723 12/10/2012 07:24 PM Aaron Marcuse-Kubitza

README.TXT: Data import: On nimoy: Added instructions to verify the export's MD5 sum

6722 12/10/2012 07:23 PM Aaron Marcuse-Kubitza

README.TXT: Data import: On nimoy: Replaced step to manually upload the analytical_aggregate export with the command to download it from jupiter

6721 12/10/2012 07:18 PM Aaron Marcuse-Kubitza

README.TXT: Data import: On nimoy: Removed step to rename any existing analytical_aggregate table, since the import is now done directly into the versioned table

6720 12/10/2012 07:11 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: VegX terms without definitions in VegX: Added definitions from non-VegX sources, etc.

6719 12/10/2012 06:28 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Added instructions to verify the backups' MD5 sums on jupiter

6718 12/10/2012 06:23 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Removed step to copy backups to jupiter, because this is now done by `make backups/upload`

6717 12/10/2012 06:11 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sync_*_to_view(): Also add `GRANT SELECT TO bien_read` on the view used to generate the table, in case the permission was lost when the view was modified

6716 12/10/2012 06:08 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: sync_*_to_view(): Added `GRANT SELECT TO bien_read`

6715 12/10/2012 06:04 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_*: Added back bien_read's SELECT permissions, which had gotten removed when the tables were re-synced to their views

6714 12/10/2012 06:03 PM Aaron Marcuse-Kubitza

schemas/vegbien.my.sql: Regenerated with expanded repl word matching

6713 12/10/2012 06:00 PM Aaron Marcuse-Kubitza

repl: :-prefixing of words to form vars: Fixed bug where : must be matched as a lookbehind assertion, not a capturing group, because the provided regexp itself or its replacement may reference capturing groups, which it expects to be numbered starting with 1
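
The group-numbering problem can be shown with Python's `re` module. This is an illustration only, not the repl script itself; the pattern, replacement, and sample text are hypothetical.

```python
import re

# Caller-supplied pattern and replacement, written assuming groups start at 1
user_pattern = r"(\w+)_id"
user_repl = r"\1_key"

# Prefixing with a capturing group shifts the caller's group 1 to group 2
shifted = re.compile(r"(:)" + user_pattern)
# A lookbehind requires the ":" without capturing or consuming it
preserved = re.compile(r"(?<=:)" + user_pattern)

text = ":plot_id"
print(shifted.sub(user_repl, text))    # :_key      (\1 now refers to ":")
print(preserved.sub(user_repl, text))  # :plot_key  (\1 still refers to "plot")
```

A non-capturing group `(?:...)` would also keep the numbering intact, but it would consume the `:` so the replacement would have to re-emit it; the lookbehind avoids that.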

6712 12/10/2012 05:47 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6711 12/10/2012 05:47 PM Aaron Marcuse-Kubitza

Regenerated inputs/NY/Specimen/new_terms.csv

6710 12/07/2012 06:49 PM Aaron Marcuse-Kubitza

inputs/JBM/Specimen/test.xml.ref: Updated inserted row count, which had gotten changed when a test was run on a non-empty database