/ - Changes - BIEN 3 - NCEAS Projects

root @ 6465

#	Date	Author	Comment
6465	11/26/2012 01:33 PM	Aaron Marcuse-Kubitza	make_analytical_db: mk_analytical_table(): Factored table references in different schemas out into vars
6464	11/25/2012 09:31 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
6463	11/25/2012 09:13 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
6462	11/25/2012 08:52 PM	Aaron Marcuse-Kubitza	make_analytical_db: Moved set -x () around just psql_verbose_vegbien so embedded $() expressions wouldn't also be in set -x (verbose) mode
6461	11/25/2012 08:49 PM	Aaron Marcuse-Kubitza	make_analytical_db: Fixed bug where need to use bash instead of sh because vegbien_dest requires it
6460	11/25/2012 08:37 PM	Aaron Marcuse-Kubitza	make_analytical_db: Factored analytical_* table creation code out into mk_analytical_table() function
6459	11/25/2012 08:28 PM	Aaron Marcuse-Kubitza	make_analytical_db: Create analytical_db views pointing to the analytical_* versions in the public schema
6458	11/25/2012 08:21 PM	Aaron Marcuse-Kubitza	vegbien_dest: $schemas: Removed analytical_db because views that will be added to it were shadowing public schema tables with the same names during population of those tables in make_analytical_db
6457	11/25/2012 07:47 PM	Aaron Marcuse-Kubitza	vegbien_dest: Export $public, to make sure it's available to any invoked scripts as an env var
6456	11/25/2012 07:45 PM	Aaron Marcuse-Kubitza	vegbien_dest: $schemas: Added analytical_db
6455	11/25/2012 07:38 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so previous imports had been silently truncated off. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.
6454	11/25/2012 07:20 PM	Aaron Marcuse-Kubitza	bin/map: in_is_db: by_col: Clearing errors table: Skip this if the table has been set to None because it didn't exist (and thus was a metadata-only map spreadsheet)
6453	11/25/2012 06:54 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the specific_epithet from the accepted_taxonverbatim rather than the parsed_taxonverbatim
6452	11/25/2012 06:45 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Include the family any time the genus is not specified, instead of just when accepted_taxonlabel.rank = 'family'. These should have the same effect since TNRS includes the rank, but using COALESCE is clearer.
6451	11/25/2012 06:41 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to also include morphospecies when just the family is specified
6450	11/25/2012 06:35 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: Fixed bug where location.authorlocationcode needed to be used as the plotName when location.sourceaccessioncode was not provided, to ensure that plotName would be NOT NULL
6449	11/25/2012 06:20 PM	Aaron Marcuse-Kubitza	inputs/FIA/import_order.txt: Fixed bug where FIA_COND_unique needed to be explicitly included in import_order.txt now that we're using import_order.txt to import the Source metadata table before the data tables
6448	11/25/2012 06:15 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated import times
6447	11/24/2012 03:07 PM	Aaron Marcuse-Kubitza	root Makefile: PostgreSQL: $(postgresReload-Linux): Try chmoding both as your user and as the bien user
6446	11/24/2012 02:46 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: $(runTest): Ignore failed diffs when the test is compared to another test's output (e.g. in by_col mode)
6445	11/24/2012 02:41 PM	Aaron Marcuse-Kubitza	bin/map: in_is_db: If table does not exist, set table to None so that db_xml.put_table() doesn't try to access it. This fixes a bug in metadata-only map spreadsheets under column-based import.
6444	11/24/2012 02:40 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Support None in_table by calling put() directly
6443	11/24/2012 02:29 PM	Aaron Marcuse-Kubitza	Removed no longer used geoscrub.*.sql. Use geoscrub_output instead.
6442	11/24/2012 02:27 PM	Aaron Marcuse-Kubitza	Removed no longer used geoscrub_cleaned_unique. Use geoscrub_output instead.
6441	11/24/2012 02:25 PM	Aaron Marcuse-Kubitza	Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
6440	11/24/2012 02:25 PM	Aaron Marcuse-Kubitza	Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
6439	11/24/2012 02:23 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: cultivated: Removed BIEN2's geoscrub_cultivated, which has now been replaced by the primary corresponding scripts (and never had particularly many matches to the locations in any case)
6438	11/24/2012 02:14 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: cultivated: Use OR instead of _or() to combine cultivated_family_locations.country IS NOT NULL with the other values, because this field's false value should not be used in place of NULL if all the other values are NULL, as it would be with _or(). (cultivated_family_locations.country IS NOT NULL can indicate presence, but not absence, of cultivated status.)
6437	11/24/2012 02:06 PM	Aaron Marcuse-Kubitza	schemas/functions.sql, vegbien.sql: _and(), _or(): Added comment comparing the function and the corresponding logical operator
6436	11/24/2012 01:50 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public: Added _or(), for use by analytical_stem_view
6435	11/24/2012 01:48 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: cultivated: Also set if family/country combination found in cultivated_family_locations
6434	11/24/2012 01:39 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: cultivated_family_locations: Added data from nimoy:/home/boyle/bien2/geoscrub/cultivated/cult_by_taxon/flag_by_taxa.inc
6433	11/24/2012 01:33 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added cultivated_family_locations to store locations where various taxon families are considered cultivated
6432	11/24/2012 01:24 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped locality description fields to location.iscultivated using _locationnarrative_is_cultivated()
6431	11/24/2012 01:23 PM	Aaron Marcuse-Kubitza	xml_func.py: Simplifying functions: Added passthru entries for _and, _or
6430	11/24/2012 01:06 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added _locationnarrative_is_cultivated()
6429	11/24/2012 12:57 PM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: Change text to varchar(255) because text columns can't be used in indexes in MySQL
6428	11/24/2012 12:51 PM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: Resaved in Excel, which removed unnecessary quotes around fields
6427	11/24/2012 12:22 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate: Added identifiedBy, which is no longer a scoping field (which would prevent scientificNameWithMorphospecies from being unique) now that there is only one taxondetermination for each taxonoccurrence
6426	11/24/2012 12:05 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: dateCollected: For plots data, use the locationevent obsstartdate instead of the collectiondate in order to group taxonoccurrences/stems from the same locationevent together
6425	11/24/2012 11:59 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_* pkeys: Added dateCollected because the records are actually unique within the locationevent, not the location
6424	11/24/2012 11:57 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: Exclude records with no collectiondate or obsstartdate, which is required to uniquely identify a record
6423	11/24/2012 11:54 AM	Aaron Marcuse-Kubitza	analytical_stem_view: dateCollected: Use locationevent.obsstartdate when aggregateoccurrence.collectiondate is not provided
6422	11/24/2012 11:37 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: Include only the current taxondetermination for each taxonoccurrence, to avoid cross-joining taxondeterminations with stems and thus multiplying the number of rows for datasources that have multiple taxondeterminations per taxonoccurrence
6421	11/24/2012 11:33 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination: Added AFTER trigger to set the current taxondetermination for the taxonoccurrence
6420	11/24/2012 11:11 AM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: Statements ending in ";": When matching any character, use .*? (with the (?s) flag) instead of [^;]* in order to allow embedded ; to be matched. This fixes a bug where a CREATE VIEW statement was not removed because it contained an embedded ; .
6419	11/24/2012 11:06 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination: Added unique index to ensure that there is only one current determination for each taxonoccurrence
6418	11/24/2012 11:05 AM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: Remove indexes with WHERE clauses
6417	11/24/2012 10:34 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies, recordNumber. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on most of the tables which provide them, and LEFT JOINed tables have their identifying fields combined to create a NOT NULL value.
6416	11/24/2012 10:27 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
6415	11/24/2012 10:23 AM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: Only match a statement-terminating ; when it's at the end of a line
6414	11/24/2012 10:02 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on the tables which provide them.
6413	11/24/2012 09:21 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): _setDefault(): Delay the evaluation of each col_default's value until the col_default is actually retrieved. This fixes a bug in the source table mappings where the explicit source entry was being created after the col_default source entry, causing the initial entry, which did not have the additional fields populated, to be used instead.
6412	11/24/2012 09:14 AM	Aaron Marcuse-Kubitza	dicts.py: Added WrapDict, a dict that runs a function on each value retrieved
6411	11/24/2012 08:59 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): _setDefault(): Fixed bug where need to copy col_defaults before calling update() on it, to avoid modifying the input value (which may be reused by the caller, expecting it to be unmodified)
6410	11/24/2012 08:54 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): col_defaults param: Fixed bug where need to use None as default value, because col_defaults will be modified by put() and the {} default value is a global instance
6409	11/24/2012 08:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: source table mappings: Set shortname to env var $source when it's not explicitly specified, because shortname is a required field of source
6408	11/24/2012 08:16 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): Pass through the values of nodes which are text nodes
6407	11/24/2012 08:15 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): put_(): Support setDefault() values which are text nodes, by passing text strings through when put() is run on all col_defaults entries
6406	11/24/2012 07:50 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
6405	11/24/2012 07:47 AM	Aaron Marcuse-Kubitza	dicts.py: DictProxy: Implemented delitem()
6404	11/24/2012 07:32 AM	Aaron Marcuse-Kubitza	bin/map: update_in_label(): Removed hardcoded source_id col_default, which is now set in mappings/VegCore-VegBIEN.csv's output root
6403	11/24/2012 07:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
6402	11/24/2012 07:25 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): Added _setDefault() built-in function, which adds an entry to col_defaults
6401	11/24/2012 07:23 AM	Aaron Marcuse-Kubitza	xml_func.py: _env(): Fixed bug where need to retrieve actual string value of name param using xml_dom.NodeTextEntryIter instead of NodeEntryIter
6400	11/24/2012 07:20 AM	Aaron Marcuse-Kubitza	xml_func.py: _env(): Fixed bug where need to use xml_dom.replace_with_text() instead of xml_dom.replace() because replace() requires a DOM node
6399	11/24/2012 06:44 AM	Aaron Marcuse-Kubitza	bin/map: update_in_label(): Set $source env var to the in_label (datasource name), to make it available to _env()
6398	11/24/2012 06:43 AM	Aaron Marcuse-Kubitza	xml_func.py: Simplifying functions: Added _env()
6397	11/24/2012 06:05 AM	Aaron Marcuse-Kubitza	Added inputs/VegBank/Source/, containing referenceType metadata
6396	11/24/2012 06:00 AM	Aaron Marcuse-Kubitza	Added inputs/SpeciesLink/Source/, containing referenceType metadata
6395	11/24/2012 05:55 AM	Aaron Marcuse-Kubitza	Added inputs/SALVIAS*/Source/, containing referenceType metadata
6394	11/24/2012 05:47 AM	Aaron Marcuse-Kubitza	Added inputs/REMIB/Source/, containing referenceType metadata
6393	11/24/2012 05:41 AM	Aaron Marcuse-Kubitza	Added inputs/GBIF/Source/, containing referenceType metadata
6392	11/24/2012 05:34 AM	Aaron Marcuse-Kubitza	Added inputs/TEAM/Source/, containing referenceType metadata
6391	11/24/2012 05:33 AM	Aaron Marcuse-Kubitza	Placed inputs/TEAM/_src/Vegetation-Tree-and-Liana-Metadata-1.5.pdf under version control
6390	11/24/2012 05:27 AM	Aaron Marcuse-Kubitza	inputs/FIA/import_order.txt: Added Source, which needs to come before Organism
6389	11/24/2012 05:22 AM	Aaron Marcuse-Kubitza	Added inputs/Madidi/Source/, containing referenceType metadata
6388	11/24/2012 05:19 AM	Aaron Marcuse-Kubitza	Added inputs/FIA/Source/, containing referenceType metadata
6387	11/24/2012 05:14 AM	Aaron Marcuse-Kubitza	Added inputs/CVS/Source/, containing referenceType metadata
6386	11/24/2012 05:07 AM	Aaron Marcuse-Kubitza	Added inputs/CTFS/Source/, containing referenceType metadata
6385	11/24/2012 05:05 AM	Aaron Marcuse-Kubitza	bin/map: Support map spreadsheets containing only metadata mappings (with no corresponding staging table), by falling back to an empty table when the named table does not exist
6384	11/24/2012 04:19 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: institutionCode: Also map to the sourcename's matched source, which identifies whether the source is a herbarium
6383	11/24/2012 04:08 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: source: Made shortname NOT NULL to ensure that all datasources have a globally-unique short name
6382	11/24/2012 03:33 AM	Aaron Marcuse-Kubitza	import_all: Added import of inputs/.herbaria/ before the main import
6381	11/24/2012 03:28 AM	Aaron Marcuse-Kubitza	Added inputs/.herbaria/
6380	11/24/2012 03:25 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Also run %/add on all data subdirs
6379	11/24/2012 03:21 AM	Aaron Marcuse-Kubitza	input.Makefile: Existing maps discovery: Moved tables discovery to its own section, above SVN so it can be used by SVN
6378	11/24/2012 03:11 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: referenceType: Fixed sort order
6377	11/24/2012 03:09 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped referenceType
6376	11/24/2012 03:06 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added referenceType
6375	11/24/2012 02:10 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: institutionCode: Remap to source.shortname when specimen information is not provided, as is the case for geoscrub.herbaria on nimoy
6374	11/24/2012 01:47 AM	Aaron Marcuse-Kubitza	inputs/bien_web/observation/map.csv: Mapped observationID->occurrenceID
6373	11/24/2012 01:20 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Add input data for each table present in the datasource: Added step to run `make inputs/<datasrc>/<table>/install` if the table is in a .sql export
6372	11/24/2012 01:17 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: MySQL inputs: Added step to install the export, which needs to happen before mapping individual tables
6371	11/24/2012 01:13 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Add input data for each table present in the datasource: Replaced "CSV" with "CSV" because there can be multiple CSV part files for one table
6370	11/24/2012 01:11 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Add input data for each table present in the datasource: Don't add a CSV or create.sql file for tables that are in a .sql export
6369	11/24/2012 01:06 AM	Aaron Marcuse-Kubitza	README.TXT: Schema changes: Sync ERD with vegbien.sql schema: Changed instructions to just select tables with arrows next to them rather than all tables, because each table that's updated will have its lines reset and the number of lines that need to be fixed should be minimized
6368	11/24/2012 01:02 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Accept the test cases: `make inputs/<datasrc>/test by_col=1`: Clarified that errors could indicate bugs in the VegBIEN unique constraints
6367	11/24/2012 12:59 AM	Aaron Marcuse-Kubitza	README.TXT: Data import: To remake analytical DB: Added explicit public schema setting since the analytical DB is often manually remade after the public schema has been renamed. Removed warnings that certain commands must be run after running make_analytical_db, because the "remake analytical DB" instructions no longer require this.
6366	11/24/2012 12:48 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: MySQL inputs: Added steps to export the database to a PostgreSQL-compatible .sql file, which can be directly used by the install process without the need to export each table as CSV
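
Revisions 6410 and 6411 describe two related Python pitfalls: a {} default argument is one shared instance across calls, and calling update() on a caller-supplied dict modifies the caller's copy. The following is a minimal sketch of both, not the actual db_xml.py code; put_buggy(), put_fixed(), and the row parameter are hypothetical names used only for illustration.

```python
def put_buggy(row, col_defaults={}):
    # The {} default is created once, when the function is defined,
    # so every call that omits col_defaults shares the same dict.
    col_defaults.update(row)
    return col_defaults


def put_fixed(row, col_defaults=None):
    # Use None as the default and build a fresh dict on each call.
    if col_defaults is None:
        col_defaults = {}
    # Copy before update() so the caller's dict is left unmodified.
    col_defaults = dict(col_defaults)
    col_defaults.update(row)
    return col_defaults


if __name__ == "__main__":
    put_buggy({"a": 1})
    print(put_buggy({"b": 2}))  # also contains "a", leaked from the first call

    caller_defaults = {"source_id": 1}
    print(put_fixed({"a": 1}, caller_defaults))
    print(caller_defaults)  # unchanged by put_fixed()
```

The None default plus an explicit copy keeps each call independent, which matters when the caller reuses the same defaults dict across several calls.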
