/ - Changes - BIEN 3 - NCEAS Projects

root @ 5947

#	Date	Author	Comment
5947	11/01/2012 09:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Renamed Binomial to TaxonName because this field can store more ranks than just the genus+specificEpithet binomial (that goes in speciesBinomial)
5946	11/01/2012 09:22 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Renamed taxonName to taxonNameOrEpithet for clarity
5945	11/01/2012 09:16 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Added growthForm, cultivated, cultivatedBasis from analytical_stem. Note that this will create multiple entries for a taxon name when some taxonoccurrences are marked as cultivated and others aren't, or different taxonoccurrences have different growthform entries.
5944	11/01/2012 09:01 AM	Aaron Marcuse-Kubitza	import_all: Added geoscrub import, which can happen concurrently with NCBI/TNRS but must come before the main datasources for the matched places to link up properly
5943	11/01/2012 08:59 AM	Aaron Marcuse-Kubitza	import_all: Documented that TNRS import must come after NCBI for cross links to be made
5942	11/01/2012 08:53 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%.sql: _MySQL/%.sql`, etc.: Only run if target does not exist, to avoid regenerating the target when a .make script which generates the target's prerequisite is checked out from svn (causing its mod time to be newer than the target)
5941	11/01/2012 08:24 AM	Aaron Marcuse-Kubitza	Added inputs/.geoscrub/
5940	11/01/2012 08:21 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: $(svnFilesGlob): Added /, _MySQL/ subdirs
5939	11/01/2012 08:20 AM	Aaron Marcuse-Kubitza	lib/common.Makefile: SVN: $(add*): Fixed bug where need to use --depth=empty to ensure directory contents are not added unless explicitly listed
5938	11/01/2012 08:09 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped georeferenceValid, latLongInCountry, latLongInStateProvince
5937	11/01/2012 08:05 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _and()
5936	11/01/2012 07:40 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped distanceToCountry_km, distanceToStateProvince_km
5935	11/01/2012 07:39 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added _km_to_m()
5934	11/01/2012 07:36 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added _km_to_m()
5933	11/01/2012 07:31 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: place: Added georeference_valid, distance_to_country_m, distance_to_state_m
5932	11/01/2012 07:18 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: matched taxonlabel's ancestors: Only create the cross links to NCBI if the name is accepted (taxonIsCanonical)
5931	11/01/2012 06:55 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: accepted* taxonomic terms: If is canonical name, don't populate a separate accepted taxonlabel because the taxonlabel itself is accepted. If is not canonical name, populate matched_label_id directly when the row is created instead of waiting until after all rows are created and using _taxonlabel_set_matched_label_id(). This is possible now that accepted taxonlabels occur only in TNRS's tnrs_accepted table, which is specially marked as taxonIsCanonical. Avoiding _taxonlabel_set_matched_label_id() should fix the slowdown that occurs when it's used on large tables.
5930	11/01/2012 06:48 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonIsCanonical: Moved then=0 into non-identifying _if statement attributes
5929	11/01/2012 06:41 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped taxonIsCanonical
5928	11/01/2012 06:41 AM	Aaron Marcuse-Kubitza	bin/map: map_rows(): map_table(): Fixed bug where metadata values were being removed prematurely, by passing them through
5927	11/01/2012 06:40 AM	Aaron Marcuse-Kubitza	bin/map: map_rows(): Fixed bug where metadata values were being passed to functions that expected columns, by placing them directly in the XML import tree and then removing them from the mappings
5926	11/01/2012 05:51 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: %/new_terms.csv: Filter out metadata values, which are not new terms even though they are in the same map column as the input columns
5925	11/01/2012 05:46 AM	Aaron Marcuse-Kubitza	inputs/.TNRS/tnrs_accepted/map.csv: Set taxonIsCanonical to true, using new metadata value syntax
5924	11/01/2012 05:40 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added taxonIsCanonical
5923	11/01/2012 05:37 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: $(svnFiles): Added _MySQL files that would otherwise be excluded with $(filter-out _% ...)
5922	11/01/2012 05:34 AM	Aaron Marcuse-Kubitza	inputs/*/: Placed all logs/ subdirs and import_order.txt under version control, using `make inputs/.../add`
5921	11/01/2012 05:31 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Factored list of files to add out into $(svnFiles) var
5920	11/01/2012 05:30 AM	Aaron Marcuse-Kubitza	Added inputs/.TNRS/tnrs/header.csv
5919	11/01/2012 05:28 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: $(svnFilesGlob): Added *.make
5918	11/01/2012 05:21 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Fixed bug where _-prefixed dirs (such as _MySQL) were incorrectly being included in the list of files to put under version control
5917	11/01/2012 05:15 AM	Aaron Marcuse-Kubitza	Calls to `make inputs/.TNRS/cleanup`: Do `make inputs/.TNRS/tnrs_accepted/reinstall; make inputs/.TNRS/tnrs_other/reinstall` instead to use new split TNRS tables
5916	11/01/2012 05:04 AM	Aaron Marcuse-Kubitza	inputs/.TNRS/schema.sql: Removed no longer needed tnrs_accepted_names index, now that the accepted names have been split into a separate table
5915	11/01/2012 05:00 AM	Aaron Marcuse-Kubitza	inputs/.TNRS/: Split tnrs table into tnrs_accepted and tnrs_other, with tnrs_accepted imported first. This will eventually allow TNRS data to be imported without needing to use _taxonlabel_set_matched_label_id(), which is extremely slow due to an unknown bug in PostgreSQL's handling of UPDATEs on large tables (possibly limited to UPDATEs inside functions).
5914	11/01/2012 05:00 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Fixed bug where logs/*.log.sql was incorrectly being included in the list of files to put under version control
5913	11/01/2012 04:53 AM	Aaron Marcuse-Kubitza	lib/common.Makefile: Filesystem: $(wildcard/): Fixed bug where {} exprs weren't handled correctly when some branches contained *, by using `shopt -s nullglob` and bash directly instead of trying to emulate it with grep
5912	11/01/2012 04:44 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Also provide the schema to create.sql in :schema
5911	11/01/2012 04:16 AM	Aaron Marcuse-Kubitza	bin/map: Added support for including literal metadata values in the map spreadsheet, by prefixing them with ':'
5910	11/01/2012 03:56 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added distanceToCountry_km, distanceToStateProvince_km
5909	11/01/2012 03:45 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added distanceToCountry_m, distanceToStateProvince_m
5908	11/01/2012 03:36 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped acceptedCountry, acceptedStateProvince, acceptedDecimalLatitude/Longitude. Mapped decimalLatitude/Longitude to matched place's coordinates when acceptedDecimalLatitude/Longitude not provided (as is the case for the geoscrub table).
5907	11/01/2012 03:22 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: $(svnFilesGlob): Added schema.sql
5906	11/01/2012 02:57 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Replaced georeferenceInvalid with latLongDomainValid, latLongDomainInvalid because this field is actually just whether the coordinates are in the range of valid decimal values, not whether they are within the named place reference. Re-sourced georeferenceValid to bien_web.observation#isGeovalid rather than isValidLatLong. VegCore-VegBIEN.csv: Replaced now-removed georeferenceInvalid with latLongInCountry for locationID redirection checks.
5905	11/01/2012 01:54 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Map locationID to place.placecode instead when geovalidation columns are provided
5904	11/01/2012 01:06 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: accepted* taxonlabel: Removed ancestor hierarchy because this is populated, in much greater detail, when the accepted name is imported as an input name and the TNRS-parsed components are available
5903	11/01/2012 12:55 AM	Aaron Marcuse-Kubitza	sql.py: distinct_table(): Use DISTINCT ON instead of a unique index and insert_select()'s ignore mode to remove duplicate rows. This uses whichever sorting method PostgreSQL deems to be fastest instead of requiring the use of a B-tree index. Since most of the slower operations in TNRS's import are distinct_table() calls, this should speed up the TNRS import, which is a bottleneck for the DB import as a whole because the TNRS import must complete before other datasources can be imported.
5902	11/01/2012 12:36 AM	Aaron Marcuse-Kubitza	sql.py: distinct_table(): Changed comment about distinct_on column index to include just the input table, so that the function does not guarantee a unique index on the output table's distinct_on columns
5901	11/01/2012 12:15 AM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added acceptedCountry, acceptedStateProvince, acceptedDecimalLatitude/Longitude
5900	10/31/2012 11:57 PM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Renamed latLongValid, latLongInvalid to georeferenceValid, georeferenceInvalid to correspond to DwC term georeferenceVerificationStatus
5899	10/31/2012 11:45 PM	Aaron Marcuse-Kubitza	mappings/VegCore.csv: Added latLongValid, latLongInvalid, latLongInCountry, latLongInStateProvince
5898	10/31/2012 11:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: Treat any .sql file whose name contains (not just ends with) "schema" as a schema file and sort it before other .sql files
5897	10/31/2012 10:17 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Functions containing UPDATE statements: Use quote_nullable() instead of quote_literal() to properly encode NULL values
5896	10/31/2012 10:10 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Functions containing UPDATE statements: Use PL/pgSQL's EXECUTE statement to avoid caching query plans. This is necessary because as the table grows over time, the optimal query plan may change.
5895	10/31/2012 10:05 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): ensure_cond(): When deleting rows rows that do not satisfy the condition, handle sql.DoesNotExistExceptions caused by columns in the condition that were not replaced with NULL. These occur when out_table is a function, and the columns of the table the condition relates to therefore can't be found using out_table.
5894	10/31/2012 09:59 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Calling function: Do not cache the function call, because it may be retried after error handling
5893	10/31/2012 09:58 PM	Aaron Marcuse-Kubitza	sql_gen.py: NotCond: Treat a condition that evaluates to NULL as false instead, so that the boolean effect of the condition is completely inverted
5892	10/31/2012 09:42 PM	Aaron Marcuse-Kubitza	sql_gen.py: null_as_str: Use new null instead of hardcoding 'NULL'
5891	10/31/2012 09:41 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added null
5890	10/31/2012 09:40 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): Give failed EXPLAIN approximately the log_level of its query, so that queries which produce an error in the EXPLAIN before the query itself is even run will still be logged
5889	10/31/2012 08:45 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): sql.DatabaseErrors: Factored exception-handling code out into handle_unknown_exc(), for use by other exception handlers
5888	10/31/2012 08:39 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): ensure_cond(): is_function: Fixed bug where can't replace out_table_cols with NULL because out_table is a function, not a table
5887	10/30/2012 04:59 PM	Aaron Marcuse-Kubitza	my2pg*: Turn off escape_string_warning because \-escaped strings are standard in MySQL
5886	10/30/2012 04:58 PM	Aaron Marcuse-Kubitza	my2pg.data: Turn off standard_conforming_strings like in my2pg
5885	10/30/2012 04:42 PM	Aaron Marcuse-Kubitza	my2pg: Also remove any CHARACTER SET modifier on a column definition
5884	10/30/2012 04:26 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Make size classes cumulative ranges (stems above a certain DBH) rather than bins, per Brad's request
5883	10/30/2012 04:26 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Make size classes cumulative ranges (stems above a certain DBH) rather than bins, per Brad's request
5882	10/30/2012 04:18 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Add header override files with any extension, not just .csv
5881	10/30/2012 04:15 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Replaced manual `svn add` commands with one `make inputs/<datasrc>/add` before committing to add newly-created files
5880	10/30/2012 04:00 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Also add any *.sql, when it's in a subdir**. This applies to create.sql, cleanup.sql, etc.
5879	10/30/2012 03:58 PM	Aaron Marcuse-Kubitza	lib/common.Makefile: SVN: Added $(add*)
5878	10/30/2012 03:55 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Also add any newly-created files which should be under version control
5877	10/30/2012 03:35 PM	Aaron Marcuse-Kubitza	input.Makefile: Fixed bug where _MySQL/%.sql files weren't being built from associated .make files by adding special `%.sql: .sql.make` rule to override `.sql: _MySQL/%.sql`
5876	10/30/2012 03:33 PM	Aaron Marcuse-Kubitza	input.Makefile: `%: .make`: Factored $(if $(wildcard $@)... test out into $(make_script) so all `: %.make`-like rules could use it directly
5875	10/30/2012 03:09 PM	Aaron Marcuse-Kubitza	lib/forwarding.Makefile: $(subdirs): Use all folders other than ../ ./ .svn/ instead of listing folders that start with . explicitly
5874	10/30/2012 02:31 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: Use accepted_taxonlabel.taxonomicname instead of accepted_taxonverbatim.taxonomicname in order to have the family prepended
5873	10/30/2012 12:41 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
5872	10/30/2012 12:38 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: placerank: Reordered in path order, using <http://rs.tdwg.org/dwc/terms/#dcindex> and <http://vegbank.org/vegbank/views/dba_fielddescription_detail.jsp?view=detail&wparam=1415&entity=dba_fielddescription&params=1415> as a guide. Documented the source of the values.
5871	10/30/2012 12:26 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: placename: Removed non-name-related fields, because placename is designed only to store a hierarchy of placenames, not additional place information
5870	10/30/2012 12:23 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Moved placedescription from placename to place (and renamed it to description), because it applies to the place itself, not the name for the place
5869	10/30/2012 12:16 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Added coverPercent, which is the sum of all coverPercents for that species
5868	10/30/2012 12:13 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Added coverPercent, which is the sum of all coverPercents for that species
5867	10/30/2012 12:03 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_aggregate_view: Include all analytical_stem species, not just those whose stems have non-NULL DBH
5866	10/30/2012 11:57 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed aggregated_analytical_db to analytical_aggregate to match the name of analytical_stem
5865	10/30/2012 11:55 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed analytical_db to analytical_stem since this contains just the individual stems, not the aggregated data in the main analytical DB
5864	10/30/2012 11:52 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed analytical_db to analytical_stem since this contains just the individual stems, not the aggregated data in the main analytical DB
5863	10/30/2012 11:38 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Removed no longer used locationcoords
5862	10/30/2012 11:35 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_db_view: Use new coordinates instead of locationcoords
5861	10/30/2012 11:23 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Remapped latitude/longitude to new coordinates table
5860	10/30/2012 11:15 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: place: Added coordinates_id
5859	10/30/2012 11:01 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added coordinates table
5858	10/30/2012 10:40 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: place: Removed municipality, site because they are not used in the geoscrubbing
5857	10/30/2012 10:19 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: place: Place custom hierarchy of placenames in placename table instead of in otherranks field
5856	10/30/2012 10:04 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: place.matched_place_id: Changed comment to say that places are linked in a three-level (instead of two-level) hierarchy of datasource place -> verbatim place -> accepted place, and this field contains the closest match
5855	10/30/2012 09:54 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed placepath to place since this contains primary information about the place, including the reference to the canonical place
5854	10/30/2012 09:42 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed place to placename since it refers just to a name for a place, without coordinates
5853	10/30/2012 07:18 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_db_view: Exclude original taxondeterminations, so that there is only one taxondetermination for each taxonoccurrence
5852	10/30/2012 07:03 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: make_analytical_db(): Also make new aggregated_analytical_db
5851	10/30/2012 07:02 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: sync_analytical_db_to_view(): DROP TABLE: Use IF EXISTS in case analytical_db table has already been deleted, or not yet created
5850	10/30/2012 07:01 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added aggregated_analytical_db_view and materialized table aggregated_analytical_db (synced using sync_aggregated_analytical_db_to_view())
5849	10/30/2012 07:01 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added aggregated_analytical_db_view and materialized table aggregated_analytical_db (synced using sync_aggregated_analytical_db_to_view())
5848	10/30/2012 06:56 AM	Aaron Marcuse-Kubitza	lib/PostgreSQL-MySQL.csv: custom types: Also match column names enclosed in ``

Project

General

Profile