/ - Changes - BIEN 3 - NCEAS Projects

root @ 4541

#	Date	Author	Comment
4541	09/10/2012 09:23 PM	Aaron Marcuse-Kubitza	inputs/VegBank/plantconcept_/header.csv: Updated for new renames in vegbank.~.clean_up.sql
4540	09/10/2012 09:11 PM	Aaron Marcuse-Kubitza	inputs/VegBank/plantconcept_/create.sql: Use new plantconcept_plantnames()
4539	09/10/2012 09:09 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.utils.sql: plantconcept_plantnames(): Use SQL SELECT query and WITH clause (http://www.postgresql.org/docs/8.4/static/queries-with.html) instead of temp table, because PostgreSQL does not support using temp tables inside functions that are called repeatedly (http://archives.postgresql.org/pgsql-general/2006-02/msg00516.php; it results in an "out of shared memory" error)
4538	09/10/2012 08:30 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.utils.sql: Removed hardcoded schema name, which is set dynamically by input.Makefile using `SET search_path`
4537	09/10/2012 08:26 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.utils.sql: Added plantconcept_plantnames()
4536	09/10/2012 07:28 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.utils.sql: plantconcept_ancestors(): Made function STABLE instead of IMMUTABLE because it accesses DB tables
4535	09/10/2012 07:21 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.clean_up.sql: Fixed bug where the original plantconcept table's columns needed to be renamed, rather than the derived table plantconcept_'s. Note that this script runs before any derived tables are created, so this would be the wrong place for these statements if the derived table's columns did need to be renamed.
4534	09/10/2012 07:05 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: $(dbExports): Sort each group of .sql files in lexical order, since $(wildcard) apparently does not sort them that way automatically on vegbiendev
4533	09/10/2012 06:53 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import. Corrected input row count of CTFS.TaxonOccurrence, which had been set to the inserted row count (which is right above it in the log file).
4532	09/10/2012 06:35 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonrank: Added comment documenting source of values
4531	09/07/2012 04:57 PM	Aaron Marcuse-Kubitza	inputs/VegBank/taxonobservation_/map.csv: Mapped observation_id to eventID
4530	09/07/2012 04:49 PM	Aaron Marcuse-Kubitza	inputs/TEAM/: Added VL
4529	09/07/2012 04:43 PM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added taxonobservation_/
4528	09/07/2012 04:43 PM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added plantconcept_/
4527	09/07/2012 04:22 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Ignore errors if create.sql already added a primary key
4526	09/07/2012 04:12 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Provide the table name as a var (:table) to the query
4525	09/07/2012 03:56 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.clean_up.sql: Prevent "column name specified more than once" errors when tables are joined
4524	09/07/2012 03:55 PM	Aaron Marcuse-Kubitza	to_do/timeline.doc: Updated to reflect additional time that validations will take, and analytical DB's dependency on it
4523	09/07/2012 02:54 PM	Aaron Marcuse-Kubitza	Added validation/
4522	09/07/2012 12:56 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Time the install
4521	09/07/2012 12:54 PM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added plantconcept_/
4520	09/07/2012 12:35 PM	Aaron Marcuse-Kubitza	inputs/VegBank/vegbank.~.utils.sql: plantconcept_ancestors(): Renamed ancestor_id output param to plantconcept_id for clarity and so it can be directly USING-joined with plantconcept on plantconcept_id
4519	09/07/2012 12:24 PM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added vegbank.~.utils.sql (which runs after vegbank.sql), for use by tables' create.sql scripts
4518	09/07/2012 10:57 AM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4517	09/07/2012 10:43 AM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added observation_/
4516	09/07/2012 10:31 AM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added vegbank.~.clean_up.sql (which runs after vegbank.sql), to prevent "cannot alter type of a column used by a view or rule" errors
4515	09/07/2012 10:14 AM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added plot_/
4514	09/07/2012 10:13 AM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added plot_/
4513	09/07/2012 10:13 AM	Aaron Marcuse-Kubitza	inputs/VegBank/: Added logs
4512	09/07/2012 10:12 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Log the output to the install log, just like for other %/install targets
4511	09/07/2012 10:06 AM	Aaron Marcuse-Kubitza	vegbien_dest: schemas: Added public explicitly, even though it's already in the default search_path, in order to shadow any datasource's tables of the same name as a VegBIEN table (such as in VegBank). (VegBIEN tables are referenced without a schema, while datasource tables are referenced with a schema, so collisions are not a problem after this fix.)
4510	09/07/2012 09:55 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: sql/install: Fixed bug where needed space before \ at end of line, because one is not automatically added in a recipe command (although it's added elsewhere)
4509	09/07/2012 09:51 AM	Aaron Marcuse-Kubitza	sql.py: run_query(): DuplicateException: Also match "of relation" part of error message, so that parsed column name does not contain "of relation"
4508	09/07/2012 09:24 AM	Aaron Marcuse-Kubitza	subtract: Made it case- and punctuation-insensitive
4507	09/07/2012 09:18 AM	Aaron Marcuse-Kubitza	mappings/: Removed no longer needed Veg+.cs-VegBIEN.csv, which is now the same as Veg+-VegBIEN.csv which was derived from it
4506	09/07/2012 09:16 AM	Aaron Marcuse-Kubitza	join: Documented that it's case- and punctuation-insensitive.
4505	09/07/2012 09:16 AM	Aaron Marcuse-Kubitza	bin/map: map_table(): Refactored to map simplified to original column names first and then determine column index for each original name, in order to avoid trying to recover the original name from a simplified name where multiple original names might collide onto the same simplified name. Documented that it's case- and punctuation-insensitive.
4504	09/07/2012 09:11 AM	Aaron Marcuse-Kubitza	intersect, union: Made case- and punctuation-insensitive. mappings/Veg+-VegBIEN.csv: Removed no longer needed duplicate entries for each first letter case, which must now be removed for case- and punctuation-insensitive intersect/union to work. Note that the SpeciesLink `svn diff` hides _alt entry 0, which contains one of the removed duplicate columns that appears in the diff.
4503	09/07/2012 08:42 AM	Aaron Marcuse-Kubitza	bin/map: map_table(): Resolve all mappings and prefixes after applying maps.simplify()
4502	09/07/2012 08:37 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: _alt all scientificNameAuthorship synonyms together in one _alt
4501	09/07/2012 08:27 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: _alt(): Added extra numbered parameters. Eventually these will need to be converted to variadic args, but this will require special support from column-based import.
4500	09/07/2012 07:26 AM	Aaron Marcuse-Kubitza	join: Use new maps.simplify()
4499	09/07/2012 07:26 AM	Aaron Marcuse-Kubitza	maps.py: Added simplify()
4498	09/07/2012 07:23 AM	Aaron Marcuse-Kubitza	join: Match terms with non-alphanumeric chars removed
4497	09/07/2012 07:15 AM	Aaron Marcuse-Kubitza	join: Match terms case-insensitively
4496	09/06/2012 11:17 PM	Aaron Marcuse-Kubitza	Added inputs/TEAM/
4495	09/06/2012 10:55 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Creating the into table: into_out_pkey: If is_function, just use "result" as the output column name, without prefixing the function name. This shortens the table names of function calls on function calls, which need a fixed column name to detect which columns are function results and use just the table names for those columns.
4494	09/06/2012 10:32 PM	Aaron Marcuse-Kubitza	input.Makefile: Documentation: $(steps): Fixed bug where import make target needed to be changed to new single-table import target
4493	09/06/2012 09:38 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_db_view: Changed LEFT JOINs to JOINs where tables contain information that's required for the analytical DB. This should also enable the PostgreSQL query planner to make additional join optimizations, in the hopes of avoiding disk-space-intensive hash joins.
4492	09/06/2012 08:42 PM	Aaron Marcuse-Kubitza	Replaced repr() with strings.urepr() (or equivalent) everywhere needed, to avoid future UnicodeEncodeErrors
4491	09/06/2012 08:30 PM	Aaron Marcuse-Kubitza	Replaced str() with strings.ustr() (or equivalent) everywhere needed, to avoid future UnicodeEncodeErrors
4490	09/06/2012 08:03 PM	Aaron Marcuse-Kubitza	sql.py: map_expr(): Replacing without quotes: Don't match unquoted name where it's preceded or followed by '.', because this could be a '.' embedded in a punctuation-containing column name, such as those frequently used by column-based import. Note that because database-internal names currently do not contain punctuation, this situation only occurs when a database-internal expression (such as a check constraint condition) is replaced in two steps, and the first step introduces punctuation-containing column names into the expression.
4489	09/06/2012 07:19 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: project: Don't require projectname to be specified when sourceaccessioncode is provided
4488	09/06/2012 07:14 PM	Aaron Marcuse-Kubitza	sql_gen.py: ensure_not_null(): If type_ is set, cast the column to it if needed
4487	09/06/2012 06:56 PM	Aaron Marcuse-Kubitza	README.TXT: Data import: Added testing steps to perform on local machine before running the import
4486	09/06/2012 06:49 PM	Aaron Marcuse-Kubitza	README.TXT: Documentation: Redmine-formatted list of steps for column-based import: Updated make command for new table subdir name
4485	09/06/2012 06:27 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): Parse "types cannot be matched" error as MissingCastException to type text
4484	09/06/2012 06:10 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Creating the into table: Fixed bug where in_pkey and out_pkey names would collide if the output and input pkeys have the same name (as is the case for SALVIAS.projects). This entails changing out_pkey to new into_out_pkey wherever the into table's out_pkey is created or referenced.
4483	09/06/2012 05:06 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Combining output and input pkeys in inserted order: Changed sql_gen.Table to sql_gen.Col when creating the column references (they have a similar effect, so using the wrong type did not cause any tests to fail)
4482	09/06/2012 04:49 PM	Aaron Marcuse-Kubitza	README.TXT: Added steps before the import to `svn up` and update the schemas
4481	09/06/2012 04:47 PM	Aaron Marcuse-Kubitza	README.TXT: Merged Backups > After a new import and Data import sections into one Data import section that contains the steps to perform and back up an import. Note that many `svn diff` lines result from a change in indentation.
4480	09/06/2012 04:35 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Combining output and input pkeys in inserted order: Fixed bug where column references would be ambiguous if the output and input pkeys have the same name (as is the case for SALVIAS.projects)
4479	09/06/2012 04:21 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _nullIf() overload where the type param has type text, to handle cases where row-based import auto-casts all args to text in response to a 'could not determine polymorphic type because input has type "unknown"' error
4478	09/06/2012 04:18 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: party: Removed party_datasource unique index because it was causing problems with column-based import (due to multiple unique indexes covering the same columns in different ways), and because it prevented creation of more than one party per organization
4477	09/06/2012 03:54 PM	Aaron Marcuse-Kubitza	xml_func.py: _if(): Documented that it must be run to remove conditions that functions._if() can't handle
4476	09/06/2012 03:42 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Testing: Added step to test column-based import (by_col=1), because it is stricter about types than row-based import and sometimes fails when row-based import succeeds
4475	09/05/2012 09:18 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: _nullIf(): Polymorphically support other datatypes besides text
4474	09/05/2012 09:09 AM	Aaron Marcuse-Kubitza	bin/map: Clearing errors table: Fixed bug where needed to check if sql_io.errors_table() returned None (indicating that the errors table didn't exist) before calling sql.drop_table()
4473	09/05/2012 09:04 AM	Aaron Marcuse-Kubitza	bin/map: Clearing errors table: Fixed bug where needed to use sql.drop_table() instead of sql.truncate() now that errors tables are not created until column-based import runs
4472	09/05/2012 08:54 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: $(missingMappingsCmd): Fixed bug where need to use system's sort, not bin/sort, now that bin/ is added to the PATH by this makefile
4471	09/05/2012 08:34 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/verify/plots.ref: Regenerated on PostgreSQL staging tables. The orders have changed slightly because this is derived from a PostgreSQL translation of the queries, with corresponding changes in collations and NULL sort orders. The counts have also changed slightly, possibly due to the changes Brad made to the salvias_plots database on nimoy after the initial version was downloaded. (The current counts are correct according to the current salvias_plots database.)
4470	09/05/2012 08:31 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/verify/plots.ref.sql: # locations: Fixed bug where a NULL value in LatDec or LongDec would propagate to the concatenated value, reducing its uniqueness
4469	09/05/2012 08:14 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/verify/plots.ref.sql: Retrofitted to work with PostgreSQL staging tables
4468	09/05/2012 07:51 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: project: Added project_unique_name_date unique index for projects that don't have a sourceaccessioncode
4467	09/05/2012 07:46 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/plotMetadata/map.csv: Remapped project_id to project.sourceaccessioncode
4466	09/05/2012 07:37 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/: Added projects/
4465	09/05/2012 07:32 AM	Aaron Marcuse-Kubitza	input.Makefile: Sources: $(catSrcs): Fixed bug where needed to use cat_csv even if subdir was not actually a CSV table, because this also cats the header.csv file created for a subdir that references an already-installed staging table
4464	09/05/2012 07:26 AM	Aaron Marcuse-Kubitza	input.Makefile: Existing maps discovery: Fixed bug where top-level logs dir needed to be excluded from list of subdirs that are treated as tables
4463	09/05/2012 07:00 AM	Aaron Marcuse-Kubitza	my2pg: Prepend 'SET standard_conforming_strings = off;' because this defaults to on starting with PostgreSQL 9.1
4462	09/05/2012 06:41 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: locationevent: Made location_id optional when sourceaccessioncode is provided, since a sourceaccessioncode is globally unique and does not require a location to scope it
4461	09/05/2012 06:36 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: Store install logs for full-DB exports in new logs subdir of main dir. This also fixes a bug where the install log itself was considered a DB export, because its extension was .log.sql.
4460	09/05/2012 06:33 AM	Aaron Marcuse-Kubitza	Added inputs/SALVIAS/logs/
4459	09/05/2012 06:33 AM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Also add logs subdir of main dir, to store install logs for full-DB exports
4458	09/05/2012 06:23 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: if subplot: Also forward locationID and plotName to the location of the parent locationevent (in addition to the parent location of the location), in order to "complete the diamond" connecting subplot locationevent -> (parent plot locationevent, subplot location) -> parent plot location
4457	09/05/2012 06:09 AM	Aaron Marcuse-Kubitza	sql_io.py: cleanup_table(): NullValueException: Log the caught exception so it's clear that the update is being retried
4456	09/05/2012 06:05 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Fixed bug where $(if $(isRef)) needed to be checked before $(if $(nonXml)) because a subdir referencing an already-installed staging table must be treated specially by ignoring its autogenerated header.csv file, and not trying to install that file as if it were itself CSV data
4455	09/05/2012 05:49 AM	Aaron Marcuse-Kubitza	my2pg, my2pg.data: Fixed bug where replacement for '0000-00-00' date needed to be wrapped in single quotes
4454	09/05/2012 05:45 AM	Aaron Marcuse-Kubitza	input.Makefile: sql/install: Log the installation of a full-DB export to a log file in the main dir
4453	09/05/2012 05:38 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Factored out stderr logging into $(logInstall)
4452	09/05/2012 05:35 AM	Aaron Marcuse-Kubitza	input.Makefile: Support empty subdirs referencing an already-installed staging table everywhere, by replacing $(isCsv) with new $(nonXml) where needed
4451	09/05/2012 05:22 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/: Switched to using the DB export's staging tables instead of the exported CSVs
4450	09/05/2012 05:08 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: Treat empty subdirs as referencing an already-installed staging table, and run cleanup and header export operations on them
4449	09/05/2012 04:48 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Factored out cleanup and header export operations for reuse in other types of table subdirs
4448	09/05/2012 04:23 AM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: `%/install: %/create.sql`: Removed deprecated (but benign) errors_table_only option to csv2db. Run csv2db without a command in order to clean up the created staging table.
4447	09/05/2012 03:57 AM	Aaron Marcuse-Kubitza	sql_io.py: cleanup_table(): Removed no longer used cols param
4446	09/05/2012 03:56 AM	Aaron Marcuse-Kubitza	csv2db: When no command is specified, just clean up the specified table
4445	09/05/2012 03:55 AM	Aaron Marcuse-Kubitza	sql_io.py: cleanup_table(): Always clean up all columns in the table
4444	09/05/2012 03:43 AM	Aaron Marcuse-Kubitza	sql_io.py: cleanup_table(): Handle NullValueExceptions (due to setting values to NULL in a NOT NULL column) by dropping the NOT NULL constraint
4443	09/05/2012 03:32 AM	Aaron Marcuse-Kubitza	sql.py: Added drop_not_null()
4442	09/05/2012 03:29 AM	Aaron Marcuse-Kubitza	sql_gen.py: is_text_col(): Also consider character varying to be a text type

Project

General

Profile