# Date Author Comment
4485 09/06/2012 06:27 PM Aaron Marcuse-Kubitza

sql.py: run_query(): Parse "types cannot be matched" error as MissingCastException to type text

4484 09/06/2012 06:10 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): Creating the into table: Fixed bug where in_pkey and out_pkey names would collide if the output and input pkeys have the same name (as is the case for SALVIAS.projects). This entails changing out_pkey to new into_out_pkey wherever the into table's out_pkey is created or referenced.

4483 09/06/2012 05:06 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): Combining output and input pkeys in inserted order: Changed sql_gen.Table to sql_gen.Col when creating the column references (they have a similar effect, so using the wrong type did not cause any tests to fail)

4482 09/06/2012 04:49 PM Aaron Marcuse-Kubitza

README.TXT: Added steps before the import to `svn up` and update the schemas

4481 09/06/2012 04:47 PM Aaron Marcuse-Kubitza

README.TXT: Merged Backups > After a new import and Data import sections into one Data import section that contains the steps to perform and back up an import. Note that many `svn diff` lines result from a change in indentation.

4480 09/06/2012 04:35 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): Combining output and input pkeys in inserted order: Fixed bug where column references would be ambiguous if the output and input pkeys have the same name (as is the case for SALVIAS.projects)

4479 09/06/2012 04:21 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _nullIf() overload where the type param has type text, to handle cases where row-based import auto-casts all args to text in response to a 'could not determine polymorphic type because input has type "unknown"' error

4478 09/06/2012 04:18 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: party: Removed party_datasource unique index because it was causing problems with column-based import (due to multiple unique indexes covering the same columns in different ways), and because it prevented creation of more than one party per organization

4477 09/06/2012 03:54 PM Aaron Marcuse-Kubitza

xml_func.py: _if(): Documented that it must be run to remove conditions that functions._if() can't handle

4476 09/06/2012 03:42 PM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: Testing: Added step to test column-based import (by_col=1), because it is stricter about types than row-based import and sometimes fails when row-based import succeeds

4475 09/05/2012 09:18 AM Aaron Marcuse-Kubitza

schemas/functions.sql: _nullIf(): Polymorphically support other datatypes besides text

4474 09/05/2012 09:09 AM Aaron Marcuse-Kubitza

bin/map: Clearing errors table: Fixed bug where needed to check if sql_io.errors_table() returned None (indicating that the errors table didn't exist) before calling sql.drop_table()

4473 09/05/2012 09:04 AM Aaron Marcuse-Kubitza

bin/map: Clearing errors table: Fixed bug where needed to use sql.drop_table() instead of sql.truncate() now that errors tables are not created until column-based import runs

4472 09/05/2012 08:54 AM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(missingMappingsCmd): Fixed bug where the system's sort needed to be used instead of bin/sort, now that bin/ is added to the PATH by this makefile

4471 09/05/2012 08:34 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify/plots.ref: Regenerated on PostgreSQL staging tables. The orders have changed slightly because this is derived from a PostgreSQL translation of the queries, with corresponding changes in collations and NULL sort orders. The counts have also changed slightly, possibly due to the changes Brad made to the salvias_plots database on nimoy after the initial version was downloaded. (The current counts are correct according to the current salvias_plots database.)

4470 09/05/2012 08:31 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify/plots.ref.sql: # locations: Fixed bug where a NULL value in LatDec or LongDec would propagate to the concatenated value, reducing its uniqueness
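
The NULL propagation described in this entry can be reproduced in a small sketch. It uses SQLite as a stand-in for the PostgreSQL staging tables (both return NULL when `||` has a NULL operand); the `plots` table and its sample rows are hypothetical, and COALESCE is only one common remedy, not necessarily the one used in plots.ref.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the LatDec and LongDec column names come from the log entry
conn.execute("CREATE TABLE plots (LatDec TEXT, LongDec TEXT)")
conn.executemany(
    "INSERT INTO plots VALUES (?, ?)",
    [("10.5", None), ("20.1", None), ("10.5", "-66.2")],
)

# Plain concatenation: a NULL operand makes the whole value NULL,
# so the first two (different) locations collapse into the same value
naive = [row[0] for row in conn.execute(
    "SELECT LatDec || ',' || LongDec FROM plots ORDER BY rowid")]

# Replacing NULL with an empty string keeps the rows distinguishable
safe = [row[0] for row in conn.execute(
    "SELECT COALESCE(LatDec, '') || ',' || COALESCE(LongDec, '') "
    "FROM plots ORDER BY rowid")]

print(len(set(naive)), "distinct values without COALESCE")
print(len(set(safe)), "distinct values with COALESCE")
```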

4469 09/05/2012 08:14 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify/plots.ref.sql: Retrofitted to work with PostgreSQL staging tables

4468 09/05/2012 07:51 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: project: Added project_unique_name_date unique index for projects that don't have a sourceaccessioncode

4467 09/05/2012 07:46 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/plotMetadata/map.csv: Remapped project_id to project.sourceaccessioncode

4466 09/05/2012 07:37 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/: Added projects/

4465 09/05/2012 07:32 AM Aaron Marcuse-Kubitza

input.Makefile: Sources: $(catSrcs): Fixed bug where needed to use cat_csv even if subdir was not actually a CSV table, because this also cats the header.csv file created for a subdir that references an already-installed staging table

4464 09/05/2012 07:26 AM Aaron Marcuse-Kubitza

input.Makefile: Existing maps discovery: Fixed bug where top-level logs dir needed to be excluded from list of subdirs that are treated as tables

4463 09/05/2012 07:00 AM Aaron Marcuse-Kubitza

my2pg: Prepend 'SET standard_conforming_strings = off;' because this defaults to on starting with PostgreSQL 9.1
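
As a rough illustration of the prepending step, the sketch below adds the setting to the front of a dump so that backslashes in ordinary string literals are again treated as escapes, as a MySQL export expects. The function name `with_preamble` and the sample dump line are hypothetical; my2pg itself may implement this differently.

```python
PREAMBLE = "SET standard_conforming_strings = off;\n"

def with_preamble(sql_text):
    """Return sql_text with the setting prepended, without adding it twice."""
    if sql_text.startswith(PREAMBLE):
        return sql_text
    return PREAMBLE + sql_text

# Hypothetical dump line containing a MySQL-style backslash escape
dump = "INSERT INTO t VALUES ('O\\'Brien');\n"
translated = with_preamble(dump)
print(translated, end="")
```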

4462 09/05/2012 06:41 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: locationevent: Made location_id optional when sourceaccessioncode is provided, since a sourceaccessioncode is globally unique and does not require a location to scope it

4461 09/05/2012 06:36 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Store install logs for full-DB exports in new logs subdir of main dir. This also fixes a bug where the install log itself was considered a DB export, because its extension was .log.sql.

4460 09/05/2012 06:33 AM Aaron Marcuse-Kubitza

Added inputs/SALVIAS/logs/

4459 09/05/2012 06:33 AM Aaron Marcuse-Kubitza

input.Makefile: SVN: add: Also add logs subdir of main dir, to store install logs for full-DB exports

4458 09/05/2012 06:23 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: if subplot: Also forward locationID and plotName to the location of the parent locationevent (in addition to the parent location of the location), in order to "complete the diamond" connecting subplot locationevent -> (parent plot locationevent, subplot location) -> parent plot location

4457 09/05/2012 06:09 AM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): NullValueException: Log the caught exception so it's clear that the update is being retried

4456 09/05/2012 06:05 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: %/install: Fixed bug where $(if $(isRef)) needed to be checked before $(if $(nonXml)) because a subdir referencing an already-installed staging table must be treated specially by ignoring its autogenerated header.csv file, and not trying to install that file as if it were itself CSV data

4455 09/05/2012 05:49 AM Aaron Marcuse-Kubitza

my2pg, my2pg.data: Fixed bug where replacement for '0000-00-00' date needed to be wrapped in single quotes

4454 09/05/2012 05:45 AM Aaron Marcuse-Kubitza

input.Makefile: sql/install: Log the installation of a full-DB export to a log file in the main dir

4453 09/05/2012 05:38 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: %/install: Factored out stderr logging into $(logInstall)

4452 09/05/2012 05:35 AM Aaron Marcuse-Kubitza

input.Makefile: Support empty subdirs referencing an already-installed staging table everywhere, by replacing $(isCsv) with new $(nonXml) where needed

4451 09/05/2012 05:22 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/: Switched to using the DB export's staging tables instead of the exported CSVs

4450 09/05/2012 05:08 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Treat empty subdirs as referencing an already-installed staging table, and run cleanup and header export operations on them

4449 09/05/2012 04:48 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: `%/install: %/create.sql`: Factored out cleanup and header export operations for reuse in other types of table subdirs

4448 09/05/2012 04:23 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: `%/install: %/create.sql`: Removed deprecated (but benign) errors_table_only option to csv2db. Run csv2db without a command in order to clean up the created staging table.

4447 09/05/2012 03:57 AM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): Removed no longer used cols param

4446 09/05/2012 03:56 AM Aaron Marcuse-Kubitza

csv2db: When no command is specified, just clean up the specified table

4445 09/05/2012 03:55 AM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): Always clean up all columns in the table

4444 09/05/2012 03:43 AM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): Handle NullValueExceptions (due to setting values to NULL in a NOT NULL column) by dropping the NOT NULL constraint
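
The retry behavior described here might look roughly like the following. This is a self-contained sketch, not the code in sql_io.py: `FakeTable`, its `nullify()` method, and the exception's `col` attribute are hypothetical stand-ins for the real database calls, and the logging mirrors what r4457 later added.

```python
class NullValueException(Exception):
    """Hypothetical stand-in for the exception raised on a NOT NULL violation."""
    def __init__(self, col):
        super().__init__("null value in column %r violates NOT NULL" % col)
        self.col = col

class FakeTable:
    """In-memory stand-in for a staging table (an assumption for this sketch)."""
    def __init__(self, rows, not_null):
        self.rows = rows
        self.not_null = set(not_null)

    def nullify(self, null_strs):
        """Set matching values to None, failing if a NOT NULL column is affected."""
        for row in self.rows:
            for col, value in row.items():
                if value in null_strs and col in self.not_null:
                    raise NullValueException(col)
        for row in self.rows:
            for col, value in list(row.items()):
                if value in null_strs:
                    row[col] = None

def cleanup_table(table, null_strs, log=print):
    """Retry the cleanup, dropping one NOT NULL constraint per failure."""
    while True:
        try:
            table.nullify(null_strs)
            return
        except NullValueException as e:
            log("Caught %s; dropping NOT NULL and retrying" % e)
            table.not_null.discard(e.col)

table = FakeTable([{"name": "", "plot": "A1"}], not_null=["name"])
messages = []
cleanup_table(table, {""}, log=messages.append)
```

Each retry removes exactly one constraint, so the loop ends after at most one pass per NOT NULL column.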

4443 09/05/2012 03:32 AM Aaron Marcuse-Kubitza

sql.py: Added drop_not_null()

4442 09/05/2012 03:29 AM Aaron Marcuse-Kubitza

sql_gen.py: is_text_col(): Also consider character varying to be a text type
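
A type check of this kind could be sketched as below. The helper names `is_text_type` and `text_cols` are hypothetical and work on plain type-name strings rather than on sql_gen column objects, so this only illustrates the idea of treating `character varying` like `text`.

```python
import re

# Type names treated as text-like in this sketch
TEXT_TYPES = frozenset(["text", "character varying"])

def is_text_type(type_name):
    """Return True if a PostgreSQL type name denotes a text-like type."""
    # Drop any length modifier, e.g. "character varying(255)"
    base = re.sub(r"\s*\(.*\)\s*$", "", type_name.strip().lower())
    return base in TEXT_TYPES

def text_cols(col_types):
    """Return the names of the text-like columns, in their original order."""
    return [col for col, type_ in col_types.items() if is_text_type(type_)]

# Hypothetical column-to-type mapping
print(text_cols({"name": "character varying", "count": "integer", "note": "text"}))
```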

4441 09/05/2012 03:07 AM Aaron Marcuse-Kubitza

csv2db: Removed no longer used errors_table_only option

4440 09/05/2012 03:00 AM Aaron Marcuse-Kubitza

README.TXT: Schema changes: Removed step to reinstall errors tables, because they are now created automatically by column-based import

4439 09/05/2012 02:59 AM Aaron Marcuse-Kubitza

csv2db: Removed no longer needed creation of errors table, because it is now created automatically by column-based import

4438 09/05/2012 02:58 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: $(dbExports): Fixed bug where it would be non-empty even when the input contains no DB exports, because += adds extra whitespace. This caused sql/install to be incorrectly included as part of $(allInstalls).

4437 09/05/2012 02:49 AM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Create errors table if it doesn't exist

4436 09/05/2012 02:48 AM Aaron Marcuse-Kubitza

sql_io.py: Added mk_errors_table()

4435 09/05/2012 02:05 AM Aaron Marcuse-Kubitza

inputs/Makefile: Input data: $(rsyncSrcs): Also exclude logs subdirs located at more than one level below the root, which occurs for example when a table subdir is moved into _archive/

4434 09/05/2012 01:56 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: sql/install: Fixed bug where _always was part of $+, causing cat to try to cat this nonexistent file

4433 09/05/2012 01:51 AM Aaron Marcuse-Kubitza

Added inputs/SALVIAS/salvias_plots.schema.sql

4432 09/05/2012 01:50 AM Aaron Marcuse-Kubitza

Added inputs/SALVIAS/_MySQL/

4431 09/05/2012 01:47 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: MySQL exports: Run all non-data-only exports through my2pg, not just schema-only exports. This supports transforming a combined schema+data export.

4430 09/05/2012 01:42 AM Aaron Marcuse-Kubitza

my2pg: Also perform data-only replacements, since default values can contain data-specific replacements. This also allows my2pg to transform a combined schema+data export.

4429 09/05/2012 01:39 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Also translate MySQL data to PostgreSQL

4428 09/05/2012 01:38 AM Aaron Marcuse-Kubitza

Added my2pg.data

4427 09/05/2012 01:28 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Place MySQL exports in separate _MySQL/ subdir so they don't clutter up the main dir, which will contain PostgreSQL translations

4426 09/05/2012 01:03 AM Aaron Marcuse-Kubitza

Added my2pg

4425 09/05/2012 01:02 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: DB exports: Concatenate all exports together, with schemas first, so that any config options which were applied only in the schema export will remain active when the data is imported. Changed `%.pg.sql: %.my.sql` to `%.schema.sql: %.schema.my.sql` so there doesn't need to be a .pg suffix for PostgreSQL schemas and only the schema gets translated.
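
The schemas-first ordering can be sketched in a few lines. The function name `order_exports` is hypothetical, and the `.data.sql` file names are an assumption made for the example; only the `.schema.sql` suffix appears in this log.

```python
def order_exports(paths):
    """Sort export files so that schema exports come before all others."""
    # False sorts before True, so paths ending in .schema.sql are placed first;
    # the path itself breaks ties to keep the order deterministic
    return sorted(paths, key=lambda p: (not p.endswith(".schema.sql"), p))

exports = ["b.data.sql", "a.schema.sql", "a.data.sql"]
ordered = order_exports(exports)
print(ordered)
```

Concatenating the files in this order means any settings made by the schema export are already in effect when the data statements run.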

4424 09/05/2012 12:15 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: $(dbExports): Don't consider MySQL DB exports as part of the DB exports that get installed, because they are not directly installable

4423 09/05/2012 12:13 AM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Added `%.pg.sql: %.my.sql` to translate MySQL DB schemas to PostgreSQL

4422 09/04/2012 09:20 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/_src/: Added salvias_plots.sql.url to provide a link to where salvias_plots.sql was exported from (it was not a raw file given to us by the data provider)

4421 09/04/2012 08:57 PM Aaron Marcuse-Kubitza

Added cc_tty

4420 09/04/2012 08:57 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: `%: %.make`: Don't automatically redirect stderr to a log file, because some .make scripts need to display password prompts, etc. on the TTY and output them to stderr instead of /dev/tty

4419 09/04/2012 08:49 PM Aaron Marcuse-Kubitza

inputs/REMIB/nodes.make: Fixed bin dir path for new subdir layout

4418 09/04/2012 08:48 PM Aaron Marcuse-Kubitza

inputs/SpeciesLink/tapir.make: Write log messages to a log file ($0.log) instead of to stderr, because the verbose log messages should not fill up stderr. To view the progress, you should instead tail the created log file.

4417 09/04/2012 08:41 PM Aaron Marcuse-Kubitza

inputs/REMIB/nodes.make: Updated path to node exports to use new subdir layout (in Specimen subdir, and without .specimens suffix)

4416 09/04/2012 08:38 PM Aaron Marcuse-Kubitza

inputs/REMIB/nodes.make: Fixed lib dir path in sys.path.append() for new subdir layout

4415 09/04/2012 08:37 PM Aaron Marcuse-Kubitza

inputs/REMIB/nodes.make: Write log messages to a log file ($0.log) instead of to sys.stderr, because the verbose log messages should not fill up stderr. To view the progress, you should instead tail the created log file.

4414 09/04/2012 08:23 PM Aaron Marcuse-Kubitza

input.Makefile: Add the bin folder to the PATH so .make scripts can easily use programs in it

4413 09/04/2012 08:06 PM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Support installing a DB export directly into the staging schema, without needing to first export it as CSVs

4412 09/04/2012 07:52 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/: Added _src/ subdir to store original DB export (before re-export in a PostgreSQL-compatible form)

4411 09/04/2012 07:31 PM Aaron Marcuse-Kubitza

input.Makefile: `%: %.make`: Only remake if the target doesn't exist. This prevents unintentional remaking when the make script is newly checked out from svn (which sets the mod time to now) but the output is synced externally.
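
The difference between the two rules can be illustrated outside of make. In this sketch, `stale_by_mtime` approximates make's usual timestamp comparison and `stale_if_missing` approximates the behavior described in this entry; both function names and the temporary file names are hypothetical.

```python
import os
import tempfile

def stale_by_mtime(target, script):
    """Timestamp rule: remake if the target is missing or older than the script."""
    return (not os.path.exists(target)
            or os.path.getmtime(target) < os.path.getmtime(script))

def stale_if_missing(target):
    """Existence rule: remake only if the target is missing."""
    return not os.path.exists(target)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "nodes")
    script = target + ".make"
    for path in (target, script):
        open(path, "w").close()
    os.utime(target, (1000, 1000))  # output synced externally, older timestamp
    os.utime(script, (2000, 2000))  # script freshly checked out, newer timestamp
    by_mtime = stale_by_mtime(target, script)
    if_missing = stale_if_missing(target)

print("timestamp rule remakes:", by_mtime)
print("existence rule remakes:", if_missing)
```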

4410 09/04/2012 07:23 PM Aaron Marcuse-Kubitza

input.Makefile: `%: %.make`: Removed no longer applicable comment, which applied when there were two separate `%: %.make`-related rules

4409 09/04/2012 06:55 PM Aaron Marcuse-Kubitza

input.Makefile: Use $(inDatasrc) wherever its value was used

4408 09/04/2012 06:54 PM Aaron Marcuse-Kubitza

input.Makefile: Added $(inDatasrc)

4407 09/04/2012 06:40 PM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): Only clean up text columns, to support staging tables with other column types

4406 09/04/2012 06:40 PM Aaron Marcuse-Kubitza

sql_gen.py: Added is_text_col()

4405 09/04/2012 06:29 PM Aaron Marcuse-Kubitza

sql_io.py: cleanup_table(): Add table to each column so its type can later be determined from the DB

4404 09/04/2012 06:13 PM Aaron Marcuse-Kubitza

inputs/NY/verify/specimens.ref: Regenerated from specimens.ref.sql. The counts have changed slightly because this is derived directly from the NY CSV file, rather than from the nybg_raw BIEN2 staging table.

4403 09/04/2012 06:11 PM Aaron Marcuse-Kubitza

inputs/NY/verify/specimens.ref.sql: Retrofitted to use PostgreSQL instead of MySQL syntax, since this now runs on the PostgreSQL staging tables

4402 09/04/2012 06:09 PM Aaron Marcuse-Kubitza

input.Makefile: Verification of import: Added `%.ref: %.ref.sql` rule to make datasource's summary statistics from its staging tables. (This was previously run on a MySQL installation of the datasource, and thus limited to MySQL inputs, but we are now able to use the staging tables for this.)

4401 09/04/2012 06:04 PM Aaron Marcuse-Kubitza

input.Makefile: Verification of import: $(verify): Factored psql command with output format settings into separate $(psqlExport) var

4400 09/04/2012 05:57 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: analytical_db_view: Switched join order of location and party (datasource) tables, to facilitate using a nested loop join to fill in the datasource names

4399 09/04/2012 05:55 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: party: Added party_datasource index on just the organizationname to facilitate querying just the datasources

4398 09/04/2012 04:25 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: make_analytical_db(): Removed explicit schema reference so that the function can be redirected to use the current (rotated) schema using the search_path

4397 08/31/2012 08:32 PM Aaron Marcuse-Kubitza

schemas/Makefile: Removed no longer needed analytical_db, which has been replaced by bin/make_analytical_db

4396 08/31/2012 08:31 PM Aaron Marcuse-Kubitza

README.TXT: After a new import: Use bin/make_analytical_db instead of `make schemas/analytical_db`, and run it asynchronously because it takes a long time

4395 08/31/2012 08:29 PM Aaron Marcuse-Kubitza

Added make_analytical_db

4394 08/31/2012 08:22 PM Aaron Marcuse-Kubitza

schemas/Makefile: Analytical DB: analytical_db: Time the creation of the analytical DB

4393 08/31/2012 08:18 PM Aaron Marcuse-Kubitza

README.TXT: After a new import: Added command to make the analytical DB

4392 08/31/2012 08:15 PM Aaron Marcuse-Kubitza

schemas/Makefile: Added analytical_db target

4391 08/31/2012 08:09 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Added make_analytical_db() and helper view analytical_db_view. Note that adding a view which depends on other tables will cause those tables to be reordered in dependency order to appear before the view, causing the svn diff to change completely even though the DB structure has only been added to.

4390 08/31/2012 08:05 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Removed OIDs from tables because we don't use them (tables have primary keys instead)

4389 08/31/2012 02:23 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. This now includes CTFS.TaxonOccurrence (presence-only observations), FIA (11 million rows!), and Madidi.Organism. The addition of FIA almost doubles the # of rows to 26 million and increases the import time from 9.5 to 11.5 hours.

4388 08/30/2012 04:54 PM Aaron Marcuse-Kubitza

sql_io.py: null_strs: Added 'UNKNOWN'

4387 08/30/2012 04:02 PM Aaron Marcuse-Kubitza

Added inputs/FIA/

4386 08/30/2012 12:45 PM Aaron Marcuse-Kubitza

inputs/: Renamed subfolders to VegCSV names, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-to-VegCSV-names>