/ - Repository - BIEN 3 - NCEAS Projects

Name	Size	Revision	Age	Author	Comment
_archive		1598	almost 13 years	Aaron Marcuse-Kubitza	Moved _archive/tapir2flatClient/trunk/client/ t...
analysis		3076	over 12 years	Aaron Marcuse-Kubitza	Added top-level analysis dir for range modeling
bin		3271	over 12 years	Aaron Marcuse-Kubitza	csv2db: verbosity defaults to 3 so that detaile...
config		272	about 13 years	Aaron Marcuse-Kubitza	Moved bien_password to new config dir
inputs		3282	over 12 years	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Fixed date for most re...
lib		3287	over 12 years	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Save default values for...
mappings		3229	over 12 years	Aaron Marcuse-Kubitza	mappings/VegX-VegBIEN.stems.csv: Sort the plant...
schemas		3284	over 12 years	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination: Fixed ...
to_do		2547	almost 13 years	Aaron Marcuse-Kubitza	to_do/timeline.doc: Updated to reflect the mont...
Makefile	10.5 KB	3249	over 12 years	Aaron Marcuse-Kubitza	root Makefile: VegBIEN DB: Schemas: Added schem...
README.TXT	2.96 KB	3205	over 12 years	Aaron Marcuse-Kubitza	README.TXT: Data import: Import data into VegBI...
map	1.21 KB	3140	over 12 years	Aaron Marcuse-Kubitza	top-level map: Added support for custom public ...

Latest revisions

#	Date	Author	Comment
3287	07/10/2012 04:36 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): Save default values for all rows in new temp table full_in_table since in_table may have rows deleted
3286	07/10/2012 04:13 PM	Aaron Marcuse-Kubitza	sql.py: Added mk_delete() and delete()
3285	07/10/2012 03:36 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): mk_main_select(): Turned off unnecessary ORDER BY to avoid sorting the entire table every time it's used. (PostgreSQL has no concept of reordering a table and re-using that ordering, so it just re-sorts the table each time. Index scans on the pkey do not appear to be used in practice, according to EXPLAIN results from live imports.) Document that we instead assume that identical SELECT queries retrieve rows in the same order.
3284	07/10/2012 01:56 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination: Fixed bug where taxondetermination_taxonoccurrence_id_fkey trigger was applied before the NOT NULL constraint on taxonoccurrence_id was checked, causing the trigger to fail on NULL taxonoccurrence_ids, by making it an AFTER trigger. (An AFTER trigger will still roll back the entire insert if it fails, even though it runs after the insert itself.)
3283	07/09/2012 05:45 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: specimenreplicate: institution_id: Fixed typo in comment
3282	07/09/2012 05:26 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Fixed date for most recent import
3281	07/09/2012 05:26 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.run_query(): Put the data source comment on a separate line in the log file instead of using a carriage return, which sometimes had the desired effect of overwriting the src comment with the first line of the query but sometimes the line lengths weren't right and there wasn't enough overlap
3280	07/09/2012 04:53 PM	Aaron Marcuse-Kubitza	schemas/vegbien.ERD.mwb: Synced with schema
3279	07/09/2012 04:42 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Removed per-column indexes, which are no longer needed by either row-based or column-based import because they are able to do a merge join or lookup using the table's UNIQUE INDEX. Instead of forcing the database to build and maintain large indexes (15+ GB!) that are not used, optimization-only (non-UNIQUE) indexes should be added as needed only once the database is actually used for queries. In most cases it will not even be necessary to add additional indexes then, because most UNIQUE indexes can be reused for broad lookups (rather than just duplicate elimination). Even the foreign key covering indexes (fki_*) are not needed because we virtually never delete rows in the DB, and even if we were to start doing that regularly, the cost of maintaining the indexes on import is most likely not worth the speed improvements for cascading deletes.
3278	07/09/2012 04:32 PM	Aaron Marcuse-Kubitza	schemas/py_functions.sql: Removed per-column indexes on relational functions, which are no longer needed by row-based import because it is able to do a merge join-style lookup using the table's UNIQUE INDEX. (Note that column-based import doesn't use the (slower) relational functions at all anymore, and instead calls the corresponding SQL function directly using named arguments.)
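
The log-format problem described in r3281 (a carriage return only hides the data source comment when the following query line is at least as long) can be illustrated with a small simulation. This is a minimal sketch, not the project's sql.py code: the function names `render_terminal_line`, `log_with_cr`, and `log_with_newline` are hypothetical, as are the sample comment and query strings.

```python
def render_terminal_line(raw):
    """Simulate how a terminal displays text containing carriage returns.

    Each "\r" moves the cursor back to column 0. Later characters overwrite
    earlier ones, but anything beyond the end of the new text stays visible.
    """
    screen = []
    col = 0
    for ch in raw:
        if ch == "\r":
            col = 0  # return to the start of the line without erasing it
        elif col < len(screen):
            screen[col] = ch  # overwrite a character that is already shown
            col += 1
        else:
            screen.append(ch)  # extend the line
            col += 1
    return "".join(screen)


def log_with_cr(src_comment, query):
    """Old style (assumed): comment, carriage return, then the query."""
    return src_comment + "\r" + query


def log_with_newline(src_comment, query):
    """New style (assumed): comment and query on separate lines."""
    return src_comment + "\n" + query


if __name__ == "__main__":
    src = "/* hypothetical source comment */"  # illustrative only
    short_query = "SELECT 1"
    long_query = "SELECT * FROM some_hypothetical_table WHERE id = 1"

    # Short query: the tail of the comment remains visible after the query.
    print(render_terminal_line(log_with_cr(src, short_query)))
    # Long query: the comment is fully overwritten.
    print(render_terminal_line(log_with_cr(src, long_query)))
    # Separate lines: both pieces are always fully readable.
    print(log_with_newline(src, short_query))
```

Putting the comment on its own line avoids any dependence on the relative lengths of the two strings, which matches the reasoning given in the commit message.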

root @ 3287
