Activity

From 11/07/2012 to 12/06/2012

12/06/2012

11:18 PM Revision 6673: dict2redmine: Generate an outline instead of a table so each term will be indexed in the page's table of contents
Aaron Marcuse-Kubitza
11:13 PM Revision 6672: schemas/vegbien.sql: coordinates: coordinates_unique: Removed md5() around verbatimcoordinates because functions within unique indexes (other than the standard COALESCE()) are not yet supported by the import algorithm
Aaron Marcuse-Kubitza
11:10 PM Revision 6671: exc.py: e_msg(): Emit a warning instead of an AssertionError if e.args[0] isn't a string, to assist in debugging malformed exceptions
Aaron Marcuse-Kubitza
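
The behavior described in revision 6671 can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual exc.py code: the function body, the warning category, and the str() fallback are all assumptions made for illustration.

```python
import warnings


def e_msg(e):
    """Return the message of exception e as a stripped string.

    Hypothetical sketch: a non-string first argument triggers a warning
    (rather than an AssertionError) and is then coerced with str().
    """
    if not e.args:
        return ''
    msg = e.args[0]
    if not isinstance(msg, str):
        # Assumed behavior: report the malformed exception but keep going
        warnings.warn('e.args[0] is not a string: %r' % (msg,), UserWarning)
        msg = str(msg)
    return msg.strip()
```

Warning instead of asserting lets the caller still see the original exception, which is what makes malformed exceptions debuggable.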
11:02 PM Revision 6670: mappings/VegCore.csv: sampleType: Re-sourced to bien_web.observationType
Aaron Marcuse-Kubitza
10:34 PM Revision 6669: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the taxonomicname in accepted_taxonlabel instead of accepted_taxonverbatim, because taxonverbatim only contains fields provided by the data provider (in this case, TNRS), but TNRS does not provide the taxonomic name (taxon name+author), only the taxon name and author components separately
Aaron Marcuse-Kubitza
10:09 PM Revision 6668: schemas/vegbien.sql: coordinates: coordinates_unique: Use md5() on verbatimcoordinates so that it doesn't cause the index row size to be exceeded. This should fix a bug in the HIBG import where long verbatimcoordinates values were causing the error 'OperationalError: index row size 2784 exceeds maximum 2712 for index "coordinates_unique"'.
Aaron Marcuse-Kubitza
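
To illustrate why hashing keeps the index row small, here is a minimal Python sketch of the same idea. It assumes only that the value is text; the function name coordinates_key is hypothetical, and the real change is in the SQL index definition, not in Python.

```python
import hashlib


def coordinates_key(verbatimcoordinates):
    """Return a fixed-length stand-in (32 hex characters) for a text value.

    However long the input is, the digest has the same size, so an index
    built on it cannot exceed a row size limit because of long values.
    """
    return hashlib.md5(verbatimcoordinates.encode('utf-8')).hexdigest()
```

Equal inputs always produce equal digests, so uniqueness on the digest behaves like uniqueness on the original value, apart from the (very unlikely) case of a hash collision.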
09:56 PM Revision 6667: backups/Makefile: Synchronization: Replaced download target, which downloads all backups, with %/download, which downloads just a specific backup, because you would generally only want to extract a single backup from the archive for reinstallation
Aaron Marcuse-Kubitza
09:47 PM Revision 6666: backups/Makefile: Synchronization: Sync with jupiter instead of vegbiendev. This requires running `make backups/upload` on vegbiendev to archive the files, instead of `make backups/download` to download them to your local machine.
Aaron Marcuse-Kubitza
08:58 PM Revision 6665: inputs/.geoscrub/geoscrub_output/map.csv: Removed no longer accurate comment that county is not yet used by VegBIEN
Aaron Marcuse-Kubitza
08:56 PM Revision 6664: inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 2 ("Point is <=5km from putative GADM polygon, but still outside it") to true instead of false, because 5km is close enough to the polygon that the mismatch could result from shapefile simplifying, boundary changes, or other factors that don't affect geovalidity
Aaron Marcuse-Kubitza
08:52 PM Revision 6663: inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 0 ("Complete name provided, but couldn't be scrubbed to GADM") to NULL instead of false, because the absence of a name match does not mean the coordinates are invalid
Aaron Marcuse-Kubitza
08:51 PM Revision 6662: inputs/.{NCBI,TNRS}/import_order.txt: Added Source
Aaron Marcuse-Kubitza
08:50 PM Revision 6661: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
08:44 PM Revision 6660: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
08:43 PM Revision 6659: inputs/input.Makefile: SVN: add: verify/: Added *.xls to svn:ignore
Aaron Marcuse-Kubitza
08:33 PM Revision 6658: inputs/.geoscrub/geoscrub_output/postprocess.sql: Added index on decimallatitude, decimallongitude
Aaron Marcuse-Kubitza
08:30 PM Revision 6657: Added inputs/.geoscrub/geoscrub_output/postprocess.sql, which adds NOT NULL constraints on decimallatitude, decimallongitude
Aaron Marcuse-Kubitza
08:21 PM Task #474 (Rejected): use svn to figure out when a map file has changed and needs to be cleaned up
This would prevent forcing a cleanup by touching the map file
Aaron Marcuse-Kubitza
08:16 PM Task #309 (Rejected): mapping and export utility from VegBank to VegX
We are importing VegBank directly into VegBIEN rather than via VegX
Aaron Marcuse-Kubitza
08:13 PM Task #299 (Resolved): Mapping from NVS to VegX and VegBIEN
Done for spreadsheet extract provided by Nick Spencer, on vegbiendev in @/home/bien/svn/inputs/NVS/_src/@. We aren't ...
Aaron Marcuse-Kubitza
08:13 PM Task #477 (Rejected): allow putting specimens data directly in the top level of the datasource directory
This would prevent adding a separate metadata table later on, so we don't want to allow this
Aaron Marcuse-Kubitza
08:11 PM Task #544 (New): integrate creation of analytical DB into automated testing
Aaron Marcuse-Kubitza
06:55 PM Revision 6656: schemas/vegbien.sql: analytical_*: Changed type of boolean columns to integer so that they will be exported as 1/0 instead of t/f by export_analytical_db. This will enable MySQL's LOAD DATA INFILE to import the values correctly.
Aaron Marcuse-Kubitza
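
As an illustration of the export conventions in this log (booleans written as 1/0 here, and NULL written as the literal text NULL in revision 6492), a minimal Python sketch might look like the following. The function names are hypothetical, and the actual export_analytical_db script may work quite differently, for example by letting the database produce the CSV directly.

```python
import csv
import io


def export_value(value):
    """Map a Python value to the text written in the CSV export."""
    if value is None:
        return 'NULL'  # literal NULL marker rather than \N
    if isinstance(value, bool):
        return '1' if value else '0'  # 1/0 instead of t/f
    return str(value)


def export_rows(rows):
    """Return the rows serialized as CSV text, one line per row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in rows:
        writer.writerow([export_value(v) for v in row])
    return out.getvalue()
```

For example, export_rows([(True, None, 'NY')]) yields the single line 1,NULL,NY.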
06:07 PM Revision 6655: backups/Makefile: Checksums: %.md5/test: Only use md5sum's -v option on Mac, because it's not supported on Linux (there, verbose mode is the default)
Aaron Marcuse-Kubitza
05:57 PM Revision 6654: mappings/VegCore.csv: cultivated* source: Added picklist value to URL
Aaron Marcuse-Kubitza
05:46 PM Revision 6653: README.TXT: Data import: On nimoy: Creating analytical_aggregate table: publish_analytical_db: Rewrapped line
Aaron Marcuse-Kubitza
05:45 PM Revision 6652: README.TXT: Data import: On nimoy: Creating analytical_aggregate table: Changed name to analytical_aggregate_r<revision> to allow storing different versions simultaneously
Aaron Marcuse-Kubitza
05:26 PM Revision 6651: publish_analytical_db: Require caller to specify the name of the table to load data into. This allows appending a revision to analytical_aggregate, or publishing a table other than analytical_aggregate.
Aaron Marcuse-Kubitza
05:24 PM Revision 6650: publish_analytical_db: Require caller to specify the name of the table to load data into. This allows appending a revision to analytical_aggregate, or publishing a table other than analytical_aggregate.
Aaron Marcuse-Kubitza
05:23 PM Revision 6649: inputs/input.Makefile: SVN: add: verify/: Added *.xls to svn:ignore
Aaron Marcuse-Kubitza
04:33 PM Revision 6648: backups/Makefile: SQL: Full DB: vegbien.%.backup: Also generate MD5 sum
Aaron Marcuse-Kubitza
04:18 PM Revision 6647: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

12/05/2012

10:57 AM Revision 6646: README.TXT: Data import: Delete previous imports based on the full DB backup file
Aaron Marcuse-Kubitza
10:56 AM Revision 6645: backups/Makefile: Support removing public schema versions based on the version of a full DB backup
Aaron Marcuse-Kubitza
10:52 AM Revision 6644: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the additional dict namespace for the SALVIAS sources. This removes the extra "dict:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:49 AM Revision 6643: mappings/VegCore.csv, Veg+-VegCore.csv: Added TNRS provider namespace, inserting it before BIEN in the sort order
Aaron Marcuse-Kubitza
10:43 AM Revision 6642: mappings/VegCore.csv: Changed + to _ in URL fragments
Aaron Marcuse-Kubitza
10:41 AM Revision 6641: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the additional BIEN namespace for the BIEN sources, and use just BIEN2 and VegBIEN as the sub-namespaces. This removes the extra "BIEN:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:37 AM Revision 6640: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the "terms" text in the current DwC terms' provider, leaving just the sort order. This removes the extra "terms:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:33 AM Revision 6639: dict2redmine: url_term(): Remove empty URL comments
Aaron Marcuse-Kubitza
10:32 AM Revision 6638: dict2redmine: url_comment_text(): Interpret a URL comment containing just a number as a sort order without text
Aaron Marcuse-Kubitza
10:29 AM Revision 6637: dict2redmine: url_term(): Prefix any provider in the URL to the term name, to create a namespace. Each hierarchical component of the provider is stored in a URL comment.
Aaron Marcuse-Kubitza
10:27 AM Revision 6636: dict2redmine: Added url_comment_re
Aaron Marcuse-Kubitza
10:27 AM Revision 6635: dict2redmine: Added url_comment_text()
Aaron Marcuse-Kubitza
10:26 AM Revision 6634: dict2redmine: Call simplify_url() just on the first source so that source2redmine_url() can use the raw URL (to extract comments, etc.)
Aaron Marcuse-Kubitza
09:09 AM Revision 6633: dict2redmine: Removed no longer used explicit Definition column #
Aaron Marcuse-Kubitza
09:06 AM Revision 6632: dict2redmine: Use the input spreadsheet's column names and order, and pass through columns other than the term and sources columns
Aaron Marcuse-Kubitza
09:05 AM Revision 6631: mappings/VegCore.csv, Veg+-VegCore.csv: Renamed Comments to Definition to match Redmine table
Aaron Marcuse-Kubitza
09:04 AM Revision 6630: mappings/VegCore.csv, Veg+-VegCore.csv: Reversed order of Comments, Sources columns to match Redmine table order
Aaron Marcuse-Kubitza
08:58 AM Revision 6629: mappings/VegCore.csv, Veg+-VegCore.csv: Reversed order of Comments, Sources columns to match Redmine table order
Aaron Marcuse-Kubitza
08:56 AM Revision 6628: dict2redmine: Store term_str in a var before using it, like sources_str
Aaron Marcuse-Kubitza
08:43 AM Revision 6627: dict2redmine: Added Definition column
Aaron Marcuse-Kubitza
08:32 AM Revision 6626: dict2redmine: Take term and sources col #s as args instead of hardcoding them by column name or position
Aaron Marcuse-Kubitza
08:25 AM Revision 6625: dict2redmine: url_term(): Also match any namespace that's part of the term
Aaron Marcuse-Kubitza
08:21 AM Revision 6624: dict2redmine: Sources: Use source2redmine_url() to extract the term from each source URL
Aaron Marcuse-Kubitza
08:20 AM Revision 6623: dict2redmine: source2redmine_url(): Support empty URLs
Aaron Marcuse-Kubitza
08:15 AM Revision 6622: dict2redmine: url_term(): Fixed bug where need to use match.group() instead of match.groups()
Aaron Marcuse-Kubitza
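
The distinction behind this fix is that, in Python's re module, match.groups() returns a tuple of all captured groups, while match.group() returns a single string. A minimal sketch, assuming a hypothetical pattern that takes the term from the URL fragment (the real pattern and group index in dict2redmine are not shown in this log):

```python
import re

# Hypothetical pattern: capture everything after '#' in a source URL
term_re = re.compile(r'#(.*)$')


def url_term(url):
    """Return the URL fragment as the term name, or '' if there is none."""
    match = term_re.search(url)
    if match is None:
        return ''
    # match.groups() would return a tuple of every captured group;
    # match.group(1) returns the first captured group as a string
    return match.group(1)
```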
08:02 AM Revision 6621: mappings/Makefile: Create VegCore.redmine from VegCore.csv
Aaron Marcuse-Kubitza
08:01 AM Revision 6620: Added dict2redmine
Aaron Marcuse-Kubitza
07:26 AM Revision 6619: mappings/VegCore.csv, Veg+-VegCore.csv: Renamed Source column to Sources because it can contain multiple sources
Aaron Marcuse-Kubitza
07:12 AM Revision 6618: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC terms: Scoped sort order by category, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Scope-DwC-sort-order-by-category>
Aaron Marcuse-Kubitza
06:35 AM Revision 6617: mappings/VegCore.csv, Veg+-VegCore.csv: Source: VegX terms: Split combined field group/field sort order into separate sort orders for field and field group
Aaron Marcuse-Kubitza
06:22 AM Revision 6616: mappings/VegCore.csv, Veg+-VegCore.csv: Source: VegX terms: Added top-level table sort order
Aaron Marcuse-Kubitza
06:07 AM Revision 6615: mappings/VegCore.csv: taxonName: Reordered sources so it would sort with *TaxonName and scientificName
Aaron Marcuse-Kubitza
06:04 AM Revision 6614: mappings/VegCore.csv: Source: DwC Taxon: Added sort order so it would sort together with its fields
Aaron Marcuse-Kubitza
05:58 AM Revision 6613: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC occurrenceID: Corrected sort order to 019 instead of 000
Aaron Marcuse-Kubitza
05:55 AM Revision 6612: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC terms: Added category, with category sort order, as URL comment. This will allow terms to be sorted just within their category rather than globally for DwC.
Aaron Marcuse-Kubitza
05:49 AM Revision 6611: mappings/Veg+-VegCore.csv: Source: DwC: dcterms: Added back "dcterms:" prefix to URL fragment
Aaron Marcuse-Kubitza
05:31 AM Revision 6610: mappings/VegCore.csv: Source: TNRS terms: Added sort order to web page fragment (simple_download, detailed_download)
Aaron Marcuse-Kubitza
05:25 AM Revision 6609: mappings/VegCore.csv, Veg+-VegCore.csv: Removed no longer used Order within table column. Instead, embed the sort order in the URL using a () comment.
Aaron Marcuse-Kubitza
05:23 AM Revision 6608: mappings/VegCore.csv, Veg+-VegCore.csv: Merged the Order within table column with the Source URL, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Merging-the-Order-within-table-column-with-the-Source-URL>. Sorting on the Source column now groups related terms together according to their sort order in the source they came from.
Aaron Marcuse-Kubitza
05:11 AM Revision 6607: mappings/Veg+-VegCore.csv: Order within table: Filled in missing sort orders
Aaron Marcuse-Kubitza
04:51 AM Revision 6606: mappings/VegCore.csv, Veg+-VegCore.csv: Source: Web pages: Use / instead of . to separate nested elements of URL fragment. Use _ instead of + to represent space.
Aaron Marcuse-Kubitza
04:19 AM Revision 6605: mappings/VegCore.csv: Order within table: Filled in missing sort orders
Aaron Marcuse-Kubitza
03:58 AM Revision 6604: mappings/VegCore.csv: Source: Removed trailing whitespace
Aaron Marcuse-Kubitza
03:43 AM Revision 6603: mappings/VegCore.csv: Order within table: Fixed to include one entry for every URL, including when the Order field is empty and there are multiple URLs
Aaron Marcuse-Kubitza
03:33 AM Revision 6602: mappings/VegCore.csv: Order within table: Fixed to include one entry for every URL
Aaron Marcuse-Kubitza
02:03 AM Revision 6601: mappings/VegCore.csv: Source: "dcterms:" terms: Fixed URL fragments to use : instead of # after dcterms
Aaron Marcuse-Kubitza
01:42 AM Revision 6600: mappings/VegCore.csv, Veg+-VegCore.csv: Sources: BIEN2: Moved DB sort order right before the DB name in the URL to avoid duplicating the DB name in the comment
Aaron Marcuse-Kubitza
01:35 AM Revision 6599: mappings/VegCore.csv, Veg+-VegCore.csv: Sources: Added sort order comments to URLs so they sort in the order indicated at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Sources>. URL comments are enclosed in (), and the sort order element of a comment is a number right after the ( .
Aaron Marcuse-Kubitza
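
The comment convention described in revision 6599 (a comment in parentheses, with an optional number right after the opening parenthesis as the sort order) could be parsed along these lines. This is a sketch under assumptions: the regular expression, the function name, and the example URL in the note below are hypothetical and are not taken from dict2redmine or the mapping files.

```python
import re

# Assumed comment syntax: "(" + optional digits + optional text + ")"
comment_re = re.compile(r'\((\d*)\s*([^()]*)\)')


def url_comments(url):
    """Return a list of (sort_order, text) pairs, one per () comment in url.

    sort_order stays a string so leading zeros (e.g. '019') are preserved;
    it is '' when the comment does not start with a number.
    """
    return [(m.group(1), m.group(2).strip())
            for m in comment_re.finditer(url)]
```

With a made-up URL such as http://example.org/(2 BIEN)/(019)#occurrenceID, this returns [('2', 'BIEN'), ('019', '')]: the second comment contains just a number, so it carries a sort order and no text.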
12:37 AM Revision 6598: mappings/Makefile: .Veg+-VegCore.csv.last_cleanup: Sort by the source URL instead of the VegCore term
Aaron Marcuse-Kubitza
12:35 AM Revision 6597: mappings/Makefile: Split .Veg+-VegCore.csv.last_cleanup and .VegX-VegCore.csv.last_cleanup into separate targets so their recipes can be different
Aaron Marcuse-Kubitza
12:17 AM Revision 6596: mappings/VegCore-VegBIEN.csv: Mapped dcterms:rights
Aaron Marcuse-Kubitza

12/04/2012

11:52 PM Revision 6595: backups/Makefile: Synchronization: Also sync *.md5
Aaron Marcuse-Kubitza
09:52 PM Revision 6594: import_all: Fixed bug where need to wait for *all* asynchronous commands started before the main import, not just the first
Aaron Marcuse-Kubitza
09:51 PM Revision 6593: import_all: Import all Source tables before the herbaria list, so that any custom metadata will override the info in the herbaria list
Aaron Marcuse-Kubitza
09:43 PM Revision 6592: input.Makefile: Tables discovery: $(dontImport): Don't import the Source table when $import_source env var is set to ""
Aaron Marcuse-Kubitza
09:33 PM Revision 6591: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
09:22 PM Revision 6590: Added inputs/VASCAN/Source/
Aaron Marcuse-Kubitza
09:18 PM Revision 6589: csvs.py: stream_info(): Use the Excel dialect and an empty header if the CSV file is empty
Aaron Marcuse-Kubitza
08:29 PM Revision 6588: pg_dump_limit: Also remove CREATE DATABASE statements
Aaron Marcuse-Kubitza
08:09 PM Revision 6587: Added inputs/JBM/Source/
Aaron Marcuse-Kubitza
08:07 PM Revision 6586: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
08:06 PM Revision 6585: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
08:03 PM Revision 6584: Added inputs/NVS/Source/
Aaron Marcuse-Kubitza
08:02 PM Revision 6583: Added inputs/IUCN/European_Red_List_Plants/header.csv
Aaron Marcuse-Kubitza
08:02 PM Revision 6582: Added inputs/CVS/_src/
Aaron Marcuse-Kubitza
08:01 PM Revision 6581: input.Makefile: SVN: $(svnFilesGlob): Include test.xml.ref instead of all test*.xml* to avoid including test outputs
Aaron Marcuse-Kubitza
07:57 PM Revision 6580: inputs/*/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:55 PM Revision 6579: mappings/VegCore-VegBIEN.csv: Mapped verbatimCoordinates
Aaron Marcuse-Kubitza
07:54 PM Revision 6578: Updated inputs/HIBG/Specimen/new_terms.csv
Aaron Marcuse-Kubitza
07:50 PM Revision 6577: Added inputs/HIBG/Source/
Aaron Marcuse-Kubitza
07:49 PM Revision 6576: inputs/HIBG/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:47 PM Revision 6575: Added inputs/NCU-NCSC/Source/
Aaron Marcuse-Kubitza
07:47 PM Revision 6574: inputs/NCU-NCSC/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:07 PM Revision 6573: backups/Makefile: Checksums: %.md5/test: Made it an _always target
Aaron Marcuse-Kubitza
07:05 PM Revision 6572: backups/Makefile: Checksums: Added %.md5/test to test generated checksums
Aaron Marcuse-Kubitza
07:01 PM Revision 6571: backups/Makefile: Moved md5-related targets to separate Checksums section
Aaron Marcuse-Kubitza
06:59 PM Revision 6570: backups/Makefile: %.md5: Removed not applicable comment which had been copied from %.sql
Aaron Marcuse-Kubitza

12/03/2012

07:37 PM Revision 6569: inputs/VegBank/plot_/map.csv: Mapped confidentialitystatus to coordinateUncertaintyInMeters, overriding locationaccuracy when the confidentialitystatus indicates fuzzing
Aaron Marcuse-Kubitza
07:24 PM Revision 6568: inputs/NVS/*/map.csv: Mapped Taxon Growth Form values to growthform enum
Aaron Marcuse-Kubitza
07:08 PM Revision 6567: Added inputs/NVS/Source/
Aaron Marcuse-Kubitza
07:05 PM Revision 6566: inputs/NVS/import_order.txt: Specified import order
Aaron Marcuse-Kubitza
07:02 PM Revision 6565: inputs/input.Makefile: Staging tables installation: $(allInstalls): Exclude the Source table, which contains only (metadata) mappings, not data
Aaron Marcuse-Kubitza
06:54 PM Revision 6564: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
06:46 PM Revision 6563: Added inputs/NVS/TaxonOccurrence.Understory/
Aaron Marcuse-Kubitza
06:38 PM Revision 6562: Added inputs/NVS/Coordinates/
Aaron Marcuse-Kubitza
06:38 PM Revision 6561: inputs/NVS/Plot/map.csv: Mapped Plot
Aaron Marcuse-Kubitza
06:33 PM Revision 6560: mappings/VegCore-VegBIEN.csv: Mapped verbatimCoordinates
Aaron Marcuse-Kubitza
06:27 PM Revision 6559: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
06:07 PM Revision 6558: inputs/NVS/: Renamed Organism to AggregateOccurrence because this actually contains aggregated samplings
Aaron Marcuse-Kubitza
06:04 PM Revision 6557: inputs/NVS/StemObservation/map.csv: Mapped Item ID, Item Obs ID
Aaron Marcuse-Kubitza
05:58 PM Revision 6556: Added inputs/NVS/StemObservation/
Aaron Marcuse-Kubitza
05:52 PM Revision 6555: Added inputs/NVS/TaxonOccurrence/
Aaron Marcuse-Kubitza
05:51 PM Revision 6554: Added inputs/NVS/map.csv (global mappings)
Aaron Marcuse-Kubitza
05:47 PM Revision 6553: inputs/NVS/Project/map.csv: Remapped Project Abbreviation to projectName instead of Name, because Project Abbreviation is what's used throughout the tables to link to the project
Aaron Marcuse-Kubitza
05:34 PM Revision 6552: inputs/NVS/Plot/map.csv: Mapped Physiography
Aaron Marcuse-Kubitza
05:31 PM Revision 6551: inputs/NVS/Plot/map.csv: Mapped Area
Aaron Marcuse-Kubitza
05:29 PM Revision 6550: inputs/NVS/Plot/map.csv: Altitude: Provided rationale for units determination
Aaron Marcuse-Kubitza
05:25 PM Revision 6549: Added inputs/NVS/Organism/
Aaron Marcuse-Kubitza
05:06 PM Revision 6548: Added inputs/NVS/Project/
Aaron Marcuse-Kubitza
05:05 PM Revision 6547: mappings/VegCore-VegBIEN.csv: Mapped projectStartDate, projectEndDate
Aaron Marcuse-Kubitza
05:02 PM Revision 6546: mappings/VegCore.csv: Added projectStartDate, projectEndDate
Aaron Marcuse-Kubitza
05:02 PM Revision 6545: mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
Aaron Marcuse-Kubitza
04:37 PM Revision 6544: root Makefile: apt-get: Use --yes to allow unattended installations
Aaron Marcuse-Kubitza
03:58 PM Revision 6543: schemas/vegbien.sql: analytical_*: Renamed plotName to locationName to match the new VegCore term name
Aaron Marcuse-Kubitza
03:51 PM Revision 6542: mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
Aaron Marcuse-Kubitza
03:30 PM Revision 6541: mappings/VegCore.csv: Added subInstitutionCode
Aaron Marcuse-Kubitza
03:25 PM Revision 6540: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use project.sourceaccessioncode instead of locationevent.project_id for the projectID
Aaron Marcuse-Kubitza
03:21 PM Revision 6539: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's obsstartdate instead when the subevent does not provide it
Aaron Marcuse-Kubitza
03:19 PM Revision 6538: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's project and method instead when the subevent does not provide them, because they are often attached to it instead
Aaron Marcuse-Kubitza
03:07 PM Revision 6537: schemas/vegbien.sql: analytical_stem_view: geolocation info: Fixed bug where need to use the parent location instead when provided, because lat/long and placenames are attached to it instead of the subplot's location
Aaron Marcuse-Kubitza
02:47 PM Revision 6536: backups/Makefile: %.md5: Fixed bug where md5sum does not have a -q option like md5
Aaron Marcuse-Kubitza
02:43 PM Revision 6535: backups/Makefile: %.md5: Fixed bug where need to use md5sum instead of md5 on Linux
Aaron Marcuse-Kubitza
02:39 PM Revision 6534: schemas/vegbien.sql: analytical_stem_view: Filter out non-current taxondeterminations (occurrences with no taxondetermination are preserved)
Aaron Marcuse-Kubitza
02:10 PM Revision 6533: schemas/vegbien.sql: Removed no longer needed darwin_core table. Use analytical_stem instead, which is now identical.
Aaron Marcuse-Kubitza
02:02 PM Revision 6532: schemas/vegbien.sql: sync_analytical_*_to_view(): Creating analytical_* table: Fixed bug where need LIMIT 0 so that it can be used on a full DB, which will have data in the tables used by analytical_stem_view
Aaron Marcuse-Kubitza
01:40 PM Revision 6531: schemas/vegbien.sql: Merged darwin_core into analytical_stem
Aaron Marcuse-Kubitza
01:21 PM Revision 6530: schemas/vegbien.sql: darwin_core_view, analytical_stem_view: Updated now that newWorldCountries.isoCode is a text field
Aaron Marcuse-Kubitza
12:35 PM Revision 6529: README.TXT: Data import: backups: Step to copy backups to jupiter: Added full path to aaronmk/ (/data/dev/aaronmk)
Aaron Marcuse-Kubitza
12:10 PM Task #539 (Rejected): get analytical_stem_view to use merge joins instead of hash joins
* This will speed up the joins, which used to take 0.5 hour but now take 2.5 hours
* It will also reduce the disk sp...
Aaron Marcuse-Kubitza
12:00 PM Revision 6528: inputs/newWorld/geoscrub.schema.~.changes.sql: Reversed order of adding unique constraints and changing types
Aaron Marcuse-Kubitza
11:57 AM Revision 6527: inputs/newWorld/geoscrub.schema.~.changes.sql: Changed isoCode type to text. Added unique constraint on isoCode.
Aaron Marcuse-Kubitza
11:06 AM Revision 6526: backups/Makefile: Added md5s target to generate .md5 files for all backups
Aaron Marcuse-Kubitza
11:05 AM Revision 6525: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
10:48 AM Revision 6524: backups/Makefile: %.md5: Run with `nice -n +5` to avoid slowing down the UI
Aaron Marcuse-Kubitza
10:46 AM Revision 6523: backups/: svn:ignore: Added *.md5. Removed no longer applicable *.log.
Aaron Marcuse-Kubitza
10:42 AM Revision 6522: backups/Makefile: Changed paths to be relative to the Makefile rather than the current directory, so this Makefile can be used in other directories as well (such as jupiter:/aaronmk/VegBIEN.backups/)
Aaron Marcuse-Kubitza
10:34 AM Revision 6521: backups/Makefile: %.backup: Also create MD5 of backup
Aaron Marcuse-Kubitza
10:31 AM Revision 6520: backups/Makefile: Added %.md5 target to create checksums of each backup
Aaron Marcuse-Kubitza
10:17 AM Revision 6519: README.TXT: Data import: backups: Added step to copy backups to jupiter in /aaronmk/VegBIEN.backups/ . The jupiter folder, which has several TB of space available, will replace local backup drives as the location for archived backups.
Aaron Marcuse-Kubitza
10:00 AM Revision 6518: README.TXT: Data import: Removed additional backup of just the public schema, which is not needed because the public schema is included in the full DB backup. The additional public schema backup increased the total backup size by 60-70%, so this will help conserve limited disk space on vegbiendev as well as on local archives of the backups.
Aaron Marcuse-Kubitza
09:52 AM Revision 6517: README.TXT: Backups: Full DB: Updated steps to match Data import steps, which add the date to the backup filename when it's created rather than afterwards
Aaron Marcuse-Kubitza
09:42 AM Revision 6516: README.TXT: Backups: Archived imports: Back up: Added instructions for archiving the last import before backing it up
Aaron Marcuse-Kubitza
09:10 AM Revision 6515: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
09:08 AM Revision 6514: schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
Aaron Marcuse-Kubitza
09:07 AM Revision 6513: schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
Aaron Marcuse-Kubitza
09:00 AM Revision 6512: schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
Aaron Marcuse-Kubitza
08:55 AM Revision 6511: schemas/vegbien.sql: sync_analytical_*_to_view(): Added NOT NULL constraints
Aaron Marcuse-Kubitza

11/30/2012

05:20 PM Revision 6510: make_analytical_db: Added step to create darwin_core materialized view
Aaron Marcuse-Kubitza
05:09 PM Revision 6509: inputs/*/Source/map.csv for non-herbaria: Mapped sampleType
Aaron Marcuse-Kubitza
05:02 PM Revision 6508: inputs/.herbaria/herbaria/map.csv: Set sampleType to "specimen"
Aaron Marcuse-Kubitza
05:02 PM Revision 6507: mappings/VegCore-VegBIEN.csv: Mapped sampleType
Aaron Marcuse-Kubitza
05:00 PM Revision 6506: mappings/VegCore.csv: Added sampleType
Aaron Marcuse-Kubitza
04:57 PM Revision 6505: schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
Aaron Marcuse-Kubitza
04:55 PM Revision 6504: schemas/vegbien.sql: Added sampletype enum
Aaron Marcuse-Kubitza
04:46 PM Revision 6503: root Makefile: $(postgresReload-*): Confirm the operation before continuing, since it involves changing PostgreSQL config files in nontrivial ways. Added instructions for setting kernel.shmmax to at least 4GB minus 1 byte on Linux, to work with the shared_buffers setting in postgresql.conf.
Aaron Marcuse-Kubitza
04:03 PM Revision 6502: schemas/postgresql.conf: shared_buffers: Documented that it must be less than ~95% of SHMMAX
Aaron Marcuse-Kubitza
03:58 PM Revision 6501: schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where need to use party.fullname instead of name components because the name is now mapped to fullname
Aaron Marcuse-Kubitza
03:28 PM Revision 6500: schemas/vegbien.sql: analytical_stem_view, darwin_core_view: dateCollected: Use the parent plot event's obsstartdate when the subplot event does not have its own obsstartdate
Aaron Marcuse-Kubitza
01:56 PM Revision 6499: schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date or non-current taxondeterminations
Aaron Marcuse-Kubitza
01:54 PM Revision 6498: schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date
Aaron Marcuse-Kubitza
01:28 PM Revision 6497: schemas/vegbien.sql: Added darwin_core_view
Aaron Marcuse-Kubitza
12:56 PM Revision 6496: schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where need to use party.fullname instead of name components because the name is now mapped to fullname
Aaron Marcuse-Kubitza
12:40 PM Revision 6495: schemas/vegbien.sql: sync_analytical_*_to_view(): Added CREATE INDEX statements
Aaron Marcuse-Kubitza
12:31 PM Revision 6494: README.TXT: Data import: Added steps to publish analytical DB on nimoy.bien_web
Aaron Marcuse-Kubitza
10:46 AM Revision 6493: schemas/vegbien.sql: analytical_stem_view: Changed JOINs to LEFT JOINs to include occurrences without taxondeterminations
Aaron Marcuse-Kubitza
10:21 AM Revision 6492: export_analytical_db: Use 'NULL' as the NULL value instead of \N, because MySQL has problems with \N
Aaron Marcuse-Kubitza
09:57 AM Revision 6491: publish_analytical_db: Load to bien3_adb instead of bien_web
Aaron Marcuse-Kubitza

11/29/2012

05:41 PM Revision 6490: README.TXT: Data import: Added step to export analytical DB
Aaron Marcuse-Kubitza
01:11 PM Revision 6489: root Makefile: $(postgres-Linux): Fixed bug where need $(asAdmin) before commands to rename existing *.conf
Aaron Marcuse-Kubitza
01:01 PM Revision 6488: root Makefile: $(postgres-Linux): Also install postgresql-contrib, which contains the hstore extension
Aaron Marcuse-Kubitza

11/28/2012

06:18 PM Revision 6487: Added inputs/NVS/
Aaron Marcuse-Kubitza
06:04 PM Revision 6486: inputs/CVS/Organism/map.csv: Mapped accordingTo to "Weakley 2006"
Aaron Marcuse-Kubitza
06:02 PM Revision 6485: inputs/NY/Specimen/map.csv: Omit UniqueNYInternalRecordNumber to avoid confusion since this is an internal-only ID. This makes InstitutionCode+CollectionCode+CatalogNumber the globally unique identifier instead.
Aaron Marcuse-Kubitza
06:00 PM Revision 6484: README.TXT: Added Datasource refreshing section with instructions for refreshing VegBank
Aaron Marcuse-Kubitza
05:57 PM Revision 6483: schemas/vegbien.sql: Renamed taxonconcept.concept_source_id back to concept_reference_id
Aaron Marcuse-Kubitza
05:52 PM Revision 6482: schemas/vegbien.sql: Renamed soilobs to soilsample per working group discussion
Aaron Marcuse-Kubitza
05:27 PM Revision 6481: input.Makefile: SVN: add: verify: Fixed bug where need to use $ prefix before string to parse newline
Aaron Marcuse-Kubitza
05:27 PM Revision 6480: input.Makefile: SVN: add: verify: Fixed bug where need to use $ prefix before string to parse newline
Aaron Marcuse-Kubitza
05:25 PM Revision 6479: inputs/NY/verify/: svn:ignore .csv files
Aaron Marcuse-Kubitza
05:25 PM Revision 6478: input.Makefile: SVN: add: Also svn:ignore .csv files
Aaron Marcuse-Kubitza
02:47 PM Revision 6477: export_analytical_db: Export NULL as \N to work with MySQL
Aaron Marcuse-Kubitza
01:22 PM Revision 6476: schemas/vegbien.sql: analytical_*: Added index on NOT NULL columns, starting with institutionCode
Aaron Marcuse-Kubitza
01:19 PM Revision 6475: schemas/vegbien.sql: analytical_*: Removed primary keys and NOT NULL constraints on columns that sometimes have NULL values
Aaron Marcuse-Kubitza
01:08 PM Revision 6474: publish_analytical_db: Added CSV dialect information
Aaron Marcuse-Kubitza
12:42 PM Revision 6473: root Makefile: PostgreSQL: $(postgresReload-*): Rename existing *.conf to *.conf.old
Aaron Marcuse-Kubitza

11/27/2012

06:44 PM Revision 6472: publish_analytical_db: Use LOAD DATA *LOCAL* INFILE instead of LOAD DATA INFILE to avoid needing FILE permissions on bien_web
Aaron Marcuse-Kubitza
01:17 PM Revision 6471: Added publish_analytical_db
Aaron Marcuse-Kubitza
12:43 PM Revision 6470: export_analytical_db: Append the public schema version to the CSV filename
Aaron Marcuse-Kubitza
12:27 PM Revision 6469: backups/Makefile: $(rsyncBackups): Added *.csv
Aaron Marcuse-Kubitza

11/26/2012

06:12 PM Revision 6468: Added export_analytical_db
Aaron Marcuse-Kubitza
06:10 PM Revision 6467: backups/: Ignore _* and *.csv
Aaron Marcuse-Kubitza
01:35 PM Revision 6466: make_analytical_db: mk_analytical_table(): Use explicit schema references everywhere. This fixes a bug where the TRUNCATE/INSERT steps on the public schema's table would reference the analytical_db view instead because they were not schema-scoped.
Aaron Marcuse-Kubitza
01:33 PM Revision 6465: make_analytical_db: mk_analytical_table(): Factored table references in different schemas out into vars
Aaron Marcuse-Kubitza

11/25/2012

09:31 PM Revision 6464: schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
Aaron Marcuse-Kubitza
09:13 PM Revision 6463: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
08:52 PM Revision 6462: make_analytical_db: Moved set -x () around just psql_verbose_vegbien so embedded $() expressions wouldn't also be in set -x (verbose) mode
Aaron Marcuse-Kubitza
08:49 PM Revision 6461: make_analytical_db: Fixed bug where need to use bash instead of sh because vegbien_dest requires it
Aaron Marcuse-Kubitza
08:37 PM Revision 6460: make_analytical_db: Factored analytical_* table creation code out into mk_analytical_table() function
Aaron Marcuse-Kubitza
08:28 PM Revision 6459: make_analytical_db: Create analytical_db views pointing to the analytical_* versions in the public schema
Aaron Marcuse-Kubitza
08:21 PM Revision 6458: vegbien_dest: $schemas: Removed analytical_db because views that will be added to it were shadowing public schema tables with the same names during population of those tables in make_analytical_db
Aaron Marcuse-Kubitza
07:47 PM Revision 6457: vegbien_dest: Export $public, to make sure it's available to any invoked scripts as an env var
Aaron Marcuse-Kubitza
07:45 PM Revision 6456: vegbien_dest: $schemas: Added analytical_db
Aaron Marcuse-Kubitza
07:38 PM Revision 6455: inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so previous imports had been silently truncated. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.
Aaron Marcuse-Kubitza
07:20 PM Revision 6454: bin/map: in_is_db: by_col: Clearing errors table: Skip this if the table has been set to None because it didn't exist (and thus was a metadata-only map spreadsheet)
Aaron Marcuse-Kubitza
06:54 PM Revision 6453: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the specific_epithet from the accepted_taxonverbatim rather than the parsed_taxonverbatim
Aaron Marcuse-Kubitza
06:45 PM Revision 6452: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Include the family any time the genus is not specified, instead of just when accepted_taxonlabel.rank = 'family'. These should have the same effect since TNRS includes the rank, but using COALESCE() is clearer.
Aaron Marcuse-Kubitza
06:41 PM Revision 6451: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to also include morphospecies when just the family is specified
Aaron Marcuse-Kubitza
06:35 PM Revision 6450: schemas/vegbien.sql: analytical_stem_view: Fixed bug where location.authorlocationcode needed to be used as the plotName when location.sourceaccessioncode was not provided, to ensure that plotName would be NOT NULL
Aaron Marcuse-Kubitza
06:20 PM Revision 6449: inputs/FIA/import_order.txt: Fixed bug where FIA_COND_unique needed to be explicitly included in import_order.txt now that we're using import_order.txt to import the Source metadata table before the data tables
Aaron Marcuse-Kubitza
06:15 PM Revision 6448: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/24/2012

03:07 PM Revision 6447: root Makefile: PostgreSQL: $(postgresReload-Linux): Try chmoding both as your user and as the bien user
Aaron Marcuse-Kubitza
02:46 PM Revision 6446: input.Makefile: Testing: $(runTest): Ignore failed diffs when the test is compared to another test's output (e.g. in by_col mode)
Aaron Marcuse-Kubitza
02:41 PM Revision 6445: bin/map: in_is_db: If table does not exist, set table to None so that db_xml.put_table() doesn't try to access it. This fixes a bug in metadata-only map spreadsheets under column-based import.
Aaron Marcuse-Kubitza
02:40 PM Revision 6444: db_xml.py: put_table(): Support None in_table by calling put() directly
Aaron Marcuse-Kubitza
02:29 PM Revision 6443: Removed no longer used geoscrub.*.sql. Use geoscrub_output instead.
Aaron Marcuse-Kubitza
02:27 PM Revision 6442: Removed no longer used geoscrub_cleaned_unique. Use geoscrub_output instead.
Aaron Marcuse-Kubitza
02:25 PM Revision 6441: Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
Aaron Marcuse-Kubitza
02:25 PM Revision 6440: Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
Aaron Marcuse-Kubitza
02:23 PM Revision 6439: schemas/vegbien.sql: analytical_stem_view: cultivated: Removed BIEN2's geoscrub_cultivated, which has now been replaced by the primary corresponding scripts (and never had particularly many matches to the locations in any case)
Aaron Marcuse-Kubitza
02:14 PM Revision 6438: schemas/vegbien.sql: analytical_stem_view: cultivated: Use OR instead of _or() to combine cultivated_family_locations.country IS NOT NULL with the other values, because this field's false value should not be used in place of NULL if all the other values are NULL, as it would be with _or(). (cultivated_family_locations.country IS NOT NULL can indicate presence, but not absence, of cultivated status.)
Aaron Marcuse-Kubitza
02:06 PM Revision 6437: schemas/functions.sql, vegbien.sql: _and(), _or(): Added comment comparing the function and the corresponding logical operator
Aaron Marcuse-Kubitza
01:50 PM Revision 6436: schemas/vegbien.sql: public: Added _or(), for use by analytical_stem_view
Aaron Marcuse-Kubitza
01:48 PM Revision 6435: schemas/vegbien.sql: analytical_stem_view: cultivated: Also set if family/country combination found in cultivated_family_locations
Aaron Marcuse-Kubitza
01:39 PM Revision 6434: schemas/vegbien.sql: cultivated_family_locations: Added data from nimoy:/home/boyle/bien2/geoscrub/cultivated/cult_by_taxon/flag_by_taxa.inc
Aaron Marcuse-Kubitza
01:33 PM Revision 6433: schemas/vegbien.sql: Added cultivated_family_locations to store locations where various taxon families are considered cultivated
Aaron Marcuse-Kubitza
01:24 PM Revision 6432: mappings/VegCore-VegBIEN.csv: Mapped locality description fields to location.iscultivated using _locationnarrative_is_cultivated()
Aaron Marcuse-Kubitza
01:23 PM Revision 6431: xml_func.py: Simplifying functions: Added passthru entries for _and, _or
Aaron Marcuse-Kubitza
01:06 PM Revision 6430: schemas/vegbien.sql: Added _locationnarrative_is_cultivated()
Aaron Marcuse-Kubitza
12:57 PM Revision 6429: lib/PostgreSQL-MySQL.csv: Change text to varchar(255) because text columns can't be used in indexes in MySQL
Aaron Marcuse-Kubitza
12:51 PM Revision 6428: lib/PostgreSQL-MySQL.csv: Resaved in Excel, which removed unnecessary quotes around fields
Aaron Marcuse-Kubitza
12:22 PM Revision 6427: schemas/vegbien.sql: analytical_aggregate: Added identifiedBy, which is no longer a scoping field (which would prevent scientificNameWithMorphospecies from being unique) now that there is only one taxondetermination for each taxonoccurrence
Aaron Marcuse-Kubitza
12:05 PM Revision 6426: schemas/vegbien.sql: analytical_stem_view: dateCollected: For plots data, use the locationevent obsstartdate instead of the collectiondate in order to group taxonoccurrences/stems from the same locationevent together
Aaron Marcuse-Kubitza
11:59 AM Revision 6425: schemas/vegbien.sql: analytical_* pkeys: Added dateCollected because the records are actually unique within the location*event*, not the location
Aaron Marcuse-Kubitza
11:57 AM Revision 6424: schemas/vegbien.sql: analytical_stem_view: Exclude records with no collectiondate or obsstartdate, which is required to uniquely identify a record
Aaron Marcuse-Kubitza
11:54 AM Revision 6423: analytical_stem_view: dateCollected: Use locationevent.obsstartdate when aggregateoccurrence.collectiondate is not provided
Aaron Marcuse-Kubitza
11:37 AM Revision 6422: schemas/vegbien.sql: analytical_stem_view: Include only the current taxondetermination for each taxonoccurrence, to avoid cross-joining taxondeterminations with stems and thus multiplying the number of rows for datasources that have multiple taxondeterminations per taxonoccurrence
Aaron Marcuse-Kubitza
11:33 AM Revision 6421: schemas/vegbien.sql: taxondetermination: Added AFTER trigger to set the current taxondetermination for the taxonoccurrence
Aaron Marcuse-Kubitza
11:11 AM Revision 6420: lib/PostgreSQL-MySQL.csv: Statements ending in ";": When matching any character, use .*? (with the (?s) flag) instead of [^;]* in order to allow embedded ; to be matched. This fixes a bug where a CREATE VIEW statement was not removed because it contained an embedded ; .
Aaron Marcuse-Kubitza
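
The regex change in revisions 6415 and 6420 can be illustrated with a minimal Python sketch. The SQL text and both patterns below are hypothetical stand-ins, not the actual entries in lib/PostgreSQL-MySQL.csv; they only show why `[^;]*` fails on a statement with an embedded `;` while a lazy `.*?` under the `(?s)` flag, anchored to a `;` at the end of a line, succeeds.

```python
import re

# Hypothetical SQL text: a CREATE VIEW whose body contains an embedded ';'
sql = (
    "CREATE VIEW v AS SELECT 'a;b' AS x\n"
    "FROM t;\n"
    "CREATE TABLE t2 (id int);\n"
)

# [^;]* cannot cross the embedded ';', so it never reaches the real terminator
char_class = re.compile(r"CREATE VIEW [^;]*;\n")
# (?s) lets '.' match newlines; .*? stops at the first ';' that ends a line
lazy_dotall = re.compile(r"(?s)CREATE VIEW .*?;\n")

removed_class = char_class.sub("", sql)   # statement is left in place
removed_lazy = lazy_dotall.sub("", sql)   # statement is removed
print(removed_lazy)
```

Requiring the newline after `;` is what keeps the lazy match from stopping at the embedded `;` inside the string literal.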
11:06 AM Revision 6419: schemas/vegbien.sql: taxondetermination: Added unique index to ensure that there is only one current determination for each taxonoccurrence
Aaron Marcuse-Kubitza
11:05 AM Revision 6418: lib/PostgreSQL-MySQL.csv: Remove indexes with WHERE clauses
Aaron Marcuse-Kubitza
10:34 AM Revision 6417: schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies, recordNumber. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on most of the tables which provide them, and LEFT JOINed tables have their identifying fields combined to create a NOT NULL value.
Aaron Marcuse-Kubitza
10:27 AM Revision 6416: schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
Aaron Marcuse-Kubitza
10:23 AM Revision 6415: lib/PostgreSQL-MySQL.csv: Only match a statement-terminating ; when it's at the end of a line
Aaron Marcuse-Kubitza
10:02 AM Revision 6414: schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on the tables which provide them.
Aaron Marcuse-Kubitza
09:21 AM Revision 6413: db_xml.py: put(): _setDefault(): Delay the evaluation of each col_default's value until the col_default is actually retrieved. This fixes a bug in the source table mappings where the explicit source entry was being created *after* the col_default source entry, causing the initial entry, which did not have the additional fields populated, to be used instead.
Aaron Marcuse-Kubitza
09:14 AM Revision 6412: dicts.py: Added WrapDict, a dict that runs a function on each value retrieved
Aaron Marcuse-Kubitza
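
As a rough illustration of the idea behind revisions 6412 and 6413, the sketch below shows a dict that runs a function on each value as it is retrieved, used to delay evaluation of a default until it is looked up. This is an assumed reimplementation, not the code in dicts.py; the constructor signature, `make_default`, and the `source_id` key are hypothetical.

```python
class WrapDict(dict):
    """Hypothetical sketch: a dict that applies `wrap` to each value on retrieval."""

    def __init__(self, wrap, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.wrap = wrap

    def __getitem__(self, key):
        # Only [] lookups are wrapped here; dict.get() bypasses __getitem__
        return self.wrap(super().__getitem__(key))


calls = []

def make_default():
    # Stand-in for an expensive or order-sensitive default computation
    calls.append("evaluated")
    return 42

# Store callables; nothing is evaluated until a value is actually looked up
col_defaults = WrapDict(lambda thunk: thunk(), {"source_id": make_default})
before = list(calls)                 # still empty at this point
value = col_defaults["source_id"]    # evaluation happens here
```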
08:59 AM Revision 6411: db_xml.py: put(): _setDefault(): Fixed bug where need to copy col_defaults before calling update() on it, to avoid modifying the input value (which may be reused by the caller, expecting it to be unmodified)
Aaron Marcuse-Kubitza
08:54 AM Revision 6410: db_xml.py: put(): col_defaults param: Fixed bug where need to use None as default value, because col_defaults will be modified by put() and the {} default value is a global instance
Aaron Marcuse-Kubitza
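
The two bugs fixed in revisions 6410 and 6411 are standard Python pitfalls, sketched below under assumed names. `put_buggy` and `put_fixed` are hypothetical simplifications and not the real db_xml.put() signature; they only show why a `{}` default is shared across calls and why the input dict should be copied before `update()`.

```python
def put_buggy(row, col_defaults={}):
    # The {} default is created once, at definition time, and shared by every call
    col_defaults.update(row)
    return col_defaults


def put_fixed(row, col_defaults=None):
    # None as the default gives each call a fresh dict
    if col_defaults is None:
        col_defaults = {}
    # Copy before update() so the caller's dict is left unmodified
    col_defaults = dict(col_defaults)
    col_defaults.update(row)
    return col_defaults


put_buggy({"a": 1})
leaked = put_buggy({"b": 2})          # also contains "a" from the previous call

caller_defaults = {"source_id": "X"}
put_fixed({"a": 1}, caller_defaults)  # caller_defaults is not modified
fresh = put_fixed({"b": 2})           # no state carried over between calls
```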
08:29 AM Revision 6409: mappings/VegCore-VegBIEN.csv: source table mappings: Set shortname to env var $source when it's not explicitly specified, because shortname is a required field of source
Aaron Marcuse-Kubitza
08:16 AM Revision 6408: db_xml.py: put(): Pass through the values of nodes which are text nodes
Aaron Marcuse-Kubitza
08:15 AM Revision 6407: db_xml.py: put(): put_(): Support _setDefault() values which are text nodes, by passing text strings through when put_() is run on all col_defaults entries
Aaron Marcuse-Kubitza
07:50 AM Revision 6406: db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
Aaron Marcuse-Kubitza
07:47 AM Revision 6405: dicts.py: DictProxy: Implemented __delitem__()
Aaron Marcuse-Kubitza
07:32 AM Revision 6404: bin/map: update_in_label(): Removed hardcoded source_id col_default, which is now set in mappings/VegCore-VegBIEN.csv's output root
Aaron Marcuse-Kubitza
07:29 AM Revision 6403: mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
Aaron Marcuse-Kubitza
07:25 AM Revision 6402: db_xml.py: put(): Added _setDefault() built-in function, which adds an entry to col_defaults
Aaron Marcuse-Kubitza
07:23 AM Revision 6401: xml_func.py: _env(): Fixed bug where need to retrieve actual string value of name param using xml_dom.NodeTextEntryIter instead of NodeEntryIter
Aaron Marcuse-Kubitza
07:20 AM Revision 6400: xml_func.py: _env(): Fixed bug where need to use xml_dom.replace_with_text() instead of xml_dom.replace() because replace() requires a DOM node
Aaron Marcuse-Kubitza
06:44 AM Revision 6399: bin/map: update_in_label(): Set $source env var to the in_label (datasource name), to make it available to _env()
Aaron Marcuse-Kubitza
06:43 AM Revision 6398: xml_func.py: Simplifying functions: Added _env()
Aaron Marcuse-Kubitza
06:05 AM Revision 6397: Added inputs/VegBank/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
06:00 AM Revision 6396: Added inputs/SpeciesLink/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:55 AM Revision 6395: Added inputs/SALVIAS*/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:47 AM Revision 6394: Added inputs/REMIB/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:41 AM Revision 6393: Added inputs/GBIF/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:34 AM Revision 6392: Added inputs/TEAM/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:33 AM Revision 6391: Placed inputs/TEAM/_src/Vegetation-Tree-and-Liana-Metadata-1.5.pdf under version control
Aaron Marcuse-Kubitza
05:27 AM Revision 6390: inputs/FIA/import_order.txt: Added Source, which needs to come before Organism
Aaron Marcuse-Kubitza
05:22 AM Revision 6389: Added inputs/Madidi/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:19 AM Revision 6388: Added inputs/FIA/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:14 AM Revision 6387: Added inputs/CVS/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:07 AM Revision 6386: Added inputs/CTFS/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:05 AM Revision 6385: bin/map: Support map spreadsheets containing only metadata mappings (with no corresponding staging table), by falling back to an empty table when the named table does not exist
Aaron Marcuse-Kubitza
04:19 AM Revision 6384: mappings/VegCore-VegBIEN.csv: institutionCode: Also map to the sourcename's matched source, which identifies whether the source is a herbarium
Aaron Marcuse-Kubitza
04:08 AM Revision 6383: schemas/vegbien.sql: source: Made shortname NOT NULL to ensure that all datasources have a globally-unique short name
Aaron Marcuse-Kubitza
03:33 AM Revision 6382: import_all: Added import of inputs/.herbaria/ before the main import
Aaron Marcuse-Kubitza
03:28 AM Revision 6381: Added inputs/.herbaria/
Aaron Marcuse-Kubitza
03:25 AM Revision 6380: input.Makefile: SVN: add: Also run %/add on all data subdirs
Aaron Marcuse-Kubitza
03:21 AM Revision 6379: input.Makefile: Existing maps discovery: Moved tables discovery to its own section, above SVN so it can be used by SVN
Aaron Marcuse-Kubitza
03:11 AM Revision 6378: mappings/VegCore.csv: referenceType: Fixed sort order
Aaron Marcuse-Kubitza
03:09 AM Revision 6377: mappings/VegCore-VegBIEN.csv: Mapped referenceType
Aaron Marcuse-Kubitza
03:06 AM Revision 6376: mappings/VegCore.csv: Added referenceType
Aaron Marcuse-Kubitza
02:10 AM Revision 6375: mappings/VegCore-VegBIEN.csv: institutionCode: Remap to source.shortname when specimen information is not provided, as is the case for geoscrub.herbaria on nimoy
Aaron Marcuse-Kubitza
01:47 AM Revision 6374: inputs/bien_web/observation/map.csv: Mapped observationID->occurrenceID
Aaron Marcuse-Kubitza
01:20 AM Revision 6373: README.TXT: Datasource setup: Add input data for each table present in the datasource: Added step to run `make inputs/<datasrc>/<table>/install` if the table is in a .sql export
Aaron Marcuse-Kubitza
01:17 AM Revision 6372: README.TXT: Datasource setup: MySQL inputs: Added step to install the export, which needs to happen before mapping individual tables
Aaron Marcuse-Kubitza
01:13 AM Revision 6371: README.TXT: Datasource setup: Add input data for each table present in the datasource: Replaced "CSV" with "CSV(s)" because there can be multiple CSV part files for one table
Aaron Marcuse-Kubitza
01:11 AM Revision 6370: README.TXT: Datasource setup: Add input data for each table present in the datasource: Don't add a CSV or create.sql file for tables that are in a .sql export
Aaron Marcuse-Kubitza
01:06 AM Revision 6369: README.TXT: Schema changes: Sync ERD with vegbien.sql schema: Changed instructions to just select tables with arrows next to them rather than all tables, because each table that's updated will have its lines reset and the number of lines that need to be fixed should be minimized
Aaron Marcuse-Kubitza
01:02 AM Revision 6368: README.TXT: Datasource setup: Accept the test cases: `make inputs/<datasrc>/test by_col=1`: Clarified that errors could indicate bugs in the *VegBIEN* unique constraints
Aaron Marcuse-Kubitza
12:59 AM Revision 6367: README.TXT: Data import: To remake analytical DB: Added explicit public schema setting since the analytical DB is often manually remade *after* the public schema has been renamed. Removed warnings that certain commands must be run after running make_analytical_db, because the "remake analytical DB" instructions no longer require this.
Aaron Marcuse-Kubitza
12:48 AM Revision 6366: README.TXT: Datasource setup: MySQL inputs: Added steps to export the database to a PostgreSQL-compatible .sql file, which can be directly used by the install process without the need to export each table as CSV
Aaron Marcuse-Kubitza
12:36 AM Revision 6365: README.TXT: Datasource setup: Choosing a table name: Documented that for .sql exports, you must use the name of the table in the DB export, not a suggested or custom name
Aaron Marcuse-Kubitza
12:34 AM Revision 6364: input.Makefile: Staging tables installation: $(dbExports): Also include the files that would be generated by running _MySQL/*.make and creating the corresponding PostgreSQL translations
Aaron Marcuse-Kubitza
12:18 AM Revision 6363: input.Makefile: Staging tables installation: Moved .sql export downloading and translation to separate Input data retrieval section
Aaron Marcuse-Kubitza

11/23/2012

11:41 PM Revision 6362: Added lib/MySQL.{data,schema}.sql.make templates to use in datasources' _MySQL/ dirs
Aaron Marcuse-Kubitza
10:38 PM Revision 6361: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/21/2012

11:13 PM Revision 6360: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to use Brad's formula, which concatenates genus and specific_epithet/morphospecies, and uses family if just the family is present, rather than using the full taxonomic name
Aaron Marcuse-Kubitza
11:05 PM Revision 6359: mappings/VegCore-VegBIEN.csv: Concatenated taxonlabel: Don't prepend family if the taxonName/scientificName itself is the family, so that the family is not duplicated in the concatenated taxonomic name
Aaron Marcuse-Kubitza
10:19 PM Revision 6358: schemas/functions.sql: _nullIf(): Removed NOT NULL constraint on null param, to support using a (nullable) column rather than a literal as the null-equivalent value
Aaron Marcuse-Kubitza
09:08 PM Revision 6357: xml_func.py: Simplifying functions: Added _nullIf(), to remove calls with no null value
Aaron Marcuse-Kubitza
09:00 PM Revision 6356: xml_dom.py: Added prune_parent()
Aaron Marcuse-Kubitza
08:51 PM Revision 6355: schemas/functions.sql: Added _or()
Aaron Marcuse-Kubitza
08:20 PM Revision 6354: schemas/functions.sql: Added _merge_words()
Aaron Marcuse-Kubitza
08:04 PM Revision 6353: schemas/vegbien.sql: analytical_*: Renamed geosourceValid to geovalid. (It had gotten renamed in the reference -> source rename.)
Aaron Marcuse-Kubitza
08:00 PM Revision 6352: mappings/VegCore.csv: Renamed georeferenceValid to geovalid
Aaron Marcuse-Kubitza
07:48 PM Revision 6351: inputs/import.stats.xls: Updated import times. This now includes the Canadensys plants-related datasources HIBG, JBM, QFA, TRT, TRTE, UBC, VASCAN, and WIN.
Aaron Marcuse-Kubitza

11/20/2012

09:59 PM Revision 6350: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
09:42 PM Revision 6349: Added inputs/HIBG/
Aaron Marcuse-Kubitza
09:33 PM Revision 6348: Added inputs/JBM/
Aaron Marcuse-Kubitza
09:29 PM Revision 6347: Added inputs/VASCAN/
Aaron Marcuse-Kubitza
09:22 PM Revision 6346: Added inputs/WIN/
Aaron Marcuse-Kubitza
09:18 PM Revision 6345: Added inputs/UBC/
Aaron Marcuse-Kubitza
09:14 PM Revision 6344: Added inputs/TRTE/Specimen/
Aaron Marcuse-Kubitza
09:11 PM Revision 6343: Added inputs/QFA/
Aaron Marcuse-Kubitza
09:06 PM Revision 6342: Added inputs/TRT/
Aaron Marcuse-Kubitza
08:21 PM Revision 6341: schemas/vegbien.sql: Allow bien_read to SELECT from all tables in the public schema
Aaron Marcuse-Kubitza
08:10 PM Revision 6340: schemas/vegbien.sql: Allow bien_read to SELECT from analytical_aggregate, analytical_stem
Aaron Marcuse-Kubitza
08:09 PM Revision 6339: lib/PostgreSQL-MySQL.csv: Removed GRANT/REVOKE because SCHEMA GRANTs are not supported in MySQL
Aaron Marcuse-Kubitza
07:57 PM Revision 6338: pg_dump_vegbien: non-$owners mode: Removed --no-privileges in order to include GRANTs to other users
Aaron Marcuse-Kubitza
07:49 PM Revision 6337: root Makefile: PostgreSQL: $(postgresReload-Linux): Making schemas/*.conf world-readable: Fixed bug where need to do this as the bien user, which owns the files
Aaron Marcuse-Kubitza
07:46 PM Revision 6336: root Makefile: PostgreSQL: $(postgresReload-*): Make schemas/*.conf world-readable so it's readable by the postgres user, which the .conf installation is run as
Aaron Marcuse-Kubitza
07:43 PM Revision 6335: root Makefile: PostgreSQL: $(postgresReload-*): Also install pg_hba.conf
Aaron Marcuse-Kubitza
07:36 PM Revision 6334: root Makefile: PostgreSQL: Added postgres_reload to reload postgresql.conf and restart the DB
Aaron Marcuse-Kubitza
07:30 PM Revision 6333: root Makefile: PostgreSQL: postgres-*: Factored postgresql.conf installation out into $(postgresReload-*)
Aaron Marcuse-Kubitza
07:15 PM Revision 6332: schemas/: Synced pg_hba.conf and pg_hba.Mac.conf's bien entries, which adds phpPgAdmin support (template1 access) on the Mac and bien_read access on Linux
Aaron Marcuse-Kubitza
06:56 PM Revision 6331: root Makefile: VegBIEN DB: DB and users: Also create bien_read user for read-only access to the DB
Aaron Marcuse-Kubitza
06:53 PM Revision 6330: schemas/pg_hba.Mac.conf: Allow access to the bien group rather than just the bien user, which will include bien_read
Aaron Marcuse-Kubitza
06:35 PM Revision 6329: schemas/pg_hba.Mac.conf: Fixed bug where also need to allow password-based logins from the same machine, in order to work with pgAdmin
Aaron Marcuse-Kubitza
06:06 PM Revision 6328: schemas/vegbien.ERD.poster.pdf: Updated to 33x51in poster size and 0.25in margins
Aaron Marcuse-Kubitza
05:35 PM Revision 6327: README.TXT: Schema changes: Creating a poster of the ERD: Added section with the State St FedEx Kinkos' rates for posters ($10.25/sq ft laminated)
Aaron Marcuse-Kubitza
05:29 PM Revision 6326: README.TXT: Schema changes: Creating a poster of the ERD: Changed "Measure the fractional height of the text onscreen" to "Determine the poster size"
Aaron Marcuse-Kubitza
05:19 PM Revision 6325: Added schemas/vegbien.ERD.poster.pdf
Aaron Marcuse-Kubitza
04:10 PM Revision 6324: Added schemas/vegbien.ERD.poster.core.print_options.png
Aaron Marcuse-Kubitza
04:01 PM Revision 6323: Added schemas/vegbien.ERD.poster.core.pdf
Aaron Marcuse-Kubitza
03:29 PM Revision 6322: schemas/pg_hba.Mac.conf: Fixed bug where needed ident entry for postgres superuser
Aaron Marcuse-Kubitza
03:18 PM Revision 6321: Added config/bien_read_password
Aaron Marcuse-Kubitza
02:53 PM Revision 6320: README.TXT: Schema changes: Added instructions to calculate the minimum VegBIEN poster size (to make the text at least as big as on the VegBank ERD poster), which is 35x54in portrait
Aaron Marcuse-Kubitza

11/19/2012

08:01 PM Revision 6319: schemas/vegbien.sql: analytical_stem_view: cultivated: Use location.iscultivated when taxonoccurrence.iscultivated is not available
Aaron Marcuse-Kubitza
07:55 PM Revision 6318: Added inputs/FIA/FIA_COND_unique/, which contains the oldgrowth flag
Aaron Marcuse-Kubitza
07:53 PM Revision 6317: mappings/VegCore-VegBIEN.csv: Mapped oldGrowth
Aaron Marcuse-Kubitza
07:48 PM Revision 6316: schemas/functions.sql: Added _not()
Aaron Marcuse-Kubitza
07:43 PM Revision 6315: mappings/VegCore.csv: Added oldGrowth
Aaron Marcuse-Kubitza
07:36 PM Revision 6314: mappings/VegCore-VegBIEN.csv: Remapped cultivated to location when a TaxonOccurrence is not provided, indicating that the record is a plot
Aaron Marcuse-Kubitza
07:35 PM Revision 6313: mappings/VegCore-VegBIEN.csv: Remapped cultivated to location when a TaxonOccurrence is not provided, indicating that the record is a plot
Aaron Marcuse-Kubitza
07:25 PM Revision 6312: schemas/vegbien.sql: location: Added iscultivated for cases when entire plots rather than individual taxonoccurrences are marked as cultivated
Aaron Marcuse-Kubitza
07:17 PM Revision 6311: inputs/FIA/: Added FIA_COND table from nimoy.geoscrub and code to generate a unique plot table from it, including the oldgrowth calculated field
Aaron Marcuse-Kubitza
06:46 PM Revision 6310: Added inputs/FIA/Organism/postprocess.sql to cast PlotCD to a bigint
Aaron Marcuse-Kubitza
06:22 PM Revision 6309: my2pg: Also remove (#) after bigint
Aaron Marcuse-Kubitza
06:05 PM Revision 6308: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
06:03 PM Revision 6307: schemas/vegbien.ERD.mwb: Fixed lines
Aaron Marcuse-Kubitza
05:54 PM Revision 6306: schemas/vegbien.ERD.mwb: Fixed lines
Aaron Marcuse-Kubitza
05:54 PM Revision 6305: schemas/vegbien.sql: source: Renamed fulltext to citation because according to the VegBank data dictionary <http://vegbank.org/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=reference&entity=dba_tabledescription&where=where_tablename#fulltext> this is actually the full text *of the reference citation*, not of the reference itself (it would be unusual to store that in VegBank)
Aaron Marcuse-Kubitza
05:48 PM Revision 6304: schemas/vegbien.sql: Removed no longer needed sourcejournal, which can be stored in source and pointed to via parent_id instead of sourcejournal_id. sourcejournal.journal maps to source.fulltext, issn to isbn, and abbreviation to shortname.
Aaron Marcuse-Kubitza
05:48 PM Revision 6303: mappings/VegCore-VegBIEN.csv: Mapped acceptedCounty, county to the matched place
Aaron Marcuse-Kubitza
05:41 PM Revision 6302: schemas/vegbien.sql: source: Added matched_source_id
Aaron Marcuse-Kubitza
05:34 PM Revision 6301: sql.py: parse_exception(): function MissingCastException: If 1st param is hstore, only perform the cast on the value param. This fixes a bug in _map() calls whose value is a non-text type, such as SALVIAS.plotMetadata.AccessCode.
Aaron Marcuse-Kubitza
05:32 PM Revision 6300: sql_io.py: cast(): Use sql_gen.Cast() to generate the cast, in order to take advantage of its support for casts to unknown
Aaron Marcuse-Kubitza
05:30 PM Revision 6299: sql_gen.py: Cast: Support casts to unknown by casting to text first
Aaron Marcuse-Kubitza
04:59 PM Revision 6298: schemas/postgresql.conf: Turn on the error log
Aaron Marcuse-Kubitza
04:58 PM Revision 6297: schemas/pg_hba.conf: Also grant the bien user access to template1, which is accessed by phpPgAdmin
Aaron Marcuse-Kubitza
04:24 PM Revision 6296: schemas/vegbien.sql: source: Added parent_id for nested sources, e.g. an article in a journal
Aaron Marcuse-Kubitza
04:23 PM Revision 6295: lib/forwarding.Makefile: $(subdirs): Also exclude .archive/
Aaron Marcuse-Kubitza
04:09 PM Revision 6294: mappings/VegCore-VegBIEN.csv: Mapped acceptedCounty, county to the matched place
Aaron Marcuse-Kubitza
04:08 PM Revision 6293: schemas/vegbien.ERD.mwb: Fixed lines
Aaron Marcuse-Kubitza
03:54 PM Revision 6292: Renamed inputs/_archive/ to .archive/ so it wouldn't be treated as a datasource
Aaron Marcuse-Kubitza
03:49 PM Revision 6291: README.TXT: Documentation: Redmine-formatted list of steps for column-based import: Use ACAD instead of QMOR, which was removed
Aaron Marcuse-Kubitza
03:45 PM Revision 6290: inputs/Makefile: Import logs: $(rsyncLogs): Include log files at any depth in the directory tree rather than just 1-2 levels deep. This adds log files whose containing directories have been moved to _archive/ directories.
Aaron Marcuse-Kubitza
03:29 PM Revision 6289: Added inputs/_archive/
Aaron Marcuse-Kubitza
03:27 PM Revision 6288: Removed inputs/QMOR/ because it's an insect collection
Aaron Marcuse-Kubitza
03:25 PM Revision 6287: schemas/vegbien.sql: projectcontributor: Removed surname, since this information is stored in party_id->party.surname
Aaron Marcuse-Kubitza
03:23 PM Revision 6286: schemas/vegbien.sql: projectcontributor: Removed cheatrole, since there is already a role field and this field was unused in VegBank
Aaron Marcuse-Kubitza
03:21 PM Revision 6285: schemas/vegbien.sql: role: Added values from projectcontributor.ROLE_ID <http://vegbank.org/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=projectcontributor&entity=dba_tabledescription&where=where_tablename#ROLE_ID>
Aaron Marcuse-Kubitza
03:17 PM Revision 6284: schemas/vegbien.sql: sourcecontributor: role: Changed type to role
Aaron Marcuse-Kubitza
03:15 PM Revision 6283: schemas/vegbien.sql: role enum: Added VegBank data dictionary values from <http://vegbank.org/vegbank/views/dba_fielddescription_detail.jsp?view=detail&wparam=1331&entity=dba_fielddescription&params=1331>
Aaron Marcuse-Kubitza
03:03 PM Revision 6282: schemas/vegbien.sql: sourcecontributor: Renamed position to order for consistency with the ERD definition <http://vegbank.org/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=referencecontributor&entity=dba_tabledescription&where=where_tablename#position> and disambiguation from other meanings of position which are similar to role
Aaron Marcuse-Kubitza
03:00 PM Revision 6281: schemas/vegbien.sql: sourcecontributor: Renamed roletype to role for consistency with the ERD definition <http://vegbank.org/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=referencecontributor&entity=dba_tabledescription&where=where_tablename#roleType>
Aaron Marcuse-Kubitza
02:53 PM Revision 6280: inputs/.geoscrub/geoscrub_output/map.csv: Mapped to county, acceptedCounty
Aaron Marcuse-Kubitza
02:52 PM Revision 6279: mappings/VegCore-VegBIEN.csv: Mapped acceptedCounty, county to the matched place
Aaron Marcuse-Kubitza
02:50 PM Revision 6278: mappings/VegCore.csv: Added acceptedCounty
Aaron Marcuse-Kubitza
02:42 PM Revision 6277: schemas/pg_hba.Mac.conf: Changed to match schemas/pg_hba.conf
Aaron Marcuse-Kubitza
02:37 PM Revision 6276: schemas/pg_hba.conf: Fixed bug where also need an IPv6 bien entry with md5 authentication, because the IPv4 md5 authentication does not apply to "localhost", which is translated to the IPv6 address ::1
Aaron Marcuse-Kubitza
02:27 PM Revision 6275: schemas/pg_hba.conf: Fixed bug where also need a *local* bien entry with md5 authentication, because the host-based md5 authentication applies only to literal IP addresses, not "localhost"
Aaron Marcuse-Kubitza
02:08 PM Revision 6274: Added schemas/pg_hba.Mac.conf
Aaron Marcuse-Kubitza
02:01 PM Revision 6273: schemas/pg_hba.conf: Restrict all access to the server except for the bien user accessing vegbien using ident or a password, and the postgres superuser logging in using ident
Aaron Marcuse-Kubitza
01:25 PM Revision 6272: inputs/.geoscrub/geoscrub_output/map.csv: Mapped countyvalidity to latLongInCounty
Aaron Marcuse-Kubitza
01:24 PM Revision 6271: schemas/functions.sql: _map(): Fixed bug where entries that map to NULL were incorrectly being treated as if the entry didn't exist. Note that -> returns NULL both if the entry's value is NULL and if the entry doesn't exist, so ? must be used to recheck the presence of the key in the hstore.
Aaron Marcuse-Kubitza
12:48 PM Revision 6270: mappings/VegCore-VegBIEN.csv: Mapped latLongInCounty
Aaron Marcuse-Kubitza
12:46 PM Revision 6269: mappings/VegCore.csv: Added latLongInCounty
Aaron Marcuse-Kubitza
12:43 PM Revision 6268: schemas/vegbien.sql: Added distance_to_county_m. Note that this can also be used to store latLongInCounty by mapping true to 0 and false to -1.
Aaron Marcuse-Kubitza
12:22 PM Revision 6267: schemas/pg_hba.conf: Changed trust authentication back to ident/md5. Not sure how it got set to trust since I used md5 when enabling remote access to the DB for the bien user.
Aaron Marcuse-Kubitza
12:08 PM Revision 6266: Added schemas/pg_hba.conf
Aaron Marcuse-Kubitza
11:48 AM Revision 6265: schemas/vegbien.sql: place: Removed placecode to prevent datasources from creating duplicate entries for the same place, with different placecodes. This was a problem with the original BIEN2 geoscrub dataset, which contained duplicates.
Aaron Marcuse-Kubitza
10:54 AM Revision 6264: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/16/2012

07:06 PM Revision 6263: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
07:03 PM Revision 6262: schemas/vegbien.sql: analytical_stem_view: Fixed bug where need to join taxonoccurrence.collector_id to party because it's now an fkey rather than a literal name
Aaron Marcuse-Kubitza
06:58 PM Revision 6261: schemas/vegbien.sql: analytical_*: Added coordinateUncertaintyInMeters
Aaron Marcuse-Kubitza
06:34 PM Revision 6260: schemas/vegbien.sql: analytical_stem_view: Join to newWorldCountries on 2-letter ISO code instead of country name, to increase (BIEN2) newWorldCountries and GADM overlap
Aaron Marcuse-Kubitza
06:29 PM Revision 6259: psql_vegbien: Run with sh because it no longer needs bash support
Aaron Marcuse-Kubitza
06:28 PM Revision 6258: psql_script_vegbien: Fixed bug where needs to be run with bash instead of sh
Aaron Marcuse-Kubitza
06:27 PM Revision 6257: Added inputs/newWorld/iso_code_gadm/
Aaron Marcuse-Kubitza
06:16 PM Revision 6256: Added inputs/newWorld/_src/
Aaron Marcuse-Kubitza
06:15 PM Revision 6255: inputs/XAL/Specimen/map.csv: darwin:FieldNumber: Removed command to determine that field is unused, because UNUSED is a factual assertion that does not need a reason to be specified each time
Aaron Marcuse-Kubitza
06:11 PM Revision 6254: inputs/XAL/Specimen/map.csv: Remapped darwin:CoordinatePrecision to UNUSED
Aaron Marcuse-Kubitza
06:08 PM Revision 6253: inputs/NY/Specimen/map.csv: Remapped CoordinatePrecision to coordinateUncertaintyInMeters, assuming units of m based on the range and precision of values
Aaron Marcuse-Kubitza
06:03 PM Revision 6252: mappings/VegCore.csv: coordinatePrecision: Added units (degrees) to form coordinatePrecision_deg
Aaron Marcuse-Kubitza
06:00 PM Revision 6251: mappings/VegCore-VegBIEN.csv: Removed mapping for coordinatePrecision, which is not the same as coordsaccuracy_m. coordinatePrecision is instead "the precision of the coordinates" themselves in degrees (<http://rs.tdwg.org/dwc/terms/#coordinatePrecision>).
Aaron Marcuse-Kubitza
05:53 PM Revision 6250: schemas/vegbien.sql: coordinates: Changed coordinates.coordsaccuracy_deg units to m
Aaron Marcuse-Kubitza
05:51 PM Revision 6249: Regenerated inputs/bien_web/observation/test.xml.ref
Aaron Marcuse-Kubitza
05:17 PM Revision 6248: schemas/vegbien.ERD.mwb: Added projectcontributor, locationeventcontributor to ERD
Aaron Marcuse-Kubitza
05:02 PM Revision 6247: schemas/vegbien.sql: higher_plant_group_nodes: Added root->NULL mapping to store all the families that don't match any higher plant group
Aaron Marcuse-Kubitza
04:58 PM Revision 6246: schemas/vegbien.sql: higher_plant_group_nodes: Allow NULL values for higher_plant_group, to allow mapping all remaining families to NULL in family_higher_plant_group
Aaron Marcuse-Kubitza
04:09 PM Revision 6245: psql_vegbien: Fixed bug where the command prompt was not displayed when run from the command line, by moving automatic setting of search_path to psql_script_vegbien. psql_script_vegbien is now used instead of psql_vegbien wherever the search_path needs to be set, so removing this functionality from psql_vegbien is not a problem.
Aaron Marcuse-Kubitza
04:03 PM Revision 6244: input.Makefile: BIEN commands: $(psqlAsBien): Use psql_script_vegbien, which automatically adds the $(psqlOpts), instead of psql_vegbien
Aaron Marcuse-Kubitza
03:54 PM Revision 6243: schemas/functions.sql: _map(): Support any entry having the value '*' (not just the '*' entry), which passes through that value. Support an entry having the value '!', which raises an exception.
Aaron Marcuse-Kubitza
03:40 PM Revision 6242: inputs/SALVIAS/plotMetadata_/map.csv: AccessCode: Removed _map entry for 4, which does not apply to plots
Aaron Marcuse-Kubitza
03:07 PM Revision 6241: schemas/vegbien.ERD.mwb: Fixed lines
Aaron Marcuse-Kubitza
01:00 PM Revision 6240: schemas/vegbien.sql: locationevent: Added accesslevel
Aaron Marcuse-Kubitza
12:54 PM Revision 6239: inputs/SALVIAS/plotMetadata_/map.csv: Mapped AccessCode to dcterms:accessRights with appropriate _map filter
Aaron Marcuse-Kubitza
12:49 PM Revision 6238: Added inputs/.geoscrub/geoscrub_cleaned_unique/_no_import to disable geoscrub_cleaned_unique, since the new geoscrub_output supersedes it
Aaron Marcuse-Kubitza
12:47 PM Revision 6237: Added inputs/.geoscrub/geoscrub_output/
Aaron Marcuse-Kubitza
12:46 PM Revision 6236: Added inputs/.geoscrub/_src/README.TXT
Aaron Marcuse-Kubitza
12:29 PM Revision 6235: Regenerated inputs/bien_web/observation/VegBIEN.csv
Aaron Marcuse-Kubitza
12:24 PM Revision 6234: Added inputs/.geoscrub/_src/ to store Jim's geoscrub CSV
Aaron Marcuse-Kubitza
12:21 PM Revision 6233: schemas/functions.sql: _map(): Changed error message for an unmapped value to "Value not in map" rather than "Invalid map value", because an unmapped value is not necessarily explicitly invalid
Aaron Marcuse-Kubitza
12:16 PM Revision 6232: inputs/VegBank/plot_/map.csv: confidentialitystatus filter: Merged mappings for 0 with other public-equivalent fields. Note that fuzzed plots are still public, because the private columns have been removed.
Aaron Marcuse-Kubitza

11/15/2012

11:16 PM Revision 6231: inputs/VegBank/plot_/map.csv: Mapped confidentialitystatus to dcterms:accessRights with an appropriate _map filter
Aaron Marcuse-Kubitza
11:16 PM Revision 6230: mappings/VegCore-VegBIEN.csv: Mapped dcterms:accessRights
Aaron Marcuse-Kubitza
11:14 PM Revision 6229: schemas/functions.sql: _map(): Raise data_exception if value not in map and no default provided (not the same as a NULL default value)
Aaron Marcuse-Kubitza
10:54 PM Revision 6228: mappings/VegCore-VegBIEN.csv: verbatimGrowthForm: Removed _map filter, which applied only to SALVIAS and has now been moved to the applicable SALVIAS tables
Aaron Marcuse-Kubitza
10:51 PM Revision 6227: inputs/SALVIAS*/plotObservations/map.csv: Remapped Habit to growthForm with _map filter applied
Aaron Marcuse-Kubitza
10:43 PM Revision 6226: sql_io.py: put_table(): Special handling for functions with hstore params: Fixed bug where need to unwrap literal values of mapping, which might be sql_gen.Literal objects
Aaron Marcuse-Kubitza
10:43 PM Revision 6225: sql_gen.py: Added get_value()
Aaron Marcuse-Kubitza
10:42 PM Revision 6224: dicts.py: join(): Added support for unhashable types, which are passed through. This adds support for SQL literal values which are dicts (hstores).
Aaron Marcuse-Kubitza
10:25 PM Revision 6223: xml_func.py: Removed no longer used _map(), which has been replaced by a corresponding DB function
Aaron Marcuse-Kubitza
10:22 PM Revision 6222: schemas/functions.sql: Added _map(), which uses the new hstore functionality. This expands _map() functionality to column-based import.
Aaron Marcuse-Kubitza
10:20 PM Revision 6221: root Makefile: VegBIEN DB: DB and bien user: mk_db: hstore extension: Fixed bug where need to use `CREATE EXTENSION hstore SCHEMA pg_catalog` instead of createlang, because hstore must be explicitly created in pg_catalog or else it will be created in the public schema instead, causing it to get deleted every time the public schema is reinstalled and cascading the delete to everything (including in other schemas) that uses hstore
Aaron Marcuse-Kubitza
10:04 PM Revision 6220: sql_io.py: put_table(): Added special handling for functions with hstore params. Note that although _map() doesn't exist yet as a DB function, this code must be in place before _map() is created to avoid param type mismatch errors.
Aaron Marcuse-Kubitza
08:57 PM Revision 6219: root Makefile: PostgreSQL: postgres-Linux: Changed plpython to plpython3 in order to install plpython3u
Aaron Marcuse-Kubitza
08:30 PM Revision 6218: schemas/py_functions.sql: _date(): Removed features that require dateutil, which is not available under plpython3u. This includes removing the now-unused date string parameter.
Aaron Marcuse-Kubitza
08:26 PM Revision 6217: mappings/VegCore-VegBIEN.csv: Removed _date/date, because _date using a string date argument is no longer supported under plpython3u (dateutil is missing). Note that PostgreSQL's own date parsing is sufficient for most dates, so this use of _date is not strictly necessary and removing it will improve import times.
Aaron Marcuse-Kubitza
08:12 PM Revision 6216: schemas/py_functions.sql: Replaced xrange() with range() for plpython3u
Aaron Marcuse-Kubitza
08:05 PM Revision 6215: root Makefile: Python: python-Linux: Also install python3, needed by plpython3u
Aaron Marcuse-Kubitza
08:04 PM Revision 6214: schemas/py_functions.sql: Updated except clause syntax for PostgreSQL 9.1.6
Aaron Marcuse-Kubitza
08:03 PM Revision 6213: schemas/*.sql: Updated for PostgreSQL 9.1.6, which has standard_conforming_strings = on (which affects \-escapes in string literals), escape_string_warning not explicitly set, and uses ALTER TABLE ONLY instead of ALTER TABLE
Aaron Marcuse-Kubitza
07:49 PM Revision 6212: README.TXT: Removed step to manually run make_analytical_db, now that this is done automatically by import_all. Added separate instructions to remake the analytical DB.
Aaron Marcuse-Kubitza
07:45 PM Revision 6211: import_all: Change to the main directory that make targets are run from. Use relative paths to bin/ commands, which is possible now that the current dir is set.
Aaron Marcuse-Kubitza
07:41 PM Revision 6210: import_all: Create a background process that waits until the import is done and then runs make_analytical_db
Aaron Marcuse-Kubitza
07:36 PM Revision 6209: Added waitpid
Aaron Marcuse-Kubitza
06:52 PM Revision 6208: import_all: Documented that `wait %1` waits for asynchronous commands
Aaron Marcuse-Kubitza
06:40 PM Revision 6207: root Makefile: VegBIEN DB: DB and bien user: mk_db: Also install hstore extension. Note that this is only supported by PostgreSQL 9.1+.
Aaron Marcuse-Kubitza
06:33 PM Revision 6206: input.Makefile: Editing import: Updated queries for current schema
Aaron Marcuse-Kubitza
06:27 PM Revision 6205: inputs/.geoscrub/geoscrub_cultivated/create.sql: Fixed bug where need to filter out NULL lat/longs because primary keys can't contain NULL values
Aaron Marcuse-Kubitza
06:17 PM Revision 6204: schemas/py_functions.sql: Changed function languages to plpython3u to match the new installed version. Note that plpythonu is not available on Mac under PostgreSQL 9.1.6.
Aaron Marcuse-Kubitza
05:59 PM Revision 6203: reinstall_all: Fixed bug where also need to include datasources starting with . such as .TNRS/, by using with_all's new $all option
Aaron Marcuse-Kubitza
05:58 PM Revision 6202: with_all: Added $all option to also include datasources starting with . such as .TNRS/. This is necessary for reinstall_all, which needs to install *all* datasources.
Aaron Marcuse-Kubitza
05:18 PM Revision 6201: root Makefile: PostgreSQL: $(pg_ctl-*): Fixed bug where need to pause for a few seconds after restarting PostgreSQL, to wait for the server to be ready to accept connections
Aaron Marcuse-Kubitza
05:12 PM Revision 6200: root Makefile: Installation: uninstall: Removed inputs/uninstall because the DB will be uninstalled anyway, so the inputs don't need to be individually removed first
Aaron Marcuse-Kubitza
05:11 PM Revision 6199: schemas/postgresql.Mac.conf: Added back unix_socket_directory setting, which is apparently still needed in PostgreSQL 9.1.6
Aaron Marcuse-Kubitza
05:06 PM Revision 6198: root Makefile: PostgreSQL: postgres-Linux: Also install postgresql.conf
Aaron Marcuse-Kubitza
04:54 PM Revision 6197: root Makefile: PostgreSQL: postgres-Darwin: Also install postgresql.Mac.conf
Aaron Marcuse-Kubitza
04:40 PM Revision 6196: root Makefile: PostgreSQL: $(macUsePostgresLib): Factored out PostgreSQL dir to $(macPostgresDir)
Aaron Marcuse-Kubitza
04:38 PM Revision 6195: schemas/postgresql.Mac.conf: Updated to PostgreSQL 9.1.6's postgresql.conf
Aaron Marcuse-Kubitza
04:29 PM Revision 6194: root Makefile: Datasources: inputs/install: Fixed bug where need to `wait` after `. bin/reinstall_all` to wait for inputs to finish installing before installing the public schema. This is necessary because views in the public schema now have dependencies on some datasources, such as TNRS.
Aaron Marcuse-Kubitza
04:25 PM Revision 6193: root Makefile: PostgreSQL: $(psqlAsAdmin): Use new $(asAdmin)
Aaron Marcuse-Kubitza
04:25 PM Revision 6192: root Makefile: VegBIEN DB: Schemas: schemas/public/install: Use $(psqlNoSearchPath) instead of $(psqlAsBien) because the search_path is set by vegbien.sql
Aaron Marcuse-Kubitza
04:16 PM Revision 6191: root Makefile: Datasources: Added inputs/install override which runs `. bin/reinstall_all` instead, in order to install all datasources simultaneously
Aaron Marcuse-Kubitza
04:03 PM Revision 6190: root Makefile: Python: python-Darwin: Added instructions to install Python 3.2 (Python 2 comes with Mac OS X, but Python 3.2 is needed for plpython3u)
Aaron Marcuse-Kubitza
03:55 PM Revision 6189: root Makefile: VegBIEN DB: DB and bien user: mk_db: Updated for PostgreSQL 9.1.6 on the Mac, which only provides plpython3u (Python 3)
Aaron Marcuse-Kubitza
03:54 PM Revision 6188: root Makefile: VegBIEN DB: DB and bien user: mk_db: Updated for PostgreSQL 9.1.6, which requires the DB name to be specified on the command line instead of in the $PGDATABASE env var set by postgres_vegbien. Fixed bug where need to run createlang as postgres superuser, because plpythonu is an untrusted language (with unrestricted access to the entire DB).
Aaron Marcuse-Kubitza
03:51 PM Revision 6187: root Makefile: PostgreSQL: postgres-Darwin: Updated for PostgreSQL 9.1.6, which requires some /usr/lib/ symlinks to be changed to newer versions installed in the PostgreSQL lib/ dir
Aaron Marcuse-Kubitza
03:49 PM Revision 6186: input.Makefile: $(psqlAsBien), csv2db: Turn off the automatic search_path where needed, because when the input is installed, the schemas in it may not exist yet
Aaron Marcuse-Kubitza
02:16 PM Revision 6185: schemas/vegbien.sql: place: Renamed geosource_valid to geovalid. (It had gotten renamed in the reference -> source rename.)
Aaron Marcuse-Kubitza
02:12 PM Revision 6184: schemas/vegbien.sql: location: Renamed confidentialitystatus->accesslevel, confidentialityreason->accessconditions to match the corresponding fields in source. Note that accessconditions stores more than confidentialityreason did, because it can contain details about the accesslevel in addition to the reason for it.
Aaron Marcuse-Kubitza
02:07 PM Revision 6183: schemas/vegbien.sql: source.accesslevel, location.confidentialitystatus: Changed type to accesslevel
Aaron Marcuse-Kubitza
02:03 PM Revision 6182: schemas/vegbien.sql: Added accesslevel enum
Aaron Marcuse-Kubitza
01:51 PM Revision 6181: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/14/2012

06:37 PM Revision 6180: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
06:30 PM Revision 6179: schemas/vegbien.sql: Renamed reference -> source to make this table more broadly applicable, and because this now stores the datasource metadata
Aaron Marcuse-Kubitza
06:19 PM Revision 6178: schemas/vegbien.sql: referencename: Scope it by top-level datasource, because institutionCodes (which map to this field) are not globally unique. This involves renaming the previous reference_id field, which was for the matched reference, to matched_reference_id, to allow a scoping reference_id field.
Aaron Marcuse-Kubitza
06:16 PM Revision 6177: mappings/VegCore-VegBIEN.csv: Made taxonoccurrence.verbatimcollectorname an fkey to party, and renamed it to collector_id
Aaron Marcuse-Kubitza
05:57 PM Revision 6176: inputs/VegBank/taxonobservation_/map.csv: Mapped new givenname, surname (from collector_id's party) to recordedBy
Aaron Marcuse-Kubitza
05:54 PM Revision 6175: inputs/VegBank/taxonobservation_/create.sql: Also join to collector_id's party to include collector name
Aaron Marcuse-Kubitza
05:53 PM Revision 6174: inputs/VegBank/vegbank.~.clean_up.sql: Rename taxoninterpretation.party_id to taxoninterpretation_party_id to make it globally unique when joining taxoninterpretation to other tables
Aaron Marcuse-Kubitza
05:48 PM Revision 6173: inputs/VegBank/vegbank.~.clean_up.sql: Rename party.d_obscount to party_d_obscount to make it globally unique when joining with other tables
Aaron Marcuse-Kubitza
05:43 PM Revision 6172: inputs/VegBank/vegbank.~.clean_up.sql: Rename taxoninterpretation.party_id to taxoninterpretation_party_id to make it globally unique when joining taxoninterpretation to other tables
Aaron Marcuse-Kubitza
05:35 PM Revision 6171: mappings/VegCore-VegBIEN.csv: Made taxonoccurrence.verbatimcollectorname an fkey to party, and renamed it to collector_id
Aaron Marcuse-Kubitza
05:32 PM Revision 6170: input.Makefile: Existing maps discovery: $(allTables): Fixed bug where need to remove extra whitespace before $(tables) when there are no $(joinedTables)
Aaron Marcuse-Kubitza
05:32 PM Revision 6169: lib/mappings.Makefile: Checking if $(termsSubdirs) defined: Fixed bug where can't use ifndef because that checks if the variable is *empty*, not undefined. Need to use `ifeq ($(origin var),undefined)` instead.
Aaron Marcuse-Kubitza
05:11 PM Revision 6168: inputs/TEAM/V*/map.csv: Omit *Method, because it just contains "Derived" for a small fraction of the rows
Aaron Marcuse-Kubitza
04:47 PM Revision 6167: inputs/SALVIAS/: Updated to new salvias_plots export on nimoy, which has a different schema
Aaron Marcuse-Kubitza
04:03 PM Revision 6166: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Moved "Ensure globally unique column names" to end to match VegBank order
Aaron Marcuse-Kubitza
03:54 PM Revision 6165: my2pg: *int types: Added mediumint
Aaron Marcuse-Kubitza
03:30 PM Revision 6164: Placed inputs/SALVIAS/_archive/ under version control
Aaron Marcuse-Kubitza
03:18 PM Revision 6163: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Remove private data that should not be publicly visible, indicated by plotMetadata.AccessCode = 1
Aaron Marcuse-Kubitza
03:17 PM Revision 6162: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Enable cascading deletes by adding the necessary fkeys
Aaron Marcuse-Kubitza
03:17 PM Revision 6161: Added inputs/SALVIAS/_src/salvias_data_access_controls.txt
Aaron Marcuse-Kubitza
02:26 PM Revision 6160: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
02:25 PM Revision 6159: inputs/.geoscrub/import_order.txt: Fixed bug where geoscrub_cultivated needs to be installed *after* geoscrub_cleaned_unique, not before as it would be with the default alphabetical sort order
Aaron Marcuse-Kubitza
02:24 PM Revision 6158: inputs/.geoscrub/geoscrub_cultivated/: Use _no_import file to exclude geoscrub_cultivated from the import, because it's used directly as a lookup table by analytical_stem rather than being imported. This ensures that there is no import log or input row count for geoscrub_cultivated in the import times, which would skew the import row count because the row count would be included even though no columns are mapped.
Aaron Marcuse-Kubitza
02:18 PM Revision 6157: input.Makefile: $(tables): Fixed bug where need to use $(importTables) instead of $(tables) in all places that should use only imported tables, rather than just in the import process itself
Aaron Marcuse-Kubitza
02:13 PM Revision 6156: input.Makefile: Import to VegBIEN: Added support for tables which should be installed but not imported, but which must be installed *after* tables which are imported rather than before. This currently applies to geoscrub.geoscrub_cultivated, which depends on geoscrub_cleaned_unique (and therefore must be installed after it), but which should not be imported because it's used directly as a lookup table by analytical_stem.
Aaron Marcuse-Kubitza
10:02 AM Revision 6155: inputs/VegBank/vegbank.~.clean_up.sql: Documented that plots with confidentialitystatus >= 4 are not deleted if their embargos have already expired. This applies to the Shenandoah NP data, which has confidentialitystatus = 5 but is no longer embargoed according to the embargo table
Aaron Marcuse-Kubitza

11/13/2012

08:10 PM Revision 6154: inputs/SALVIAS/: Mapped unmapped fields with a VegCore/VegBIEN equivalent. plotMetadata_/: Remapped life_zone to communityID because it is now _alt-ed together with vegetation_*, and thus not just a description with life_zone_code as its globally unique name.
Aaron Marcuse-Kubitza
07:35 PM Revision 6153: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
07:10 PM Revision 6152: schemas/vegbien.sql: referencetype: Added terms from reference.referencetype closed list in VegBank data dictionary. Cited sources in comment.
Aaron Marcuse-Kubitza
06:39 PM Revision 6151: schemas/vegbien.sql: reference.referencetype: Changed type to referencetype enum
Aaron Marcuse-Kubitza
06:38 PM Revision 6150: schemas/vegbien.sql: Added referencetype enum, containing VegBank's values in reference.referencetype as well as values for bien_web.datasource.aggregatorOrPrimary and bien_web.dataSourceNormalized.isHerbarium,isAggregator
Aaron Marcuse-Kubitza
06:23 PM Revision 6149: specimenreplicate: Made institution_id an fkey to referencename instead of party, to later be matched up with reference entries for each aggregator's subprovider
Aaron Marcuse-Kubitza
06:15 PM Revision 6148: schemas/vegbien.sql: referencename: Added referencename_unique unique index on name
Aaron Marcuse-Kubitza
06:00 PM Revision 6147: schemas/vegbien.sql: referencename: Made reference_id optional so it can be populated later when referencenames are scrubbed
Aaron Marcuse-Kubitza
05:58 PM Revision 6146: schemas/vegbien.sql: referencename: Renamed identifier to name because it is specifically any name for the reference, not necessarily an ID
Aaron Marcuse-Kubitza
05:53 PM Revision 6145: schemas/vegbien.sql: Renamed referencealtident to referencename to allow any verbatim reference name to go here, with reference containing the corresponding accepted reference name
Aaron Marcuse-Kubitza
05:50 PM Revision 6144: schemas/vegbien.sql: reference: Added accesslevel, accessconditions from bien_web.datasource
Aaron Marcuse-Kubitza
05:41 PM Revision 6143: schemas/vegbien.sql: address: Added street2 from bien_web.party.address2
Aaron Marcuse-Kubitza
05:38 PM Revision 6142: schemas/vegbien.sql: address: Renamed fields to bien_web.party names
Aaron Marcuse-Kubitza
05:12 PM Revision 6141: schemas/vegbien.sql: party: Added department from bien_web.party
Aaron Marcuse-Kubitza
05:06 PM Revision 6140: inputs/SALVIAS/plotMetadata_/map.csv: Mapped lookup_MethodCode_Description to new observationMeasure
Aaron Marcuse-Kubitza
05:06 PM Revision 6139: schemas/vegbien.sql: method: Made name optional when description or observationmeasure is specified
Aaron Marcuse-Kubitza
05:03 PM Revision 6138: schemas/vegbien.sql: method: method_unique: Include observationmeasure since the method name sometimes is not globally unique (e.g. in SALVIAS)
Aaron Marcuse-Kubitza
04:58 PM Revision 6137: mappings/VegCore-VegBIEN.csv: Mapped observationMeasure
Aaron Marcuse-Kubitza
04:57 PM Revision 6136: mappings/VegCore.csv: observationMeasure: Added source to DwC samplingProtocol
Aaron Marcuse-Kubitza
04:54 PM Revision 6135: mappings/VegCore.csv: Added observationMeasure
Aaron Marcuse-Kubitza
04:40 PM Revision 6134: schemas/vegbien.ERD.mwb: Added family_higher_plant_group
Aaron Marcuse-Kubitza
04:28 PM Revision 6133: schemas/vegbien.sql: Removed VegBank-internal fields starting with d_
Aaron Marcuse-Kubitza
04:19 PM Revision 6132: schemas/vegbien.ERD.mwb: Moved tables so commclass would have more room. Moved revision back to original spot.
Aaron Marcuse-Kubitza
04:07 PM Revision 6131: schemas/filter_ERD.csv: Display referencecontributor->party connection in ERD
Aaron Marcuse-Kubitza
03:56 PM Revision 6130: schemas/vegbien.sql: Removed no longer used table referenceparty
Aaron Marcuse-Kubitza
03:54 PM Revision 6129: schemas/vegbien.sql: referencecontributor: Point to party instead of referenceparty, which duplicates party
Aaron Marcuse-Kubitza
03:51 PM Revision 6128: schemas/vegbien.sql: party: Added new suffix field to party_unique unique index
Aaron Marcuse-Kubitza
03:49 PM Revision 6127: schemas/vegbien.sql: party: Added fields from referenceparty. Note that referenceparty.type is named partytype.
Aaron Marcuse-Kubitza
03:25 PM Revision 6126: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Rename lookup_MethodCode.Description to lookup_MethodCode_Description to make it globally unique when joined with plotMetadata
Aaron Marcuse-Kubitza
03:24 PM Revision 6125: input.Makefile: SVN: $(svnFilesGlob): Added root-level .sql files containing ~, which run additional commands after the original data is imported
Aaron Marcuse-Kubitza
03:22 PM Revision 6124: inputs/SALVIAS/_MySQL/: Updated svn:ignore from running `make inputs/SALVIAS/add`
Aaron Marcuse-Kubitza
02:30 PM Revision 6123: mappings/VegCore-VegBIEN.csv: matched place's coordinates: Fixed bug where coordinates entry itself needed to have its datasource (reference) set to geoscrub, in addition to the place entry that uses it, in order to match up properly with geoscrub's corresponding input place (whose coordinates as well as place are owned by the geoscrub datasource)
Aaron Marcuse-Kubitza
02:22 PM Revision 6122: mappings/VegCore-VegBIEN.csv: matched place's coordinates: Fixed bug where coordinates mappings with and without matched_place_id=0 need to sort together in order to be merged, by prepending ".," to the place attrs list
Aaron Marcuse-Kubitza
02:22 PM Revision 6121: inputs/VegBank/plot_/test.xml.ref: Updated inserted row count
Aaron Marcuse-Kubitza
12:00 PM Revision 6120: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/09/2012

08:30 PM Revision 6119: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
08:20 PM Revision 6118: inputs/Makefile: Input data: $(rsyncLogs): Also include logs from the datasource's top-level logs/ dir, which contains make_analytical_db.log.sql
Aaron Marcuse-Kubitza
08:09 PM Revision 6117: inputs/VegBank/vegbank.~.clean_up.sql: Remove still-embargoed plots
Aaron Marcuse-Kubitza
08:07 PM Revision 6116: inputs/VegBank/vegbank.~.clean_up.sql: Enable cascading deletes by setting all foreign keys to ON DELETE CASCADE
Aaron Marcuse-Kubitza
07:49 PM Revision 6115: Added inputs/VegBank/_src/vegbank.schema.sql.make and vegbank.schema.sql
Aaron Marcuse-Kubitza
07:48 PM Revision 6114: input.Makefile: Staging tables installation: sql/install: Use new pg_dump_limit to remove security and schema-setting commands
Aaron Marcuse-Kubitza
07:46 PM Revision 6113: Added pg_dump_limit to filter a PostgreSQL DB dump to remove security and schema-setting commands
Aaron Marcuse-Kubitza
06:37 PM Revision 6112: inputs/.geoscrub/geoscrub_cleaned_unique/create.sql: Removed no longer needed index on latitudeDecimalVerbatim, longitudeDecimalVerbatim, which is now on geoscrub_cultivated instead
Aaron Marcuse-Kubitza
06:32 PM Revision 6111: schemas/vegbien.sql: analytical_stem_view: Fixed bug where needed to join on new geoscrub_cultivated, not geoscrub, for all geoscrub-related information. geoscrub contains many duplicate records, causing one input row to match many rows in geoscrub, when there should only be one entry for each coordinate pair.
Aaron Marcuse-Kubitza
06:26 PM Revision 6110: Added inputs/.geoscrub/geoscrub_cultivated/
Aaron Marcuse-Kubitza
06:04 PM Revision 6109: inputs/.geoscrub/geoscrub_cleaned_unique/create.sql: Added index on latitudeDecimalVerbatim, longitudeDecimalVerbatim for use by analytical_stem_view
Aaron Marcuse-Kubitza
05:34 PM Revision 6108: inputs/newWorld/geoscrub.schema.~.changes.sql: Change countryNameStd type to text to allow merge-joining with place.country in analytical_stem_view
Aaron Marcuse-Kubitza
05:28 PM Revision 6107: inputs/newWorld/geoscrub.schema.~.changes.sql: ALTER TABLE ... ALTER COLUMN statement: Reformatted to allow adding additional ALTER COLUMN clauses
Aaron Marcuse-Kubitza
05:25 PM Revision 6106: inputs/.geoscrub/geoscrub_cleaned_unique/create.sql: Change latitudeDecimalVerbatim, longitudeDecimalVerbatim types to double precision to allow merge-joining with coordinates.latitude_deg, longitude_deg in analytical_stem_view
Aaron Marcuse-Kubitza
05:12 PM Revision 6105: README.TXT: Data import: Instead of using `make schemas/rotate` and then renaming the public schema to the correct name, just rename directly to the correct name using `make schemas/rename/...`. Use new import_name to determine the import name instead of manually finding the date in the first datasource's log file name.
Aaron Marcuse-Kubitza
05:06 PM Revision 6104: Added import_name, which gets the name of an import based on its log file names
Aaron Marcuse-Kubitza
04:50 PM Revision 6103: README.TXT: Data import: Moved the check that imports were successful to before running make_analytical_db
Aaron Marcuse-Kubitza
04:41 PM Revision 6102: root Makefile: Installation: Fixed bug where schemas/install needed to happen *after* inputs/install because some of the public schema's views now depend on inputs
Aaron Marcuse-Kubitza
04:07 PM Revision 6101: schemas/vegbien.sql: analytical_stem_view: cultivatedBasis: Concatenate ''::text to geoscrub.isCultivatedReason so it will be cast to a text field both on PostgreSQL 9.1.1 (local machine), which removes any explicit cast to text when creating the view, and 9.1.6 (vegbiendev), which requires an explicit cast to text
Aaron Marcuse-Kubitza
03:49 PM Revision 6100: schemas/vegbien.sql: analytical_stem_view: cultivatedBasis: Use geoscrub.isCultivatedReason instead when geoscrub.isCultivated is used as the source for cultivated
Aaron Marcuse-Kubitza
12:53 PM Revision 6099: schemas/vegbien.sql: analytical_stem_view: Use geoscrub.isCultivated when taxonoccurrence.iscultivated is not provided (joining to geoscrub on the coordinates)
Aaron Marcuse-Kubitza

11/08/2012

06:38 PM Revision 6098: root Makefile: VegBIEN DB: Schemas: Run all schema installs and uninstalls using no_search_path=1, so that the schemas in the automatic search_path are not required for the command to run
Aaron Marcuse-Kubitza
06:37 PM Revision 6097: psql_vegbien: Added $no_search_path option to turn off the automatic SET search_path directive
Aaron Marcuse-Kubitza
06:11 PM Revision 6096: schemas/vegbien.sql: taxonverbatim: Added growthform, for a growth form based on the taxon name rather than one provided with the input data's taxonoccurrence
Aaron Marcuse-Kubitza
06:00 PM Revision 6095: schemas/vegbien.ERD.mwb: Fixed lines
Aaron Marcuse-Kubitza
05:47 PM Revision 6094: inputs/SALVIAS/plotMetadata/: LEFT JOINed with lookup_MethodCode to create plotMetadata_
Aaron Marcuse-Kubitza
04:52 PM Revision 6093: schemas/vegbien.sql: threatened_taxonlabel_view: Fixed bug where needed DISTINCT on resulting taxonlabel_id because some descendants apparently appear in multiple threatened taxonlabels' subtrees
Aaron Marcuse-Kubitza
04:42 PM Revision 6092: schemas/vegbien.sql: analytical_*: Added threatened, using new threatened_taxonlabel lookup table
Aaron Marcuse-Kubitza
04:12 PM Revision 6091: schemas/vegbien.sql: reference_by_shortname(): Fixed bug where need to use $-syntax to reference params in sql-language functions
Aaron Marcuse-Kubitza
04:07 PM Revision 6090: schemas/vegbien.sql: threatened_taxonlabel_view: Use new reference_by_shortname()
Aaron Marcuse-Kubitza
03:45 PM Revision 6089: root Makefile: VegBIEN DB: Schemas: public: schemas/public/uninstall: Fixed bug where need to run psql_vegbien without public in the search_path because it may have already been deleted
Aaron Marcuse-Kubitza
03:44 PM Revision 6088: root Makefile: VegBIEN DB: Schemas: public: schemas/public/install: Fixed bug where need to run psql_vegbien without public in the search_path because it doesn't exist, by setting public to the empty string (deleting it)
Aaron Marcuse-Kubitza
03:42 PM Revision 6087: vegbien_dest: $schemas: Don't include the , before $public if it has been set to the empty string (deleted)
Aaron Marcuse-Kubitza
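The comma handling described in revision 6087 can be illustrated with a minimal Python sketch. The real logic lives in the vegbien_dest shell script, which is not shown here; the function name `build_search_path` and the schema names in the usage note are hypothetical. The sketch only shows the idea of joining schema names while skipping any that were set to the empty string, so no stray comma is emitted.

```python
def build_search_path(schemas):
    """Join schema names with commas, skipping empty (deleted) entries.

    Hypothetical helper: an empty string stands for a schema that was
    removed from the list, such as $public set to the empty string.
    """
    return ",".join(name for name in schemas if name)
```

For example, `build_search_path(["temp", "", "functions"])` yields a path with no leading, trailing, or doubled comma.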
03:27 PM Revision 6086: schemas/vegbien.sql: Added reference_by_shortname(). Using this function instead of the manual query should force the query planner to evaluate this expression first, rather than complexly reordering joins to place this nested select as a filter condition.
Aaron Marcuse-Kubitza
03:00 PM Revision 6085: schemas/vegbien.sql: Added threatened_taxonlabel derived table with generating view threatened_taxonlabel_view
Aaron Marcuse-Kubitza
02:48 PM Revision 6084: Updated inputs/UNCC/Specimen/test.xml.ref inserted row count
Aaron Marcuse-Kubitza
01:38 PM Revision 6083: mappings/VegCore.csv: Added threatened
Aaron Marcuse-Kubitza
01:21 PM Revision 6082: inputs/VegBank/vegbank.~.clean_up.sql: Remove private columns (plot.reallatitude, reallongitude) that should not be publicly visible
Aaron Marcuse-Kubitza
01:13 PM Revision 6081: inputs/CVS/Organism/map.csv: Removed now-dropped realLatitude, realLongitude
Aaron Marcuse-Kubitza
01:12 PM Revision 6080: inputs/CVS/Organism/map.csv: Removed now-dropped realLatitude, realLongitude
Aaron Marcuse-Kubitza
01:12 PM Revision 6079: Added inputs/CVS/Organism/postprocess.sql to drop private realLatitude, realLongitude columns
Aaron Marcuse-Kubitza
01:10 PM Revision 6078: input.Makefile: Staging tables installation: Added back postprocess.sql, which is now used for one-time dropping of private columns that should not be publicly visible
Aaron Marcuse-Kubitza
12:47 PM Revision 6077: input.Makefile: Maps building: %/.map.csv.last_cleanup: $(dict) canon/translate: Use new $(translate?)
Aaron Marcuse-Kubitza
12:45 PM Revision 6076: input.Makefile: Maps building: %/.map.csv.last_cleanup: Added $(srcDict) as a prerequisite, so that .last_cleanup will be re-run if it changes. Added empty $(srcDict) target in case it doesn't exist.
Aaron Marcuse-Kubitza
12:39 PM Revision 6075: inputs/bien_web/observation/map.csv: Omit *_index because they are placeholder columns created by the MySQL to PostgreSQL translation
Aaron Marcuse-Kubitza
12:37 PM Revision 6074: input.Makefile: Maps building: %/.map.csv.last_cleanup: Fixed bug where can only canon/translate using $(srcDict) if it exists for the datasource
Aaron Marcuse-Kubitza
12:26 PM Revision 6073: inputs/bien_web/observation/: Regenerated from actual bien_web.observation schema on nimoy, which has additional columns
Aaron Marcuse-Kubitza
12:24 PM Revision 6072: input.Makefile: SVN: $(svnFilesGlob): Added top-level map.csv, which can be used to apply a datasource-global data dictionary to all tables
Aaron Marcuse-Kubitza
12:18 PM Revision 6071: input.Makefile: Maps building: %/.map.csv.last_cleanup: Also apply any map.csv at the top level of the datasource directory. This can be used to apply a datasource-global data dictionary to all tables.
Aaron Marcuse-Kubitza
12:01 PM Revision 6070: my2pg: Also remove column comments. Note that these cannot be translated by sed, because PostgreSQL only allows setting column comments in a separate statement, not inline with the column's entry in the CREATE TABLE statement, and sed can only make replacements contiguous with the input line.
Aaron Marcuse-Kubitza
11:28 AM Revision 6069: mappings/VegCore.csv: Removed incorrect duplicate entry for verbatimSubgenus
Aaron Marcuse-Kubitza
10:58 AM Revision 6068: schemas/vegbien.sql: _taxon_family_require_std(): Fixed bug where name needed to be lowercased before checking if it ended in -aceae, to support family names that are uppercase. Note that this resulted in the family not being prepended to the TNRS input name for datasources with uppercase family names, so the next DB import will likely produce a number of unscrubbed TNRS input names which now have the uppercase family prepended.
Aaron Marcuse-Kubitza
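The fix in revision 6068 can be illustrated with a minimal Python sketch. The actual _taxon_family_require_std() is an SQL function in schemas/vegbien.sql whose body is not shown here; the function name `is_family_name` is hypothetical and only demonstrates lowercasing the name before the suffix check.

```python
def is_family_name(name: str) -> bool:
    """Return True if the name ends in the family suffix -aceae.

    The name is lowercased first so that uppercase family names
    (as supplied by some datasources) are also recognized.
    """
    return name.lower().endswith("aceae")
```

Without the `lower()` call, an all-uppercase name such as "POACEAE" would fail the check, which is the behavior the commit message describes as the bug.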
10:17 AM Revision 6067: inputs/.TNRS/tnrs/tnrs.make: Fixed bug where need to reference the log file path relative to the make script itself, because otherwise the log file would go in inputs/.TNRS/logs/tnrs.make.log.sql
Aaron Marcuse-Kubitza
10:07 AM Revision 6066: inputs/.TNRS/tnrs/tnrs.make: Fixed bug where need to use just the basename of $0 as the log file name
Aaron Marcuse-Kubitza
09:51 AM Revision 6065: Added inputs/IUCN/
Aaron Marcuse-Kubitza
09:51 AM Revision 6064: input.Makefile: SVN: add: Added _src/ (when it exists). $(_svnFilesGlob): Added .url, .pdf files in _src/.
Aaron Marcuse-Kubitza
07:47 AM Revision 6063: psql_vegbien: Use bash because it supports substitutions
Aaron Marcuse-Kubitza
07:46 AM Revision 6062: psql_vegbien: Set the search_path to $out_schemas set by vegbien_dest. This will enable running any psql_vegbien script on a schema other than public.
Aaron Marcuse-Kubitza
07:07 AM Revision 6061: schemas/vegbien.sql: analytical_stem_view: Changed inner joins on non-datasource taxonlabels to LEFT JOINs, to ensure that an entry is included in the analytical DB even if there was no matched taxonlabel. In theory, this shouldn't be necessary, because every taxonlabel should have a canonical taxonlabel since canon_label_id is auto-populated to the taxonlabel_id if there is no matched_label_id; there should be a taxonverbatim for every datasource and accepted taxonlabel because datasources link to taxonlabel via taxonverbatim and TNRS populates a taxonverbatim for every accepted taxonlabel; and there should be a parsed taxonlabel for every datasource taxonlabel because the mappings populate it.
Aaron Marcuse-Kubitza
06:56 AM Revision 6060: schemas/vegbien.sql: analytical_stem_view: Removed join on specimenreplicate, because it isn't used in the analytical DB. Each specimen will still get an entry in analytical_*, because it gets its own location.
Aaron Marcuse-Kubitza
06:45 AM Revision 6059: README.TXT: Data import: Before starting import, added step to run `make inputs/upload` and reinstall newly-uploaded datasources
Aaron Marcuse-Kubitza
03:56 AM Task #346: add georeferencing support to schema
Georeferencing information can be stored in the geovalidation place entry, which the original place is linked to via ...
Aaron Marcuse-Kubitza
03:42 AM Task #292: VegBank metadata query mechanism
Added possible strategy
Aaron Marcuse-Kubitza
03:32 AM Task #343 (Resolved): integrate TNRS into VegBIEN
Aaron Marcuse-Kubitza
03:32 AM Task #308 (Resolved): do a direct transfer of some public data from VegBank
Core fields in VegBank have been mapped. A recent full export of the live VegBank DB is used as the input.
Aaron Marcuse-Kubitza
03:31 AM Task #335 (Rejected): Look into using Sybase Powerbuilder or IBM Enterprise Vision to map data
We are using map spreadsheets and auto-mapping instead, which work well so far and would not be easy to translate to ...
Aaron Marcuse-Kubitza
03:30 AM Task #312: Finish importing SALVIAS data
Still need to map methods, probably using LEFT JOIN
Aaron Marcuse-Kubitza
03:27 AM Task #365 (Rejected): retrieve taxonomic hierarchy in analytical layer by using dynamic queries to external sources
We're using a fixed version of the NCBI tree of life instead
Aaron Marcuse-Kubitza
03:26 AM Task #424: Finish translating XML functions to SQL functions for column-based import
Most translated; still left:
* _map
* _range*
* _avg
* _compass
* a few others
Aaron Marcuse-Kubitza
03:24 AM Task #440: aggregating validations of imports
Queries work for current schema
Aaron Marcuse-Kubitza
03:22 AM Task #454 (Resolved): update summarizing queries for current schema
@make inputs/SALVIAS/verify/@ and @make inputs/NY/verify/@ work again
Aaron Marcuse-Kubitza
03:22 AM Revision 6058: README.TXT: Schema changes: Remember to update the following files with any renamings: Added mappings/verify.*.sql
Aaron Marcuse-Kubitza
03:16 AM Task #476 (Rejected): develop map spreadsheet -> header override file translation utility
Shouldn't do this because it would prevent map spreadsheets from having multiple output locations for the same input ...
Aaron Marcuse-Kubitza
03:11 AM Task #495 (Resolved): add separate datasource table rather than using party for this
Now using @reference@ for this
Aaron Marcuse-Kubitza
03:07 AM Task #521 (Resolved): make place* tables use a structure similar to taxonconcept
Aaron Marcuse-Kubitza
02:21 AM Revision 6057: README.TXT: Data import: make_analytical_db: Documented how to view progress in log file
Aaron Marcuse-Kubitza
02:18 AM Revision 6056: make_analytical_db: Run all commands synchronously so the log file output doesn't become jumbled
Aaron Marcuse-Kubitza
02:16 AM Revision 6055: make_analytical_db: Fixed bug where log file needed to be appended to instead of overwritten
Aaron Marcuse-Kubitza
02:15 AM Revision 6054: make_analytical_db: Wrap each individual command in `set -x` to avoid echoing low-level commands such as sleep, wait
Aaron Marcuse-Kubitza
02:02 AM Revision 6053: make_analytical_db: Moved log file to inputs/analytical_db/logs/make_analytical_db.log.sql so it would be synced along with the other import logs
Aaron Marcuse-Kubitza
01:57 AM Revision 6052: inputs/.TNRS/tnrs/tnrs.make: Output the time at which it's run, so this is included in the log file
Aaron Marcuse-Kubitza
01:53 AM Revision 6051: inputs/.TNRS/tnrs/tnrs.make: Moved log file to logs/tnrs.make.log.sql so it would automatically be synced along with the other import logs
Aaron Marcuse-Kubitza
01:49 AM Revision 6050: make_analytical_db: Moved log file to inputs/analytical_db/logs/make_analytical_db.log.sql so it would be synced along with the other import logs
Aaron Marcuse-Kubitza
01:40 AM Revision 6049: inputs/Makefile: Import logs: $(rsyncLogs): Always download the TNRS daemon log, rather than requiring tnrs_log=1 to be specified to download it
Aaron Marcuse-Kubitza
01:37 AM Revision 6048: make_analytical_db: Output the time at which it's run, so this is included in the log file
Aaron Marcuse-Kubitza
01:36 AM Revision 6047: make_analytical_db: Store output in schemas/make_analytical_db.log
Aaron Marcuse-Kubitza
01:24 AM Revision 6046: schemas/vegbien.sql: Removed no longer used make_analytical_db(). Use bin/make_analytical_db instead.
Aaron Marcuse-Kubitza
01:23 AM Revision 6045: make_analytical_db: Use new psql_verbose_vegbien
Aaron Marcuse-Kubitza
01:22 AM Revision 6044: Added psql_verbose_vegbien
Aaron Marcuse-Kubitza
01:18 AM Revision 6043: make_analytical_db: Use psql_script_vegbien, which contains the necessary psql options, instead of setting those options manually
Aaron Marcuse-Kubitza
01:15 AM Revision 6042: make_analytical_db: Run the SQL commands directly with psql instead of using the make_analytical_db() function. This provides incremental results and avoids running all commands in one transaction, thus preventing pgAdmin from freezing when the user attempts to access a table used in analytical DB creation (because the TRUNCATE statement fully locks the table until the entire analytical DB is built).
Aaron Marcuse-Kubitza
12:46 AM Revision 6041: schemas/vegbien.sql: make_analytical_db(): Added make_family_higher_plant_group()
Aaron Marcuse-Kubitza
12:17 AM Revision 6040: inputs/import.stats.xls: Updated import times. Fixed input row counts and import times to include derived data, such as TNRS and geoscrub, which adds to the import time and therefore should be considered in the import's speed. (TNRS was already being included in the import time for some, but not all, imports.)
Aaron Marcuse-Kubitza
 
