Activity

From 11/24/2012 to 12/23/2012

12/21/2012

03:34 PM Revision 7023: import_all: Allow caller to override $dump_opts
Aaron Marcuse-Kubitza
03:33 PM Revision 7022: pg_dump_vegbien: Renamed $opts env var to $dump_opts to avoid conflicting with other commands' vars of the same name
Aaron Marcuse-Kubitza
03:22 PM Revision 7021: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
03:20 PM Revision 7020: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
02:14 PM Revision 7019: schemas/vegbien.sql: location: Removed location_unique_within_parent_by_sourceaccessioncode, which duplicates location_unique_within_creator_by_sourceaccessioncode because the sourceaccessioncode is globally unique
Aaron Marcuse-Kubitza
02:10 PM Revision 7018: schemas/vegbien.sql: analytical_stem_view: projectID: Use project.projectname if project.sourceaccessioncode isn't provided
Aaron Marcuse-Kubitza
02:02 PM Revision 7017: schemas/vegbien.sql: location: location_unique_within_parent: Split into *_by_sourceaccessioncode and *_by_authorlocationcode_position, with each ID being matched separately. This way, if the initial import of a subplot's location provides both fields, but fkey references use only one field, the fkey references will still match the existing location because only one of the fields needs to match.
Aaron Marcuse-Kubitza
01:26 PM Revision 7016: schemas/vegbien.sql: analytical_stem_view: elevationInMeters: Use parent_location.elevation_m when location.elevation_m not provided
Aaron Marcuse-Kubitza
01:17 PM Revision 7015: schemas/vegbien.sql: analytical_stem_view: scientificName: Fixed bug where need to use accepted_taxon*label*.taxonomicname instead of accepted_taxonverbatim.taxonomicname, because taxonverbatim's name component fields aren't populated if the name doesn't match a scrubbed name. The datasource's own taxonverbatim can't be used for this because the canon_label_id refers to the concatenated taxonomic name owned by the TNRS datasource.
Aaron Marcuse-Kubitza
01:00 PM Revision 7014: inputs/NVS/Plot/map.csv: Corrected Plot ID mapping to go to subplotID instead of locationID, because each subplot gets its own ID in this field
Aaron Marcuse-Kubitza
12:50 PM Revision 7013: schemas/vegbien.sql: location: location_unique_within_parent: Also apply this constraint when sourceaccessioncode is provided, because it may be a concatenated value populated for use by the analytical DB but which is not used as an fkey by the datasource itself
Aaron Marcuse-Kubitza
12:30 PM Revision 7012: schemas/vegbien.sql: analytical_stem_view: locationID: Concatenate parent location's and subplot's IDs using '; ' instead of ' '
Aaron Marcuse-Kubitza
12:22 PM Revision 7011: schemas/vegbien.sql: analytical_*: Renamed locationName to locationID because it's now globally unique (within the datasource) and can be used as a sourceaccessioncode
Aaron Marcuse-Kubitza
12:19 PM Revision 7010: schemas/vegbien.sql: analytical_stem_view: locationName: For subplots without their own sourceaccessioncode (globally unique ID), prepend the parent location's unique ID so that locationName is globally unique
Aaron Marcuse-Kubitza
12:07 PM Revision 7009: mappings/VegCore-VegBIEN.csv: locationID/locationName + subplot -> location.sourceaccessioncode mapping: Fixed bug where subplot was incorrectly being mapped to this field even when there was no location*. (This field can only be populated if both location* *and* subplot are specified.) Also only map locationID for this, to avoid inconsistencies where one table supplies locationID+subplot, while another table supplies locationName+subplot, but they both get mapped to the same field, preventing plots from being matched up with their observations when creating the analytical_stem.
Aaron Marcuse-Kubitza
11:31 AM Revision 7008: xml_func.py: Simplifying functions: Logic: _and(), _or(): Evaluate an expression of only constant values
Aaron Marcuse-Kubitza
11:30 AM Revision 7007: lists.py: Added and_(), or_()
Aaron Marcuse-Kubitza
11:28 AM Revision 7006: xml_func.py: is_scalar(): Fixed bug where need to check if value is a string before calling is_var_name()
Aaron Marcuse-Kubitza
10:15 AM Revision 7005: inputs/NVS/StemObservation/map.csv: Remapped Verbatim Code to authorTaxonCode, because as it's used this is actually an identifier for the taxon, not the stem, despite Nick Spencer's revised mapping
Aaron Marcuse-Kubitza
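Revisions 7007 and 7008 above add and_() and or_() to lists.py and use them so that _and() and _or() expressions made up only of constant values can be evaluated during simplification. The log does not show the implementation, so the following is only a minimal Python sketch of the idea. The function bodies and the fold_constant_logic() helper are assumptions for illustration, not the project's code.

```python
def and_(values):
    """Return True if every value is truthy (True for an empty list)."""
    return all(values)


def or_(values):
    """Return True if any value is truthy (False for an empty list)."""
    return any(values)


def fold_constant_logic(op, args):
    """Evaluate op over args if all args are constants, else return None.

    Hypothetical helper: constants are modeled as bool values, and anything
    else (e.g. a variable name string) blocks simplification.
    """
    if not all(isinstance(arg, bool) for arg in args):
        return None  # contains a non-constant, so leave the expression as-is
    funcs = {'_and': and_, '_or': or_}
    return funcs[op](args)
```

Under these assumptions, fold_constant_logic('_or', [False, True]) evaluates to a constant, while fold_constant_logic('_and', [True, 'x']) returns None because one argument is not a constant.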

12/20/2012

05:21 PM Revision 7004: schemas/vegbien.ERD.mwb: Regenerated exports
Aaron Marcuse-Kubitza
05:21 PM Revision 7003: README.TXT: Schema changes: Update graphical ERD exports: Added step to commit changes
Aaron Marcuse-Kubitza
05:02 PM Revision 7002: inputs/NVS/*/map.csv: Remapped with Nick Spencer's suggested changes
Aaron Marcuse-Kubitza
04:41 PM Revision 7001: xml_func.py: _first(): Fixed bug where need to choose the first *non-empty* param, by first pruning empty child nodes
Aaron Marcuse-Kubitza
04:38 PM Revision 7000: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only using authorTaxonCode if there is no plant ID: Added individualID, stemID to the terms that cause authorTaxonCode not to be mapped to VegBIEN authortaxoncode
Aaron Marcuse-Kubitza
04:03 PM Revision 6999: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only using authorTaxonCode if there is no plant ID: Added individualID, stemID to the terms that cause authorTaxonCode not to be mapped to VegBIEN authortaxoncode
Aaron Marcuse-Kubitza
03:59 PM Revision 6998: schemas/vegbien.sql: analytical_*: Renamed individualID to individualObservationID because this actually corresponds to plantobservation.sourceaccessioncode, which is an observation *of* an individual
Aaron Marcuse-Kubitza
03:56 PM Revision 6997: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
03:53 PM Revision 6996: README.TXT: Data import: Recording the import times: Changed <version> back to $version because these commands are actually run on vegbiendev, where $version is set. (Modifications to import.stats.xls would be made on your local machine.)
Aaron Marcuse-Kubitza
03:50 PM Revision 6995: README.TXT: Data import: Added step to unset $version before starting the import, to avoid importing on top of the last import's data
Aaron Marcuse-Kubitza
02:47 PM Revision 6994: README.TXT: Data import: Replaced $version with <version> where it needs to be manually filled in
Aaron Marcuse-Kubitza
02:40 PM Revision 6993: README.TXT: Data import: On nimoy: Added command to set $version
Aaron Marcuse-Kubitza
02:26 PM Revision 6992: mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only use authorTaxonCode if there is no plant ID, because an individual plant gets its own taxonoccurrence and thus needs the taxonoccurrence's IDs to be unique to the plant, regardless of what the author designates as the taxonoccurrence code
Aaron Marcuse-Kubitza
01:47 PM Revision 6991: Generated inputs/NVS/new_terms.csv
Aaron Marcuse-Kubitza
01:47 PM Revision 6990: input.Makefile: SVN: $(svnFilesGlob): Also match *terms.csv in top-level dir
Aaron Marcuse-Kubitza
01:23 PM Revision 6989: mappings/VegCore-VegBIEN.csv: Mapped authorTaxonCode
Aaron Marcuse-Kubitza
01:12 PM Revision 6988: mappings/VegCore.csv: Regenerated from wiki
Aaron Marcuse-Kubitza
01:12 PM Revision 6987: README.TXT: Maintenance: VegCore data dictionary: Added step to commit updated mappings/VegCore.csv
Aaron Marcuse-Kubitza
12:13 PM Revision 6986: schemas/Makefile: %/publish: Fixed bug where commands were not being run transactionally, because --single-transaction requires `--file -` to work properly
Aaron Marcuse-Kubitza
11:36 AM Revision 6985: input.Makefile: Editing import: Removed rotate because appending the current svn revision doesn't make sense, since this is not related to the revision used to import the datasource
Aaron Marcuse-Kubitza
11:34 AM Revision 6984: input.Makefile: Editing import: Added rename/% and use it in rotate
Aaron Marcuse-Kubitza
11:21 AM Revision 6983: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
11:21 AM Revision 6982: schemas/Makefile: Use $* instead of $(@D) for clarity. $(@D) is only needed when the dir part of the target includes a prefix in addition to the % stem.
Aaron Marcuse-Kubitza
10:45 AM Revision 6981: make_analytical_db: Automatically call export_analytical_db when finished
Aaron Marcuse-Kubitza
10:35 AM Revision 6980: schemas/vegbien.sql: make_family_higher_plant_group(): Added `taxonepithet IS NOT NULL` filter, to allow make_analytical_db to proceed even when the NCBI import fails (leaving some nodes with rank = 'family' but no associated taxonepithet). The most recent NCBI import failed due to the search_path/DuplicateException bug resulting from the import schema and public being in the search_path together.
Aaron Marcuse-Kubitza
10:14 AM Revision 6979: schemas/Makefile: Fixed bug where need `SHELL := /bin/bash` for $(confirmRmPublicSchema) to work correctly
Aaron Marcuse-Kubitza
10:12 AM Revision 6978: lib/common.Makefile: $(confirm): Added comment that this requires `SHELL := /bin/bash` to work correctly
Aaron Marcuse-Kubitza
10:09 AM Revision 6977: import_all: after_import(): Added `make backups/vegbien.$version.backup/test`
Aaron Marcuse-Kubitza
10:05 AM Revision 6976: sql.py: DbConn._db(): search_path: Don't append the existing search_path, because it usually includes the public schema, which is now different from the schema being imported into. This fixes a bug where sql.function_exists() would find public-schema functions in both the public schema and the import's schema because both were in the search_path, causing a DuplicateException "more than one function named ...". Note that the elements of the existing search_path are no longer needed now that vegbien_dest's $schemas includes $public. Also note that if an instance of DbConn does not specify the schemas param, the existing search_path will be left as-is rather than overwritten with an empty list.
Aaron Marcuse-Kubitza
09:54 AM Revision 6975: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Added step to determine the import date using import_date
Aaron Marcuse-Kubitza
09:52 AM Revision 6974: import_date: Added note that Mac and Linux differ in the order they sort the logs in
Aaron Marcuse-Kubitza
09:50 AM Revision 6973: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Updated pattern for new log filename format
Aaron Marcuse-Kubitza
09:47 AM Revision 6972: README.TXT: Data import: recording the import times in inputs/import.stats.xls: Removed extra ./ before bin/import_times
Aaron Marcuse-Kubitza
09:46 AM Revision 6971: import_date: Added note that the time this outputs is the time the first special input *finished* importing. The import itself generally starts a few minutes before that, and the exact time is in that import's public schema comment.
Aaron Marcuse-Kubitza
09:41 AM Revision 6970: import_date: Removed duplicate Usage message at top of file, which is repeated in the Usage message provided when the program is run with no arguments
Aaron Marcuse-Kubitza
09:40 AM Revision 6969: Added import_date
Aaron Marcuse-Kubitza
09:38 AM Revision 6968: Added mtime
Aaron Marcuse-Kubitza
09:29 AM Revision 6967: lib/common.Makefile: System: Added $(mtime)
Aaron Marcuse-Kubitza
09:27 AM Revision 6966: lib/common.Makefile: $(date): Factored date format out into $(dateFmt)
Aaron Marcuse-Kubitza
09:25 AM Revision 6965: backups/Makefile: Factored $(isMac) out into lib/common.Makefile
Aaron Marcuse-Kubitza
08:30 AM Revision 6964: README.TXT: Data import: tailing logs: Updated pattern for new log filename format
Aaron Marcuse-Kubitza
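Revision 6976 above describes a lookup failure caused by two schemas on the search_path both defining the same function. The following Python sketch simulates that situation without a database. The catalog dictionary, the make_search_path() helper, and the schema and function names in the usage note are assumptions for illustration; only the function_exists and DuplicateException names and the error message come from the log entry.

```python
class DuplicateException(Exception):
    """Raised when a name resolves to more than one function."""


def function_exists(name, search_path, catalog):
    """Return whether exactly one schema on search_path defines name.

    catalog maps schema name -> set of function names (a stand-in for the
    database catalog). Raises DuplicateException on multiple matches.
    """
    matches = [s for s in search_path if name in catalog.get(s, set())]
    if len(matches) > 1:
        raise DuplicateException('more than one function named ' + name)
    return len(matches) == 1


def make_search_path(schemas, existing, append_existing=False):
    """Build the search_path for a connection.

    With append_existing=False (the behavior the entry describes after the
    fix), the prior search_path is not appended. If no schemas are given,
    the existing search_path is left as-is.
    """
    if not schemas:
        return list(existing)
    path = list(schemas)
    if append_existing:
        path += [s for s in existing if s not in path]
    return path
```

With a catalog such as {'r6976': {'f'}, 'public': {'f'}}, the path built with append_existing=True contains both schemas and function_exists() raises, whereas the path built without the existing entries contains only the import's schema and the lookup succeeds.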

12/19/2012

02:02 PM Revision 6963: schemas/Makefile: Installation: %/publish: Fixed bug where need quotes around source schema name
Aaron Marcuse-Kubitza
01:57 PM Revision 6962: README.TXT: Data import: Moved deletion of previous imports before the import, so that full DB backup can be automated
Aaron Marcuse-Kubitza
01:55 PM Revision 6961: README.TXT: Data import: `make backups/vegbien.$version.backup/test`: Added --exclude-schema=public to leave out the previous (now published) import so it doesn't bloat the backup. Note that public is included in the vegbien.$version.backup for the previous import, named according to its version.
Aaron Marcuse-Kubitza
01:49 PM Revision 6960: import_all: after_import(): Added `make backups/TNRS.backup-remake`
Aaron Marcuse-Kubitza
01:46 PM Revision 6959: README.TXT: Data import: Added step to publish the import to the public schema
Aaron Marcuse-Kubitza
01:42 PM Revision 6958: import_all: after_import(): Added export_analytical_db
Aaron Marcuse-Kubitza
01:36 PM Revision 6957: README.TXT: Data import: bin/export_analytical_db: Removed `env public=$version` because export_analytical_db now uses $version as $public when provided
Aaron Marcuse-Kubitza
01:35 PM Revision 6956: README.TXT: Data import: To remake analytical DB: Removed `env public=...` because $version (which replaces $public) is now set automatically by import_all
Aaron Marcuse-Kubitza
01:32 PM Revision 6955: schemas/Makefile: Installation: py_functions/install: Removed `env public=`, which is not needed since $(psqlAsAdminVegbien) does not use psql_script_vegbien (which uses $public)
Aaron Marcuse-Kubitza
01:28 PM Revision 6954: export_analytical_db: Use vegbien_dest to set the default value for $public
Aaron Marcuse-Kubitza
01:21 PM Revision 6953: README.TXT: Data import: If many inputs have errors: Updated command to `make schemas/$version/uninstall` because the current import's schema is now named $version
Aaron Marcuse-Kubitza
01:15 PM Revision 6952: schemas/Makefile: Installation: $(schemas), $(schemasReversed) (used e.g. by `make schemas/reinstall`): Removed public so that when `make schemas/reinstall` is run before an import, it will not remove any active (published) import which resides in the public schema
Aaron Marcuse-Kubitza
01:10 PM Revision 6951: README.TXT: Schema changes: Reinstall public separately from the other schemas so that it will still be reinstalled when schemas/reinstall excludes the public schema to avoid removing any active (published) import
Aaron Marcuse-Kubitza
01:01 PM Revision 6950: vegbien_dest callers: Removed no longer needed explicit setting $prefix to "", because this is now the default value
Aaron Marcuse-Kubitza
01:00 PM Revision 6949: vegbien_dest: Changed default $prefix to "", so that the majority of callers don't need to manually set $prefix to "" to avoid it defaulting to out_
Aaron Marcuse-Kubitza
12:45 PM Revision 6948: README.TXT: Data import: Use env var $version, which is now set by import_all, instead of manually inserting the version for <version>
Aaron Marcuse-Kubitza
12:40 PM Revision 6947: vegbien_dest: Also export $version
Aaron Marcuse-Kubitza
12:30 PM Revision 6946: import_all: Run the import directly into a new, already-versioned public schema. This removes the need to manually rename the schema after import, and allows the backup commands to use the stored $version shell variable to refer to the last import.
Aaron Marcuse-Kubitza
12:25 PM Revision 6945: schemas/Makefile: %/publish: Added instruction to run `unset version` after the command, to clear the $version shell variable which will be set by import_all
Aaron Marcuse-Kubitza
12:12 PM Revision 6944: README.TXT: Data import: Replaced <import_name> with <version> because the import name is now just the version
Aaron Marcuse-Kubitza
12:10 PM Revision 6943: README.TXT: Data import: Replaced r<revision> with <version> because the version string is now equal to r<revision>
Aaron Marcuse-Kubitza
12:09 PM Revision 6942: README.TXT: Backups: Replaced <date> with <version> because the date is no longer included in the version string
Aaron Marcuse-Kubitza
12:08 PM Revision 6941: README.TXT: Name archived imports without the "public." prefix so that their backups will work with the new `make backups/%.backup/remove` command, which does not add back the prefix
Aaron Marcuse-Kubitza
11:56 AM Revision 6940: backups/Makefile: $(public*): Don't add a "public." prefix to get the name of the public schema
Aaron Marcuse-Kubitza
11:40 AM Revision 6939: backups/Makefile: Removed no longer used $(rmSchema)
Aaron Marcuse-Kubitza
11:39 AM Revision 6938: backups/Makefile: Use $(rmSchemaCmd) from lib/common.Makefile instead of $(rmSchema)
Aaron Marcuse-Kubitza
11:20 AM Revision 6937: vegbien_dest: Use $version as $public when $public not provided. When neither is provided, continue to use "public" and also set $version to that.
Aaron Marcuse-Kubitza
11:11 AM Revision 6936: schemas/Makefile: Installation: rotate: Use just the version, without the "public." prefix
Aaron Marcuse-Kubitza
11:04 AM Revision 6935: schemas/Makefile: Installation: `public/install public%/install`: Generalized to %/install to allow public schema versions with any name. This requires moving `%/install: %.sql` before it to override it.
Aaron Marcuse-Kubitza
11:00 AM Revision 6934: schemas/Makefile: Installation: Merged public/install and public%/install
Aaron Marcuse-Kubitza
10:54 AM Revision 6933: schemas/Makefile: Installation: Moved %/uninstall to beginning of section because it applies to all schemas
Aaron Marcuse-Kubitza
10:52 AM Revision 6932: schemas/Makefile: Installation: public: Generalized public%/publish to %/publish so that public schema versions don't have to start with public_
Aaron Marcuse-Kubitza
10:34 AM Revision 6931: schemas/Makefile: Installation: %/uninstall: Also display schema delete confirmation for schemas whose name is just the version suffix (r<revision #>)
Aaron Marcuse-Kubitza
10:32 AM Revision 6930: schemas/Makefile: Merged public%/uninstall and %/uninstall
Aaron Marcuse-Kubitza
09:49 AM Revision 6929: lib/common.Makefile: Added version target, which prints the current $(version) value
Aaron Marcuse-Kubitza
09:36 AM Revision 6928: schemas/Makefile: Installation: public: public%/uninstall: Fixed bug where need to remove the *specified* version of the public schema, not public itself. Generalized $(confirmRmPublicSchema) so it could also be used for named versions of the public schema. Inlined $(rmPublicSchema) since it's now only used in one place.
Aaron Marcuse-Kubitza
09:26 AM Revision 6927: lib/common.Makefile: Revisions: $(version): Use just the revision # to avoid cluttering the schema and log file names with long datetime strings
Aaron Marcuse-Kubitza
09:25 AM Revision 6926: schemas/Makefile: public%/install: schema comment: Include current date/time after version
Aaron Marcuse-Kubitza
09:20 AM Revision 6925: lib/common.Makefile: Replaced no longer used $(date) with function to generate human-readable text date (rather than date to put in filename). Removed leading zeros from date and hour. Added timezone.
Aaron Marcuse-Kubitza
09:07 AM Revision 6924: backups/Makefile: Removed no longer used $(dateFmt), $(mtime)
Aaron Marcuse-Kubitza
08:59 AM Revision 6923: backups/Makefile: Removed %.backup/rotate, because this incorrectly causes the current time rather than the version to be used in the backup filename. The version should instead be specified in the backup filename when it's created.
Aaron Marcuse-Kubitza
08:51 AM Revision 6922: schemas/Makefile: Installation: public: Added public%/publish to replace the current public schema with the given version
Aaron Marcuse-Kubitza
08:37 AM Revision 6921: schemas/Makefile: Installation: public: public/uninstall: Added public%/uninstall as a target to allow uninstalling versions of the public schema
Aaron Marcuse-Kubitza
08:30 AM Revision 6920: schemas/Makefile: Installation: public: public%/install: Add a comment on the schema containing the versioned schema name, so that if the schema is later renamed to just public (i.e. "published" as the current version), it will still be possible to tell which version the public schema came from
Aaron Marcuse-Kubitza
08:22 AM Revision 6919: schemas/Makefile: Installation: public: Added public%/install, to install a version of the public schema
Aaron Marcuse-Kubitza
07:59 AM Revision 6918: schemas/Makefile: Removed unused $(os)
Aaron Marcuse-Kubitza
07:58 AM Revision 6917: schemas/Makefile: Removed unused $(SED)
Aaron Marcuse-Kubitza
06:22 AM Revision 6916: Moved schemas-related commands from root Makefile to schemas/Makefile
Aaron Marcuse-Kubitza
06:15 AM Revision 6915: Makefiles: Factored out common vars/functions into lib/common.Makefile
Aaron Marcuse-Kubitza
05:59 AM Revision 6914: root Makefile: $(psqlNoSearchPath): Merged $(psqlAsBien) into it because it's the only place $(psqlAsBien) is used
Aaron Marcuse-Kubitza
05:56 AM Revision 6913: root Makefile: $(psqlAsBien): Use psql_script_vegbien instead of psql_vegbien, which adds $(psqlOpts) itself
Aaron Marcuse-Kubitza
05:50 AM Revision 6912: schemas/Makefile: Include lib/common.Makefile
Aaron Marcuse-Kubitza
05:23 AM Revision 6911: inputs/import.stats.xls: Reformatted so the first by column import and the comparison by row import will fit on the same page when printed on portrait-mode letter paper
Aaron Marcuse-Kubitza
05:10 AM Revision 6910: inputs/import.stats.xls: Changed import type labels to By row/By column so they would fit into one field, leaving the extra field free to contain the revision #
Aaron Marcuse-Kubitza
05:02 AM Revision 6909: lib/common.Makefile: Revisions: Allow $(version) to be overridden in the environment, so that the public schema and all log files share the same, pregenerated version
Aaron Marcuse-Kubitza
04:16 AM Revision 6908: schemas/vegbien.sql: Merged provider_view, provider_count, and owner_count into provider_count, using the combining query for Brad's data providers page at <http://bien.nceas.ucsb.edu/bien/people/data-providers/>
Aaron Marcuse-Kubitza
01:23 AM Revision 6907: schemas/vegbien.sql: sync_taxon_trait_to_view(): Changed pkey to index because there can be multiple values of the same taxon's trait from different observations
Aaron Marcuse-Kubitza
01:16 AM Revision 6906: mappings/Makefile: VegCore.csv: Filter out the VegCore tables so they are not matched as terms. This is necessary because some terms have the same name as a table, but the term should be the match rather than the table.
Aaron Marcuse-Kubitza
12:29 AM Revision 6905: sql.py: DbConn.col_info(): raising sql_gen.NoUnderlyingTableException: Fixed bug where also need to catch DoesNotExistException, which is thrown by ::regclass
Aaron Marcuse-Kubitza
12:26 AM Revision 6904: sql.py: DbConn.col_info(): Fixed bug where need to run run_query() recoverably, because this query throws an exception if the column's table does not exist (the information_schema query just returned no rows)
Aaron Marcuse-Kubitza
12:22 AM Revision 6903: sql.py: DbConn.col_info(): Fixed bug where need to use pg_get_expr() on pg_attrdef.adbin instead of shortcut field adsrc, because adsrc does not include schema qualifiers on table names (including strings passed to `nextval('..._seq'::regclass)`)
Aaron Marcuse-Kubitza
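Revision 6937 above states how vegbien_dest chooses defaults for $public and $version. The rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: vegbien_dest itself is a shell script whose contents are not shown in the log, the resolve_public_and_version() function is hypothetical, and the case where only $public is set is not described in the entry, so the sketch simply leaves $version unset there.

```python
def resolve_public_and_version(env):
    """Return (public, version) following the defaulting rule in r6937.

    env is a dict standing in for the shell environment. Empty strings are
    treated the same as unset variables.
    """
    public = env.get('public') or None
    version = env.get('version') or None
    if public is None:
        if version is not None:
            # $public not provided: use $version as $public
            public = version
        else:
            # neither provided: keep using "public" and set $version to it
            public = version = 'public'
    return public, version
```

For example, an environment containing only version=r6946 resolves both values to r6946, and an empty environment resolves both to "public".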

12/18/2012

11:42 PM Revision 6902: sql.py: DbConn.col_info(): Fixed bug where need to pass through cacheable param to run_query()
Aaron Marcuse-Kubitza
11:41 PM Revision 6901: sql.py: DbConn.col_info(): Fixed bug where need to use .to_str(self) instead of self.esc_value() because self.esc_value() expects a value, not a sql_gen.Literal instance
Aaron Marcuse-Kubitza
11:34 PM Revision 6900: sql.py: DbConn.col_info(): Fixed bug where self needs to be used everywhere that db normally is, because this is a DbConn method rather than a global function
Aaron Marcuse-Kubitza
11:31 PM Revision 6899: sql.py: DbConn.col_info(): For PostgreSQL, use pg_catalog tables directly instead of their views in information_schema. This allows using ::regclass to look up the table in the search_path, and fixes a bug in imports with an explicit public schema where column types were looked up in public instead of public.<version>. Also don't wrap default using sql_gen.as_Code() when it's None (indicating no default value, aka default=NULL), because this value is interpreted specially by sql_gen.TypedCol.
Aaron Marcuse-Kubitza
11:06 PM Revision 6898: inputs/Makefile: Input data: $(rsyncSrcs): Also include log files other than install.log.sql
Aaron Marcuse-Kubitza
09:41 PM Revision 6897: import_all: Run all imports (not just the main datasources' import) with $import_source turned off, so that the Source tables will not be imported a second time when the datasource's main tables are imported. Note that it's not necessary to wait for asynchronous commands after the jobs for the main import are started (so that $import_source is not unset until after they are started), because with_all does not return until all jobs are started and have noted the $import_source setting in effect in the shell environment.
Aaron Marcuse-Kubitza
09:32 PM Revision 6896: import_all: Source tables import: Fixed bug where need to use $all option to with_all to also include special datasources starting with "."
Aaron Marcuse-Kubitza
09:23 PM Revision 6895: make_analytical_db: Also create taxon_trait materialized view
Aaron Marcuse-Kubitza

12/17/2012

08:17 PM Revision 6894: inputs/*/*/map.csv: Reverted special OMIT mappings for input columns that have the same name as a VegCore table and have not yet been mapped to a VegCore term
Aaron Marcuse-Kubitza
08:06 PM Revision 6893: mappings/Makefile: VegCore.csv: Filter out the VegCore tables so they are not matched as terms. This is necessary because some terms have the same name as a table, but the term should be the match rather than the table.
Aaron Marcuse-Kubitza
08:04 PM Revision 6892: mappings/VegCore.csv: Changed line endings to \r\n to match the output of filter_out_ci
Aaron Marcuse-Kubitza
05:51 PM Revision 6891: inputs/CTFS/TaxonOccurrence/map.csv: Mapped SpeciesAuthority
Aaron Marcuse-Kubitza
04:59 PM Revision 6890: backups/Makefile: Synchronization: $(remote): Fixed bug where need trailing / at end of path
Aaron Marcuse-Kubitza
04:32 PM Revision 6889: backups/Makefile: Synchronization: $(remote): Updated path to backups
Aaron Marcuse-Kubitza
04:30 PM Revision 6888: README.TXT: Data import: On jupiter: Updated path to backups
Aaron Marcuse-Kubitza
04:25 PM Revision 6887: README.TXT: Installation: Added command to change to the directory of the checked out files
Aaron Marcuse-Kubitza
04:24 PM Revision 6886: README.TXT: Installation: Added command to check out files from svn
Aaron Marcuse-Kubitza
03:51 PM Revision 6885: schemas/vegbien.sql: Added taxon_trait materialized view
Aaron Marcuse-Kubitza
02:43 PM Revision 6884: mappings/Veg+-VegCore.csv: Sources: Removed redundant bien2_ prefix from bien2_staging subnamespace
Aaron Marcuse-Kubitza
02:21 PM Revision 6883: schemas/vegbien.sql: trait: trait_unique: Removed value and units because there should only be one value of a trait for each taxonoccurrence
Aaron Marcuse-Kubitza
02:18 PM Revision 6882: schemas/vegbien.sql: Reattached trait to taxonoccurrence instead of taxonlabel, because the TraitObservation traits data is actually associated with a particular occurrence (plant observation complete with location, date, etc.), rather than just a taxon
Aaron Marcuse-Kubitza
01:31 PM Revision 6881: Added inputs/bien2_traits/
Aaron Marcuse-Kubitza
01:29 PM Revision 6880: mappings/VegCore-VegBIEN.csv: Mapped traits-related DwC terms measurementType, measurementValue, measurementUnit
Aaron Marcuse-Kubitza
12:34 PM Revision 6879: schemas/vegbien.ERD.mwb: Added trait table to ERD
Aaron Marcuse-Kubitza
12:25 PM Revision 6878: schemas/vegbien.sql: trait: Added trait_unique unique index
Aaron Marcuse-Kubitza
12:19 PM Revision 6877: schemas/vegbien.sql: trait: Added units field
Aaron Marcuse-Kubitza
12:14 PM Revision 6876: schemas/vegbien.sql: trait: Renamed type to name because TraitObservation stores trait names rather than types
Aaron Marcuse-Kubitza
12:07 PM Revision 6875: schemas/vegbien.sql: trait: Linked to taxonlabel instead of stemobservation, because TraitObservation's traits are taxon-level and stem-level traits currently go in named fields instead of a stem traits table
Aaron Marcuse-Kubitza
11:45 AM Revision 6874: inputs/.TNRS/tnrs_*/map.csv: Remapped Source to OMIT so it won't match to the Source table
Aaron Marcuse-Kubitza
11:37 AM Revision 6873: inputs/.TNRS/tnrs_other/map.csv: Updated for new VegCore terms, which include Source as a table name. This field will need to be remapped so it doesn't collide with the table name.
Aaron Marcuse-Kubitza
10:04 AM Revision 6872: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
10:01 AM Revision 6871: README.TXT: Data import: Added step to check that the source table contains entries for all inputs
Aaron Marcuse-Kubitza

12/14/2012

01:01 PM Revision 6870: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
12:52 PM Revision 6869: make_analytical_db: Also populate owner_count
Aaron Marcuse-Kubitza
12:51 PM Revision 6868: make_analytical_db: Generate provider_count before analytical_aggregate because it's much faster
Aaron Marcuse-Kubitza
12:50 PM Revision 6867: schemas/vegbien.sql: Added materialized view owner_count, generated from owner_count_view
Aaron Marcuse-Kubitza
12:21 PM Revision 6866: make_analytical_db: Also populate provider_count
Aaron Marcuse-Kubitza
12:20 PM Revision 6865: schemas/vegbien.sql: Added materialized view provider_count, generated from provider_count_view
Aaron Marcuse-Kubitza
12:09 PM Revision 6864: schemas/vegbien.sql: Added provider_count_view for counts of occurrences per top-level provider
Aaron Marcuse-Kubitza
11:56 AM Revision 6863: Regenerated mappings/VegCore.htm
Aaron Marcuse-Kubitza
11:52 AM Revision 6862: Regenerated mappings/VegCore.htm
Aaron Marcuse-Kubitza
11:10 AM Revision 6861: Regenerated mappings/VegCore.htm
Aaron Marcuse-Kubitza
10:39 AM Revision 6860: schemas/vegbien.sql: provider_view: Sort NULL sourcetype last
Aaron Marcuse-Kubitza
10:36 AM Revision 6859: schemas/vegbien.sql: Added provider_view, which combines source and sourcename
Aaron Marcuse-Kubitza
10:31 AM Revision 6858: schemas/vegbien.sql: sourcename: Gave public_ SELECT permissions
Aaron Marcuse-Kubitza
10:17 AM Revision 6857: Regenerated mappings/VegCore.htm
Aaron Marcuse-Kubitza
10:15 AM Revision 6856: README.TXT: Maintenance: VegCore data dictionary: Regenerate everything in mappings/ that changes when VegCore.htm changes (such as VegCore.tables.redmine) instead of just VegCore.csv
Aaron Marcuse-Kubitza
09:29 AM Revision 6855: inputs/*/Source/map.csv without mappings: Added referenceType, etc. mappings. This also ensures that the source table entry for the datasource will be created before the herbaria list is imported, causing all top-level datasources to sort at the top of the source table.
Aaron Marcuse-Kubitza
09:02 AM Revision 6854: schemas/vegbien.sql: Granted the public_ user read-only access to the contents of the source table
Aaron Marcuse-Kubitza
08:53 AM Revision 6853: root Makefile: PostgreSQL: $(editPhppgadmin): Ignore errors if patch has already been applied
Aaron Marcuse-Kubitza
08:52 AM Revision 6852: lib/phpPgAdmin.config.inc.php.diff: Remove context so segment matching would depend only on the $conf['extra_login_security'] line itself
Aaron Marcuse-Kubitza
08:29 AM Revision 6851: mappings/Makefile: Added VegCore.tables.redmine, which contains the Redmine-formatted list of VegCore tables to paste into <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Tables>
Aaron Marcuse-Kubitza
08:18 AM Revision 6850: mappings/: Removed no longer used VegCore.redmine. VegCore.csv is now generated from the Redmine page instead of the other way around.
Aaron Marcuse-Kubitza
08:12 AM Revision 6849: mappings/Makefile: Added VegCore.tables.csv, which contains all the tables in the VegCore data dictionary
Aaron Marcuse-Kubitza
06:59 AM Revision 6848: README.TXT: Data import: backups/fix_perms: Run using sudo to also change permissions on files owned by the bien user, and to change the owner of files owned by you to the bien user
Aaron Marcuse-Kubitza
06:45 AM Revision 6847: Regenerated mappings/VegCore.csv, which adds categories
Aaron Marcuse-Kubitza
05:47 AM Revision 6846: README.TXT: Maintenance: Added instructions to regenerate mappings/VegCore.csv whenever the VegCore data dictionary page is changed
Aaron Marcuse-Kubitza
05:41 AM Revision 6845: mappings/Makefile: Generate VegCore.csv from the VegCore data dictionary page by extracting all HTML anchors (in Redmine, each section heading, and therefore each VegCore term, gets its own anchor)
Aaron Marcuse-Kubitza
05:34 AM Revision 6844: mappings/VegCore.csv: Changed line endings to \n to match what sed generates from the VegCore data dictionary page
Aaron Marcuse-Kubitza
05:31 AM Revision 6843: mappings/VegCore.csv: Removed informational columns, because this information is now maintained on the VegCore data dictionary page at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore>
Aaron Marcuse-Kubitza
05:11 AM Revision 6842: mappings/Veg+-VegCore.csv: Removed hypothetical terms which are not in use by any VegBIEN datasource
Aaron Marcuse-Kubitza
05:00 AM Revision 6841: mappings/Veg+-VegCore.csv: habit: Remapped to growthForm, which replaces verbatimGrowthForm
Aaron Marcuse-Kubitza
04:59 AM Revision 6840: mappings/Veg+-VegCore.csv: Removed hypothetical terms which are not in use by any VegBIEN datasource
Aaron Marcuse-Kubitza
04:50 AM Revision 6839: mappings/VegCore.csv: BIEN2 terms: Added sub-namespaces (bien_web, geoscrub, etc.) to source URLs
Aaron Marcuse-Kubitza
04:15 AM Revision 6838: dict2redmine: redmine_add_links(): Hyperlink just the source name, not also the () around it
Aaron Marcuse-Kubitza
03:54 AM Revision 6837: dict2redmine: RedmineDictWriter: Use h2 instead of h3 for the term name so that the term will be normal-sized instead of smaller in the Redmine table of contents
Aaron Marcuse-Kubitza
03:52 AM Revision 6836: dict2redmine: Renamed redmine_url() to redmine_link() because it generates links, not URLs
Aaron Marcuse-Kubitza
03:49 AM Revision 6835: dict2redmine: redmine_add_links(): Put citations in () instead of [] to avoid conflicting with the Redmine syntax for internal links ( [[...]] )
Aaron Marcuse-Kubitza
03:18 AM Revision 6834: mappings/VegCore.csv: Terms: Removed namespace prefixes (dcterms:), because VegCore terms are globally unique within VegCore and there should not be multiple versions of the same VegCore term with different namespaces. Provenance is instead indicated in the Sources column, which contains not just a namespace but a full URL to each source term.
Aaron Marcuse-Kubitza
03:00 AM Revision 6833: dict2redmine: Hyperlink each term to its anchor in the data dictionary, rather than to its first source, which is not necessarily the definitive definition of the term. This also allows clicking the term to get its permalink in the address bar, rather than having to click the small, light gray paragraph mark next to the term name that Redmine provides.
Aaron Marcuse-Kubitza
02:57 AM Revision 6832: dict2redmine: redmine_add_links(): Fixed bug where need to avoid matching internal links ( [[...]] ) as citations ( [...] )
Aaron Marcuse-Kubitza
02:46 AM Revision 6831: mappings/VegCore.csv: Term names: Changed special characters to _ because Redmine doesn't support special characters in HTML anchors (it removes everything except letters, numbers, _, and -)
Aaron Marcuse-Kubitza
02:42 AM Revision 6830: mappings/Makefile: .Veg+-VegCore.csv.last_cleanup: Also canon the output (VegCore) column to the VegCore.csv vocabulary. ? prefixes are not a problem because there are always at least two alternatives listed for these terms, so canon will not modify the output field.
Aaron Marcuse-Kubitza
01:49 AM Revision 6829: psql_script_vegbien: Run psql_vegbien with `nice -n +5` to prevent CPU-intensive operations from slowing down the shell/UI
Aaron Marcuse-Kubitza
01:46 AM Revision 6828: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
01:37 AM Revision 6827: Regenerated inputs/CVS/taxonObservation_/new_terms.csv. Note that it includes mappings to terms which are not in mappings/VegCore-VegBIEN.csv, which are prefixed with *.
Aaron Marcuse-Kubitza
01:34 AM Revision 6826: input.Makefile: Maps validation: %/new_terms.csv: Undid incorrect change of column to filter terms out of. This actually needs to be the input column, even though unmapped_terms.csv is generated from the output column, because it's possible to have a mapping to a term which is not in mappings/VegCore-VegBIEN.csv, and such a term would show up in unmapped_terms.csv but should not be filtered out of new_terms.csv.
Aaron Marcuse-Kubitza
01:17 AM Revision 6825: lib/phpPgAdmin.login.php.diff: public_ user's password message: Print as its own message instead of appending it to $msg. Print it before any error message so it always appears at the top of the page.
Aaron Marcuse-Kubitza
12:51 AM Revision 6824: root Makefile: PostgreSQL: phpPgAdmin: Edit config file to allow passwordless logins. Edit login page to fill in public_ as the default username and add a message to leave the password blank for that user.
Aaron Marcuse-Kubitza

12/12/2012

10:45 PM Revision 6823: root Makefile: $(postgresReload-*): Ignore `mv -n` errors, which generally indicate that the existing *.conf was already renamed to *.conf.old
Aaron Marcuse-Kubitza
10:36 PM Revision 6822: Makefile mk_db, schemas/pg_hba*.conf: Added passwordless public_ user with access to just the database schema. Note that in PostgreSQL, only users with explicit GRANT permissions on a table can read data in that table, but all DB users with a login can view all table schemas.
Aaron Marcuse-Kubitza
10:26 PM Revision 6821: README.TXT: Maintenance: system updates that affect PostgreSQL: Added that this applies to both Linux and Mac OS X
Aaron Marcuse-Kubitza
10:26 PM Revision 6820: README.TXT: Maintenance: system updates that affect PostgreSQL: list of things that could break if PostgreSQL is not restarted: Added that you may not be able to access the database as the postgres superuser
Aaron Marcuse-Kubitza
10:24 PM Revision 6819: README.TXT: Maintenance: system updates that affect PostgreSQL: list of things that could break if PostgreSQL is not restarted: Added that you may not be able to access the database as the postgres superuser
Aaron Marcuse-Kubitza
09:40 PM Revision 6818: backups/fix_perms: Removed world read permissions from backups dir. Note that this will require superuser permissions to view archived backups on jupiter, because the bien group is not set up with the same members as on vegbiendev. (On jupiter, it contains only stri,regetz,donoghue,naiamh.)
Aaron Marcuse-Kubitza
08:55 PM Revision 6817: inputs/CVS/taxonObservation_/map.csv: Mapped plantname, plantNameWithAuthority
Aaron Marcuse-Kubitza
08:47 PM Revision 6816: inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Use CVS's taxonLevel values, which are different from the VegBank plantLevel values that the original version of this function used
Aaron Marcuse-Kubitza
08:25 PM Revision 6815: inputs/CVS/cvs.~.utils.sql: plantconcept_*(): Use plantConcept.lowestParentConcept_ID,taxonLevel instead of plantStatus.plantParent_ID,plantLevel to find the plantConcept's ancestors, because CVS does not use plantStatus except in very few cases and instead puts the parent link directly in plantConcept
Aaron Marcuse-Kubitza
08:09 PM Revision 6814: inputs/VegBank/vegbank.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
Aaron Marcuse-Kubitza
08:08 PM Revision 6813: inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
Aaron Marcuse-Kubitza
06:57 PM Revision 6812: mappings/VegCore.csv: Removed no longer used verbatimGrowthForm. Use growthForm instead.
Aaron Marcuse-Kubitza
06:56 PM Revision 6811: mappings/VegCore-VegBIEN.csv: Removed no longer used verbatimGrowthForm. Map to growthForm instead and translate growth form values to VegBIEN's growthform enum.
Aaron Marcuse-Kubitza
06:54 PM Revision 6810: inputs/Madidi/Organism/map.csv: Habit: Mapped growth form values
Aaron Marcuse-Kubitza
06:39 PM Revision 6809: inputs/Madidi/Organism/map.csv: Remapped Habit from verbatimGrowthForm to growthForm, which points to the same place
Aaron Marcuse-Kubitza
06:27 PM Revision 6808: inputs/CVS/taxonObservation_/map.csv: Use denorm_* denormalized taxonomic ranks in place of the normalized ranks when both are provided
Aaron Marcuse-Kubitza
06:25 PM Revision 6807: input.Makefile: Maps validation: %/new_terms.csv: Fixed bug where need to filter unmapped_terms.csv's terms out of the output column, not the input column, because that's what the unmapped terms are generated from. Usually these columns are the same for unmapped terms, but sometimes an output term is changed from the original column's name but still doesn't match a VegCore term in mappings/VegCore-VegBIEN.csv.
Aaron Marcuse-Kubitza
06:08 PM Revision 6806: input.Makefile: SVN: add: Added comment with instructions to update all inputs with these settings, using `make inputs/add`
Aaron Marcuse-Kubitza
06:07 PM Revision 6805: input.Makefile: SVN: add: verify: Also ignore *.xlsx
Aaron Marcuse-Kubitza
06:00 PM Revision 6804: README.TXT: Data import: Creating enough disk space: Added instructions for removing archived backups to free up space
Aaron Marcuse-Kubitza
05:15 PM Revision 6803: inputs/CVS/taxonObservation_/map.csv: Fixed bug where taxonLevel, not taxonRank, needs to be mapped to taxonRank, because CVS's taxonRank is actually a number, while taxonLevel contains the corresponding text string
Aaron Marcuse-Kubitza
05:12 PM Revision 6802: README.TXT: Data import: Before import, added step to make sure there is at least 100GB of disk space
Aaron Marcuse-Kubitza
04:41 PM Revision 6801: sql_io.py: put_table(): is_function: Fixed bug where need to add the pkeys table's test pkey constraint *after* the data is added rather than when the empty table is created, to avoid adding a pkey constraint that will later be violated by data which returns multiple output rows for an input row (such as calls to _split())
Aaron Marcuse-Kubitza
04:36 PM Revision 6800: sql_io.py: put_table(): insert_into_pkeys(): Allow callers to override run_query_into()'s add_pkey_ param in case the initial version of the pkeys table should not yet have the test pkey constraint (e.g. because data is added after the table is created)
Aaron Marcuse-Kubitza
04:24 PM Revision 6799: README.TXT: Data import: Checking for errors: Search for "Command exited with non-zero status" to find errors, which is faster than checking that each input's log ends in "Encountered 0 error(s)"
Aaron Marcuse-Kubitza
04:13 PM Revision 6798: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
03:50 PM Revision 6797: README.TXT: Data import: import_all: Corrected text of note about time until control is returned to the shell
Aaron Marcuse-Kubitza
03:42 PM Revision 6796: README.TXT: Data import: Moved download of logs to right after the import is done, because this is a quick step that doesn't depend on the backup- and export-creation steps
Aaron Marcuse-Kubitza

12/11/2012

11:41 AM Revision 6795: mappings/VegCore-VegBIEN.csv: institutionCode: Removed mapping to sourcename.matched_source_id, which is now autopopulated. Split any list of institutionCodes apart using new _split().
Aaron Marcuse-Kubitza
11:28 AM Revision 6794: schemas/vegbien.sql: sourcename: Added sourcename_set_matched_source_id() trigger
Aaron Marcuse-Kubitza
11:22 AM Revision 6793: schemas/functions.sql: Added _split()
Aaron Marcuse-Kubitza
11:13 AM Revision 6792: schemas/vegbien.sql: sourcelist_unique: Removed COALESCE() around name because it's NOT NULL
Aaron Marcuse-Kubitza
11:11 AM Revision 6791: schemas/vegbien.sql: Allow multiple institutionCodes for each specimenreplicate by linking new sourcelist table many-to-many to source via sourcename (which is now a linking table)
Aaron Marcuse-Kubitza
10:50 AM Revision 6790: schemas/vegbien.sql: sourcename: Removed system, which has been replaced by source_id as the scoping field
Aaron Marcuse-Kubitza
10:42 AM Revision 6789: schemas/vegbien.sql: party: Added sourceaccessioncode and uniquify on it instead when provided. vegbien.ERD.mwb: Rearranged party-related tables to allow the tables to be fully expanded.
Aaron Marcuse-Kubitza
10:40 AM Revision 6788: schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term
Aaron Marcuse-Kubitza
10:09 AM Revision 6787: Added inputs/SALVIAS/salvias_users.~.clean_up.sql
Aaron Marcuse-Kubitza
10:01 AM Revision 6786: inputs/SALVIAS/: Added salvias_users tables
Aaron Marcuse-Kubitza
10:00 AM Revision 6785: my2pg: Translate blob to bytea
Aaron Marcuse-Kubitza
09:55 AM Revision 6784: my2pg: Also remove UNIQUE and FULLTEXT inline indexes
Aaron Marcuse-Kubitza
08:55 AM Revision 6783: mappings/VegCore.csv: UNUSED: Comments: Added Redmine formatting
Aaron Marcuse-Kubitza
08:55 AM Revision 6782: mappings/VegCore.csv: OMIT: Changed "is omitted" to "should be omitted", because the mappings specify suggestions rather than requirements as to how a field should be used
Aaron Marcuse-Kubitza
08:51 AM Revision 6781: mappings/VegCore.csv: Removed no longer used subInstitutionCode. Use datasource, institutionCode instead.
Aaron Marcuse-Kubitza
08:51 AM Revision 6780: schemas/vegbien.sql: analytical_*: Renamed subInstitutionCode to institutionCode because this is the institution storing the specimen, as defined by DwC
Aaron Marcuse-Kubitza
08:45 AM Revision 6779: schemas/vegbien.sql: analytical_*: Renamed institutionCode to datasource because this is actually the top-level datasource providing the record, not the institution storing the specimen
Aaron Marcuse-Kubitza
08:38 AM Revision 6778: mappings/VegCore.csv: Added datasource
Aaron Marcuse-Kubitza
08:32 AM Revision 6777: schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term
Aaron Marcuse-Kubitza
08:16 AM Revision 6776: mappings/VegCore.csv: referenceType: Added closed list values
Aaron Marcuse-Kubitza
08:08 AM Revision 6775: mappings/VegCore.csv: observationMeasure: Re-sourced to SALVIAS:observation_type, since SALVIAS comes before VegBIEN in the source precedence
Aaron Marcuse-Kubitza
08:01 AM Revision 6774: mappings/VegCore.csv: Renamed sampleType to observationType to match the SALVIAS term it's derived from
Aaron Marcuse-Kubitza
07:57 AM Revision 6773: inputs/SALVIAS-CSV/Plot/map.csv: Mapped observation_type
Aaron Marcuse-Kubitza
07:37 AM Revision 6772: mappings/VegCore.csv: Added individualCount_*cm_or_more used by analytical_aggregate
Aaron Marcuse-Kubitza
07:10 AM Revision 6771: mappings/VegCore.csv: subplotX/Y: Added definition
Aaron Marcuse-Kubitza
07:07 AM Revision 6770: mappings/VegCore.csv: subplot*: Sources: Put SALVIAS:subplot last, because the specific field is closer in meaning to the term than the category
Aaron Marcuse-Kubitza
07:02 AM Revision 6769: mappings/VegCore.csv: project*Date: Added definition
Aaron Marcuse-Kubitza
06:58 AM Revision 6768: mappings/VegCore.csv: parentPlotName: Added definition
Aaron Marcuse-Kubitza
06:56 AM Revision 6767: mappings/VegCore.csv: parent*: Sources: Put VegBank:PARENT_ID last, because the specific field is closer in meaning to the term than the category
Aaron Marcuse-Kubitza
06:40 AM Revision 6766: mappings/VegCore.csv: locationName: Added definition
Aaron Marcuse-Kubitza
06:33 AM Revision 6765: mappings/VegCore.csv: eventDate/startDate/endDate: Added definition
Aaron Marcuse-Kubitza
06:27 AM Revision 6764: mappings/VegCore.csv: locationName: Sources: Put VegX:plotName first because it is closest in meaning to the term
Aaron Marcuse-Kubitza
06:22 AM Revision 6763: mappings/VegCore.csv: recordedBy*: Added definition
Aaron Marcuse-Kubitza
06:15 AM Revision 6762: mappings/VegCore.csv: recordedBy.middleName: Added source to DwC:recordedBy
Aaron Marcuse-Kubitza
06:01 AM Revision 6761: inputs/CVS/: Joined together stemCount and stemLocation tables to create stemLocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
Aaron Marcuse-Kubitza
05:32 AM Revision 6760: inputs/CVS/: Joined together taxonImportance and stemCount tables to create stemCount_, because stemCount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename), and is thus an AggregateOccurrence-related table along with taxonImportance
Aaron Marcuse-Kubitza
05:30 AM Revision 6759: inputs/CVS/: Joined together taxonImportance and stemCount tables to create stemCount_, because stemCount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename), and is thus an AggregateOccurrence-related table along with taxonImportance
Aaron Marcuse-Kubitza
04:41 AM Revision 6758: inputs/CVS/taxonObservation_/map.csv: Fixed bug where need to indicate that data is plots data to prevent the specimenreplicate ID from being forwarded to the location ID
Aaron Marcuse-Kubitza
04:38 AM Revision 6757: inputs/VegBank/taxonobservation_/map.csv: Fixed bug where need to indicate that data is plots data to prevent the specimenreplicate ID from being forwarded to the location ID
Aaron Marcuse-Kubitza
04:37 AM Revision 6756: mappings/VegCore-VegBIEN.csv: Don't forward specimenreplicate IDs to location for plots data (where the specimenreplicate IDs apply only to the specimen)
Aaron Marcuse-Kubitza
04:31 AM Revision 6755: xml_func.py: Simplifying functions: Added _eq()
Aaron Marcuse-Kubitza
04:31 AM Revision 6754: xml_func.py: Added is_scalar()
Aaron Marcuse-Kubitza
04:30 AM Revision 6753: xml_func.py: process(): row-based mode: preserving complex funcs: Fixed bug where functions with no params would crash reduce() because it requires at least one value when no initial value is specified
Aaron Marcuse-Kubitza
04:28 AM Revision 6752: Added scalar.py
Aaron Marcuse-Kubitza
03:39 AM Revision 6751: Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data
Aaron Marcuse-Kubitza
03:33 AM Revision 6750: inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
Aaron Marcuse-Kubitza
03:29 AM Revision 6749: inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
Aaron Marcuse-Kubitza
03:18 AM Revision 6748: inputs/VegBank/stemlocation/map.csv: Also mapped stemlocation_id to individualID to create one plantobservation for each stemobservation
Aaron Marcuse-Kubitza
03:15 AM Revision 6747: inputs/VegBank/stemlocation/map.csv: Remapped stemcount_id to aggregateOccurrenceID to match stemcount_id's mapping in stemcount_
Aaron Marcuse-Kubitza
02:59 AM Revision 6746: inputs/VegBank/: Joined together taxonimportance and stemcount tables to create stemcount_, because stemcount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename)
Aaron Marcuse-Kubitza
02:53 AM Revision 6745: Added inputs/VegBank/_archive
Aaron Marcuse-Kubitza
02:50 AM Revision 6744: input.Makefile: Testing: Added `%/test: %/test.xml` to allow testing just a subdir
Aaron Marcuse-Kubitza
02:42 AM Revision 6743: input.Makefile: General targets: Added `%/: %/map.csv` to allow remaking just a subdirectory
Aaron Marcuse-Kubitza
01:53 AM Revision 6742: inputs/CVS/: Refreshed data with new export from Bob
Aaron Marcuse-Kubitza
01:52 AM Revision 6741: inputs/CVS/cvs-archive-2012-12-04.schema.sql: Fixed types using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>
Aaron Marcuse-Kubitza
01:48 AM Revision 6740: bin/map: Removed column names simplification, which was causing columns with the same alphanumeric characters but different punctuation to be simplified to the same name. Name simplification is now performed by the mapping mechanism itself, and can be overridden in the mappings.
Aaron Marcuse-Kubitza
01:24 AM Revision 6739: Regenerated inputs/VegBank/new_terms.csv
Aaron Marcuse-Kubitza
12:08 AM Revision 6738: Added inputs/NCU/_src/NCU_specimens_public_2012-12-10.zip.url
Aaron Marcuse-Kubitza
12:04 AM Revision 6737: inputs/NCU/: Refreshed data with new export from Bob
Aaron Marcuse-Kubitza
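
Note on Revision 6753 (xml_func.py, reduce() crash on functions with no params): the failure described there is standard Python behavior, since reduce() raises a TypeError when given an empty sequence and no initial value. The sketch below only illustrates that behavior and one possible guard. The helper names reduce_fails_on_empty() and combine() are hypothetical and are not taken from xml_func.py, whose actual fix is not shown in this log.

```python
from functools import reduce
import operator


def reduce_fails_on_empty():
    """Return True if reduce() rejects an empty sequence with no initial value."""
    try:
        reduce(operator.add, [])
    except TypeError:
        return True
    return False


def combine(values):
    """Fold values with +, returning None when there is nothing to fold.

    Hypothetical guard: checks for the empty case before calling reduce(),
    so a function with no params does not reach reduce() at all.
    """
    values = list(values)
    if not values:
        return None
    return reduce(operator.add, values)


if __name__ == "__main__":
    print(reduce_fails_on_empty())
    print(combine([]))
    print(combine(["a", "b"]))
```

An alternative guard is to pass an explicit initial value as reduce()'s third argument, which works when a neutral element exists for the operation being folded.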

12/10/2012

09:33 PM Revision 6736: Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data
Aaron Marcuse-Kubitza
09:31 PM Revision 6735: Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data
Aaron Marcuse-Kubitza
09:21 PM Revision 6734: Added inputs/NCU-NCSC/_archive
Aaron Marcuse-Kubitza
09:21 PM Revision 6733: input.Makefile: SVN: add: Also add _archive/ subdir
Aaron Marcuse-Kubitza
08:23 PM Revision 6732: publish_analytical_db: Time the import of the data
Aaron Marcuse-Kubitza
08:17 PM Revision 6731: export_analytical_db: Also create a .md5 for the export
Aaron Marcuse-Kubitza
08:16 PM Revision 6730: export_analytical_db: Run commands in the root svn dir
Aaron Marcuse-Kubitza
08:05 PM Revision 6729: mappings/VegCore.csv: soil composition terms: Removed ppm units from the definition, since units are actually fraction or percent
Aaron Marcuse-Kubitza
08:03 PM Revision 6728: README.TXT: Data import: Moved On local machine steps after On nimoy steps, because the On nimoy steps are more important
Aaron Marcuse-Kubitza
07:59 PM Revision 6727: mappings/VegCore.csv: Comments: Added quotes around quotations from other sources
Aaron Marcuse-Kubitza
07:56 PM Revision 6726: mappings/VegCore.csv: Definitions: Added quotes around quotations from other sources
Aaron Marcuse-Kubitza
07:52 PM Revision 6725: Added backups/fix_perms
Aaron Marcuse-Kubitza
07:45 PM Revision 6724: backups/Makefile: Synchronization: %/download: Also download any .md5 file for the file
Aaron Marcuse-Kubitza
07:24 PM Revision 6723: README.TXT: Data import: On nimoy: Added instructions to verify the export's MD5 sum
Aaron Marcuse-Kubitza
07:23 PM Revision 6722: README.TXT: Data import: On nimoy: Replaced step to manually upload the analytical_aggregate export with the command to download it from jupiter
Aaron Marcuse-Kubitza
07:18 PM Revision 6721: README.TXT: Data import: On nimoy: Removed step to rename any existing analytical_aggregate table, since the import is now done directly into the versioned table
Aaron Marcuse-Kubitza
07:11 PM Revision 6720: mappings/VegCore.csv: VegX terms without definitions in VegX: Added definitions from non-VegX sources, etc.
Aaron Marcuse-Kubitza
06:28 PM Revision 6719: README.TXT: Data import: Added instructions to verify the backups' MD5 sums on jupiter
Aaron Marcuse-Kubitza
06:23 PM Revision 6718: README.TXT: Data import: Removed step to copy backups to jupiter, because this is now done by `make backups/upload`
Aaron Marcuse-Kubitza
06:11 PM Revision 6717: schemas/vegbien.sql: sync_*_to_view(): Also add `GRANT SELECT TO bien_read` on the *view* used to generate the table, in case the permission was lost when the view was modified
Aaron Marcuse-Kubitza
06:08 PM Revision 6716: schemas/vegbien.sql: sync_*_to_view(): Added `GRANT SELECT TO bien_read`
Aaron Marcuse-Kubitza
06:04 PM Revision 6715: schemas/vegbien.sql: analytical_*: Added back bien_read's SELECT permissions, which had gotten removed when the tables were re-synced to their views
Aaron Marcuse-Kubitza
06:03 PM Revision 6714: schemas/vegbien.my.sql: Regenerated with expanded repl word matching
Aaron Marcuse-Kubitza
06:00 PM Revision 6713: repl: :-prefixing of words to form vars: Fixed bug where : must be matched as a lookbehind assertion, not a capturing group, because the provided regexp itself or its replacement may reference capturing groups, which it expects to be numbered starting with 1
Aaron Marcuse-Kubitza
05:47 PM Revision 6712: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
05:47 PM Revision 6711: Regenerated inputs/NY/Specimen/new_terms.csv
Aaron Marcuse-Kubitza
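
Note on Revision 6713 (repl, lookbehind assertion instead of a capturing group): the reasoning there can be illustrated with Python's re module. Wrapping a caller-supplied pattern in an extra capturing group renumbers the caller's own groups, so a replacement that refers to \1 picks up the prefix instead. A lookbehind adds no group and consumes no text. This is a minimal sketch under assumptions: the wrapper functions, the sample pattern, and the sample text are hypothetical and are not taken from repl itself.

```python
import re


def wrap_with_group(user_pattern):
    """Buggy wrapping: the added (:) becomes group 1 and shifts the caller's groups."""
    return r'(:)' + user_pattern


def wrap_with_lookbehind(user_pattern):
    """Fixed wrapping: (?<=:) requires a preceding ':' without adding a group."""
    return r'(?<=:)' + user_pattern


# Hypothetical caller-supplied pattern and replacement; \1 is meant to be
# the word stem captured by the caller's own first group.
user_pattern = r'(\w+)Name'
replacement = r'\1Label'
text = ':plotName'

if __name__ == "__main__":
    # With the extra group, \1 refers to ':' and the stem is lost.
    print(re.sub(wrap_with_group(user_pattern), replacement, text))
    # With the lookbehind, \1 still refers to the caller's group.
    print(re.sub(wrap_with_lookbehind(user_pattern), replacement, text))
```

Note that Python's re module only accepts fixed-width lookbehind patterns, which a single ":" satisfies.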

12/07/2012

06:49 PM Revision 6710: inputs/JBM/Specimen/test.xml.ref: Updated inserted row count, which had gotten changed when a test was run on a non-empty database
Aaron Marcuse-Kubitza
06:34 PM Revision 6709: mappings/VegCore.csv: height_ft: Added source to VegBank:stemHeight, which includes a description of the term
Aaron Marcuse-Kubitza
06:30 PM Revision 6708: mappings/VegCore.csv: height_m: Added source to VegBank:stemHeight, which includes a description of the term
Aaron Marcuse-Kubitza
06:27 PM Revision 6707: mappings/VegCore.csv: projectName: Added definition from VegX schema
Aaron Marcuse-Kubitza
06:25 PM Revision 6706: mappings/VegCore.csv: project*Date: Re-sourced to VegBank:project.*Date, since VegX does not have an equivalent term
Aaron Marcuse-Kubitza
06:16 PM Revision 6705: mappings/VegCore.csv: VegX terms: Added definitions from VegX schema, where provided
Aaron Marcuse-Kubitza
05:55 PM Revision 6704: mappings/VegCore.csv: projectName: Added source to VegX:project.title
Aaron Marcuse-Kubitza
05:50 PM Revision 6703: mappings/Makefile: .VegCore.csv.last_cleanup, .Veg+-VegCore.csv.last_cleanup: Also replace Veg+ terms in sources list, which are references to VegCore terms that have since been renamed
Aaron Marcuse-Kubitza
05:47 PM Revision 6702: repl: text mode: Also match "vars" with the term prefixed by ":". Consider .- to be word characters. Only match a word when preceded by whitespace or CSV field start characters.
Aaron Marcuse-Kubitza
05:41 PM Revision 6701: repl: column mode: Removed parsing and checking of column name, which prevents using repl for general-purpose regexp/word replacement
Aaron Marcuse-Kubitza
04:41 PM Revision 6700: mappings/VegCore.csv: Definition: Moved closed list values to new Values column
Aaron Marcuse-Kubitza
04:39 PM Revision 6699: mappings/VegCore.csv: Added Values column to store closed list values
Aaron Marcuse-Kubitza
04:35 PM Revision 6698: mappings/VegCore.csv: geovalidation terms: Removed source to DwC:georeferenceVerificationStatus, because that is for georeferencing, not geovalidation
Aaron Marcuse-Kubitza
04:30 PM Revision 6697: mappings/VegCore.csv, Veg+-VegCore.csv: obs*Date: Re-sourced to VegX:obs*Date
Aaron Marcuse-Kubitza
04:23 PM Revision 6696: mappings/VegCore.csv: projectID: Re-sourced to plotObservation.projectID
Aaron Marcuse-Kubitza
04:17 PM Revision 6695: dict2redmine: RedmineTableWriter: Fixed bug where need to escape embedded | , using new redmine_table_esc()
Aaron Marcuse-Kubitza
04:16 PM Revision 6694: dict2redmine: Added redmine_table_esc()
Aaron Marcuse-Kubitza
04:13 PM Revision 6693: dict2redmine: Added redmine_esc()
Aaron Marcuse-Kubitza
04:06 PM Revision 6692: mappings/VegCore.csv: TCS terms: Added TCS comments from <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#TCS>
Aaron Marcuse-Kubitza
03:58 PM Revision 6691: dict2redmine: redmine_add_links(): Include the [] in the link text, to avoid the need for redmine_pad(), etc.
Aaron Marcuse-Kubitza
03:55 PM Revision 6690: dict2redmine: redmine_add_links(): Make the link bold so it stands out as a link
Aaron Marcuse-Kubitza
03:53 PM Revision 6689: dict2redmine: redmine_add_links(): Use new redmine_pad()
Aaron Marcuse-Kubitza
03:53 PM Revision 6688: dict2redmine: Added redmine_pad()
Aaron Marcuse-Kubitza
03:51 PM Revision 6687: dict2redmine: redmine_add_links(): Use redmine_url() to create the internal link
Aaron Marcuse-Kubitza
03:51 PM Revision 6686: dict2redmine: redmine_url(): Support internal links
Aaron Marcuse-Kubitza
03:47 PM Revision 6685: dict2redmine: redmine_add_links(): Fixed bug where need to explicitly specify the source name as the link text
Aaron Marcuse-Kubitza
03:44 PM Revision 6684: dict2redmine: RedmineDictWriter: Link citations to entry in sources list
Aaron Marcuse-Kubitza
03:18 PM Revision 6683: mappings/VegCore.csv: Restored name of latLongDomainValid term, which had gotten replaced with coordinatePrecision
Aaron Marcuse-Kubitza
03:16 PM Revision 6682: mappings/VegCore.csv: startDate, endDate: Changed comment to "a date range usually applies to the event"
Aaron Marcuse-Kubitza
03:14 PM Revision 6681: mappings/VegCore.csv: Added Examples column to store data in TCS Examples column at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#TCS>
Aaron Marcuse-Kubitza
03:10 PM Revision 6680: mappings/VegCore.csv: non-phylogenetic taxonomic terms: Added definitions from TCS schema
Aaron Marcuse-Kubitza
03:07 PM Revision 6679: mappings/VegCore.csv: *forma, *variety: Fixed sources, which had been swapped between the two sets of terms
Aaron Marcuse-Kubitza
02:57 PM Revision 6678: mappings/VegCore.csv: Special values: Moved comments to Comments column
Aaron Marcuse-Kubitza
01:11 PM Revision 6677: dict2redmine: Fixed bug where all header fields need to be preserved because columns are now filtered out instead of removed in each row
Aaron Marcuse-Kubitza
01:05 PM Revision 6676: dict2redmine: Put the definition before and outside of the fields table
Aaron Marcuse-Kubitza
12:53 PM Revision 6675: mappings/VegCore.csv: Moved Definition values that are actually comments into separate Comments column
Aaron Marcuse-Kubitza
12:46 PM Revision 6674: dict2redmine: RedmineDictWriter: Omit empty columns from the fields table
Aaron Marcuse-Kubitza

12/06/2012

11:18 PM Revision 6673: dict2redmine: Generate an outline instead of a table so each term will be indexed in the page's table of contents
Aaron Marcuse-Kubitza
11:13 PM Revision 6672: schemas/vegbien.sql: coordinates: coordinates_unique: Removed md5() around verbatimcoordinates because functions within unique indexes (other than the standard COALESCE()) are not yet supported by the import algorithm
Aaron Marcuse-Kubitza
11:10 PM Revision 6671: exc.py: e_msg(): Emit a warning instead of an AssertionError if e.args[0] isn't a string, to assist in debugging malformed exceptions
Aaron Marcuse-Kubitza
11:02 PM Revision 6670: mappings/VegCore.csv: sampleType: Re-sourced to bien_web.observationType
Aaron Marcuse-Kubitza
10:34 PM Revision 6669: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the taxonomicname in accepted_taxonlabel instead of accepted_taxonverbatim, because taxonverbatim only contains fields provided by the data provider (in this case, TNRS), but TNRS does not provide the taxonomic name (taxon name+author), only the taxon name and author components separately
Aaron Marcuse-Kubitza
10:09 PM Revision 6668: schemas/vegbien.sql: coordinates: coordinates_unique: Use md5() on verbatimcoordinates so that it doesn't cause the index row size to be exceeded. This should fix a bug in the HIBG import where long verbatimcoordinates values were causing the error 'OperationalError: index row size 2784 exceeds maximum 2712 for index "coordinates_unique"'.
Aaron Marcuse-Kubitza
09:56 PM Revision 6667: backups/Makefile: Synchronization: Replaced download target, which downloads all backups, with %/download, which downloads just a specific backup, because you would generally only want to extract a single backup from the archive for reinstallation
Aaron Marcuse-Kubitza
09:47 PM Revision 6666: backups/Makefile: Synchronization: Sync with jupiter instead of vegbiendev. This requires running `make backups/upload` on vegbiendev to archive the files, instead of `make backups/download` to download them to your local machine.
Aaron Marcuse-Kubitza
08:58 PM Revision 6665: inputs/.geoscrub/geoscrub_output/map.csv: Removed no longer accurate comment that county is not yet used by VegBIEN
Aaron Marcuse-Kubitza
08:56 PM Revision 6664: inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 2 ("Point is <=5km from putative GADM polygon, but still outside it") to true instead of false, because 5km is close enough to the polygon that the mismatch could result from shapefile simplifying, boundary changes, or other factors that don't affect geovalidity
Aaron Marcuse-Kubitza
08:52 PM Revision 6663: inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 0 ("Complete name provided, but couldn't be scrubbed to GADM") to NULL instead of false, because the absence of a name match does not mean the coordinates are invalid
Aaron Marcuse-Kubitza
08:51 PM Revision 6662: inputs/.{NCBI,TNRS}/import_order.txt: Added Source
Aaron Marcuse-Kubitza
08:50 PM Revision 6661: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
08:44 PM Revision 6660: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
08:43 PM Revision 6659: inputs/input.Makefile: SVN: add: verify/: Added *.xls to svn:ignore
Aaron Marcuse-Kubitza
08:33 PM Revision 6658: inputs/.geoscrub/geoscrub_output/postprocess.sql: Added index on decimallatitude, decimallongitude
Aaron Marcuse-Kubitza
08:30 PM Revision 6657: Added inputs/.geoscrub/geoscrub_output/postprocess.sql, which adds NOT NULL constraints on decimallatitude, decimallongitude
Aaron Marcuse-Kubitza
06:55 PM Revision 6656: schemas/vegbien.sql: analytical_*: Changed type of boolean columns to integer so that they will be exported as 1/0 instead of t/f by export_analytical_db. This will enable MySQL's LOAD DATA INFILE to import the values correctly.
Aaron Marcuse-Kubitza
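
Revision 6656 makes this change in the schema by altering the column types. Purely as an illustration of the same 1/0 convention, the sketch below applies it at export time instead. The function name is hypothetical and is not part of export_analytical_db.

```python
def to_export_value(value):
    """Render booleans as 1/0 and pass every other value through unchanged."""
    # bool must be tested explicitly: it is a subclass of int in Python
    if isinstance(value, bool):
        return int(value)
    return value

row = [to_export_value(v) for v in (True, False, None, "plot")]
```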
06:07 PM Revision 6655: backups/Makefile: Checksums: %.md5/test: Only use md5sum's -v option on Mac, because it's not supported on Linux (there, verbose mode is the default)
Aaron Marcuse-Kubitza
05:57 PM Revision 6654: mappings/VegCore.csv: cultivated* source: Added picklist value to URL
Aaron Marcuse-Kubitza
05:46 PM Revision 6653: README.TXT: Data import: On nimoy: Creating analytical_aggregate table: publish_analytical_db: Rewrapped line
Aaron Marcuse-Kubitza
05:45 PM Revision 6652: README.TXT: Data import: On nimoy: Creating analytical_aggregate table: Changed name to analytical_aggregate_r<revision> to allow storing different versions simultaneously
Aaron Marcuse-Kubitza
05:26 PM Revision 6651: publish_analytical_db: Require caller to specify the name of the table to load data into. This allows appending a revision to analytical_aggregate, or publishing a table other than analytical_aggregate.
Aaron Marcuse-Kubitza
05:24 PM Revision 6650: publish_analytical_db: Require caller to specify the name of the table to load data into. This allows appending a revision to analytical_aggregate, or publishing a table other than analytical_aggregate.
Aaron Marcuse-Kubitza
05:23 PM Revision 6649: inputs/input.Makefile: SVN: add: verify/: Added *.xls to svn:ignore
Aaron Marcuse-Kubitza
04:33 PM Revision 6648: backups/Makefile: SQL: Full DB: vegbien.%.backup: Also generate MD5 sum
Aaron Marcuse-Kubitza
04:18 PM Revision 6647: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

12/05/2012

10:57 AM Revision 6646: README.TXT: Data import: Delete previous imports based on the full DB backup file
Aaron Marcuse-Kubitza
10:56 AM Revision 6645: backups/Makefile: Support removing public schema versions based on the version of a full DB backup
Aaron Marcuse-Kubitza
10:52 AM Revision 6644: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the additional dict namespace for the SALVIAS sources. This removes the extra "dict:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:49 AM Revision 6643: mappings/VegCore.csv, Veg+-VegCore.csv: Added TNRS provider namespace, inserting it before BIEN in the sort order
Aaron Marcuse-Kubitza
10:43 AM Revision 6642: mappings/VegCore.csv: Changed + to _ in URL fragments
Aaron Marcuse-Kubitza
10:41 AM Revision 6641: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the additional BIEN namespace for the BIEN sources, and use just BIEN2 and VegBIEN as the sub-namespaces. This removes the extra "BIEN:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:37 AM Revision 6640: mappings/VegCore.csv, Veg+-VegCore.csv: Removed the "terms" text in the current DwC terms' provider, leaving just the sort order. This removes the extra "terms:" namespace on the generated Redmine source term names.
Aaron Marcuse-Kubitza
10:33 AM Revision 6639: dict2redmine: url_term(): Remove empty URL comments
Aaron Marcuse-Kubitza
10:32 AM Revision 6638: dict2redmine: url_comment_text(): Interpret a URL comment containing just a number as a sort order without text
Aaron Marcuse-Kubitza
10:29 AM Revision 6637: dict2redmine: url_term(): Prefix any provider in the URL to the term name, to create a namespace. Each hierarchical component of the provider is stored in a URL comment.
Aaron Marcuse-Kubitza
10:27 AM Revision 6636: dict2redmine: Added url_comment_re
Aaron Marcuse-Kubitza
10:27 AM Revision 6635: dict2redmine: Added url_comment_text()
Aaron Marcuse-Kubitza
10:26 AM Revision 6634: dict2redmine: Call simplify_url() just on the first source so that source2redmine_url() can use the raw URL (to extract comments, etc.)
Aaron Marcuse-Kubitza
09:09 AM Revision 6633: dict2redmine: Removed no longer used explicit Definition column #
Aaron Marcuse-Kubitza
09:06 AM Revision 6632: dict2redmine: Use the input spreadsheet's column names and order, and pass through columns other than the term and sources columns
Aaron Marcuse-Kubitza
09:05 AM Revision 6631: mappings/VegCore.csv, Veg+-VegCore.csv: Renamed Comments to Definition to match Redmine table
Aaron Marcuse-Kubitza
09:04 AM Revision 6630: mappings/VegCore.csv, Veg+-VegCore.csv: Reversed order of Comments, Sources columns to match Redmine table order
Aaron Marcuse-Kubitza
08:58 AM Revision 6629: mappings/VegCore.csv, Veg+-VegCore.csv: Reversed order of Comments, Sources columns to match Redmine table order
Aaron Marcuse-Kubitza
08:56 AM Revision 6628: dict2redmine: Store term_str in a var before using it, like sources_str
Aaron Marcuse-Kubitza
08:43 AM Revision 6627: dict2redmine: Added Definition column
Aaron Marcuse-Kubitza
08:32 AM Revision 6626: dict2redmine: Take term and sources col #s as args instead of hardcoding them by column name or position
Aaron Marcuse-Kubitza
08:25 AM Revision 6625: dict2redmine: url_term(): Also match any namespace that's part of the term
Aaron Marcuse-Kubitza
08:21 AM Revision 6624: dict2redmine: Sources: Use source2redmine_url() to extract the term from each source URL
Aaron Marcuse-Kubitza
08:20 AM Revision 6623: dict2redmine: source2redmine_url(): Support empty URLs
Aaron Marcuse-Kubitza
08:15 AM Revision 6622: dict2redmine: url_term(): Fixed bug where need to use match.group() instead of match.groups()
Aaron Marcuse-Kubitza
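
For context on the distinction behind revision 6622, Python's `re` match objects return different types from the two calls. The pattern and URL below are hypothetical and chosen only to show the difference. The actual regular expression in dict2redmine is not shown in this log.

```python
import re

# Hypothetical pattern and URL, for illustration only
match = re.search(r"#(\w+)$", "http://example.org/terms#scientificName")

whole = match.group()    # entire matched text, as a str
first = match.group(1)   # first capture group, as a str
groups = match.groups()  # all capture groups, as a tuple
```

Code that expects a string gets a tuple from `groups()`, which is the kind of mix-up the fix describes.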
08:02 AM Revision 6621: mappings/Makefile: Create VegCore.redmine from VegCore.csv
Aaron Marcuse-Kubitza
08:01 AM Revision 6620: Added dict2redmine
Aaron Marcuse-Kubitza
07:26 AM Revision 6619: mappings/VegCore.csv, Veg+-VegCore.csv: Renamed Source column to Sources because it can contain multiple sources
Aaron Marcuse-Kubitza
07:12 AM Revision 6618: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC terms: Scoped sort order by category, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Scope-DwC-sort-order-by-category>
Aaron Marcuse-Kubitza
06:35 AM Revision 6617: mappings/VegCore.csv, Veg+-VegCore.csv: Source: VegX terms: Split combined field group/field sort order into separate sort orders for field and field group
Aaron Marcuse-Kubitza
06:22 AM Revision 6616: mappings/VegCore.csv, Veg+-VegCore.csv: Source: VegX terms: Added top-level table sort order
Aaron Marcuse-Kubitza
06:07 AM Revision 6615: mappings/VegCore.csv: taxonName: Reordered sources so it would sort with *TaxonName and scientificName
Aaron Marcuse-Kubitza
06:04 AM Revision 6614: mappings/VegCore.csv: Source: DwC Taxon: Added sort order so it would sort together with its fields
Aaron Marcuse-Kubitza
05:58 AM Revision 6613: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC occurrenceID: Corrected sort order to 019 instead of 000
Aaron Marcuse-Kubitza
05:55 AM Revision 6612: mappings/VegCore.csv, Veg+-VegCore.csv: Source: DwC terms: Added category, with category sort order, as URL comment. This will allow terms to be sorted just within their category rather than globally for DwC.
Aaron Marcuse-Kubitza
05:49 AM Revision 6611: mappings/Veg+-VegCore.csv: Source: DwC: dcterms: Added back "dcterms:" prefix to URL fragment
Aaron Marcuse-Kubitza
05:31 AM Revision 6610: mappings/VegCore.csv: Source: TNRS terms: Added sort order to web page fragment (simple_download, detailed_download)
Aaron Marcuse-Kubitza
05:25 AM Revision 6609: mappings/VegCore.csv, Veg+-VegCore.csv: Removed no longer used Order within table column. Instead, embed the sort order in the URL using a () comment.
Aaron Marcuse-Kubitza
05:23 AM Revision 6608: mappings/VegCore.csv, Veg+-VegCore.csv: Merged the Order within table column with the Source URL, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Merging-the-Order-within-table-column-with-the-Source-URL>. Sorting on the Source column now groups related terms together according to their sort order in the source they came from.
Aaron Marcuse-Kubitza
05:11 AM Revision 6607: mappings/Veg+-VegCore.csv: Order within table: Filled in missing sort orders
Aaron Marcuse-Kubitza
04:51 AM Revision 6606: mappings/VegCore.csv, Veg+-VegCore.csv: Source: Web pages: Use / instead of . to separate nested elements of URL fragment. Use _ instead of + to represent space.
Aaron Marcuse-Kubitza
04:19 AM Revision 6605: mappings/VegCore.csv: Order within table: Filled in missing sort orders
Aaron Marcuse-Kubitza
03:58 AM Revision 6604: mappings/VegCore.csv: Source: Removed trailing whitespace
Aaron Marcuse-Kubitza
03:43 AM Revision 6603: mappings/VegCore.csv: Order within table: Fixed to include one entry for every URL, including when the Order field is empty and there are multiple URLs
Aaron Marcuse-Kubitza
03:33 AM Revision 6602: mappings/VegCore.csv: Order within table: Fixed to include one entry for every URL
Aaron Marcuse-Kubitza
02:03 AM Revision 6601: mappings/VegCore.csv: Source: "dcterms:" terms: Fixed URL fragments to use : instead of # after dcterms
Aaron Marcuse-Kubitza
01:42 AM Revision 6600: mappings/VegCore.csv, Veg+-VegCore.csv: Sources: BIEN2: Moved DB sort order right before the DB name in the URL to avoid duplicating the DB name in the comment
Aaron Marcuse-Kubitza
01:35 AM Revision 6599: mappings/VegCore.csv, Veg+-VegCore.csv: Sources: Added sort order comments to URLs so they sort in the order indicated at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Sources>. URL comments are enclosed in (), and the sort order element of a comment is a number right after the ( .
Aaron Marcuse-Kubitza
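
The comment convention described in revision 6599 could be parsed roughly as follows. This is a sketch under assumptions: the regular expression, the function name, and the example URL are all hypothetical, and dict2redmine's own url_comment_re may differ.

```python
import re

# Assumed shape of a URL comment: "(" + optional digits + optional text + ")"
COMMENT_RE = re.compile(r"\((\d*)\s*([^()]*)\)")

def url_comments(url):
    """Return a (sort_order, text) pair for each () comment found in url."""
    pairs = []
    for m in COMMENT_RE.finditer(url):
        digits, text = m.group(1), m.group(2).strip()
        order = int(digits) if digits else None  # None when no number is given
        pairs.append((order, text))
    return pairs

# Made-up URL with one numbered text comment and one number-only comment
example = url_comments("http://example.org/(3 BIEN)/db#(019)occurrenceID")
```

A comment that holds only a number yields an empty text, which matches the later rule that such a comment is a sort order without text.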
12:37 AM Revision 6598: mappings/Makefile: .Veg+-VegCore.csv.last_cleanup: Sort by the source URL instead of the VegCore term
Aaron Marcuse-Kubitza
12:35 AM Revision 6597: mappings/Makefile: Split .Veg+-VegCore.csv.last_cleanup and .VegX-VegCore.csv.last_cleanup into separate targets so their recipes can be different
Aaron Marcuse-Kubitza
12:17 AM Revision 6596: mappings/VegCore-VegBIEN.csv: Mapped dcterms:rights
Aaron Marcuse-Kubitza

12/04/2012

11:52 PM Revision 6595: backups/Makefile: Synchronization: Also sync *.md5
Aaron Marcuse-Kubitza
09:52 PM Revision 6594: import_all: Fixed bug where need to wait for *all* asynchronous commands started before the main import, not just the first
Aaron Marcuse-Kubitza
09:51 PM Revision 6593: import_all: Import all Source tables before the herbaria list, so that any custom metadata will override the info in the herbaria list
Aaron Marcuse-Kubitza
09:43 PM Revision 6592: input.Makefile: Tables discovery: $(dontImport): Don't import the Source table when $import_source env var is set to ""
Aaron Marcuse-Kubitza
09:33 PM Revision 6591: input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
Aaron Marcuse-Kubitza
09:22 PM Revision 6590: Added inputs/VASCAN/Source/
Aaron Marcuse-Kubitza
09:18 PM Revision 6589: csvs.py: stream_info(): Use the Excel dialect and an empty header if the CSV file is empty
Aaron Marcuse-Kubitza
08:29 PM Revision 6588: pg_dump_limit: Also remove CREATE DATABASE statements
Aaron Marcuse-Kubitza
08:09 PM Revision 6587: Added inputs/JBM/Source/
Aaron Marcuse-Kubitza
08:07 PM Revision 6586: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
08:06 PM Revision 6585: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
08:03 PM Revision 6584: Added inputs/NVS/Source/
Aaron Marcuse-Kubitza
08:02 PM Revision 6583: Added inputs/IUCN/European_Red_List_Plants/header.csv
Aaron Marcuse-Kubitza
08:02 PM Revision 6582: Added inputs/CVS/_src/
Aaron Marcuse-Kubitza
08:01 PM Revision 6581: input.Makefile: SVN: $(svnFilesGlob): Include test.xml.ref instead of all test*.xml* to avoid including test outputs
Aaron Marcuse-Kubitza
07:57 PM Revision 6580: inputs/*/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:55 PM Revision 6579: mappings/VegCore-VegBIEN.csv: Mapped verbatimCoordinates
Aaron Marcuse-Kubitza
07:54 PM Revision 6578: Updated inputs/HIBG/Specimen/new_terms.csv
Aaron Marcuse-Kubitza
07:50 PM Revision 6577: Added inputs/HIBG/Source/
Aaron Marcuse-Kubitza
07:49 PM Revision 6576: inputs/HIBG/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:47 PM Revision 6575: Added inputs/NCU-NCSC/Source/
Aaron Marcuse-Kubitza
07:47 PM Revision 6574: inputs/NCU-NCSC/verify/: Updated svn:ignore
Aaron Marcuse-Kubitza
07:07 PM Revision 6573: backups/Makefile: Checksums: %.md5/test: Made it an _always target
Aaron Marcuse-Kubitza
07:05 PM Revision 6572: backups/Makefile: Checksums: Added %.md5/test to test generated checksums
Aaron Marcuse-Kubitza
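
The generate-then-test cycle behind the %.md5 and %.md5/test targets can be sketched in Python as below. The Makefile itself calls the platform's md5 or md5sum tool. The function names and the one-line checksum file layout here are assumptions made for the example.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backups need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_checksum(path):
    """Store the digest next to the file as <path>.md5 (assumed layout)."""
    with open(path + ".md5", "w") as f:
        f.write(md5_of_file(path) + "\n")

def verify_checksum(path):
    """Recompute the digest and compare it with the stored one."""
    with open(path + ".md5") as f:
        expected = f.read().split()[0]
    return md5_of_file(path) == expected
```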
07:01 PM Revision 6571: backups/Makefile: Moved md5-related targets to separate Checksums section
Aaron Marcuse-Kubitza
06:59 PM Revision 6570: backups/Makefile: %.md5: Removed not applicable comment which had been copied from %.sql
Aaron Marcuse-Kubitza

12/03/2012

07:37 PM Revision 6569: inputs/VegBank/plot_/map.csv: Mapped confidentialitystatus to coordinateUncertaintyInMeters, overriding locationaccuracy when the confidentialitystatus indicates fuzzing
Aaron Marcuse-Kubitza
07:24 PM Revision 6568: inputs/NVS/*/map.csv: Mapped Taxon Growth Form values to growthform enum
Aaron Marcuse-Kubitza
07:08 PM Revision 6567: Added inputs/NVS/Source/
Aaron Marcuse-Kubitza
07:05 PM Revision 6566: inputs/NVS/import_order.txt: Specified import order
Aaron Marcuse-Kubitza
07:02 PM Revision 6565: inputs/input.Makefile: Staging tables installation: $(allInstalls): Exclude the Source table, which contains only (metadata) mappings, not data
Aaron Marcuse-Kubitza
06:54 PM Revision 6564: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
06:46 PM Revision 6563: Added inputs/NVS/TaxonOccurrence.Understory/
Aaron Marcuse-Kubitza
06:38 PM Revision 6562: Added inputs/NVS/Coordinates/
Aaron Marcuse-Kubitza
06:38 PM Revision 6561: inputs/NVS/Plot/map.csv: Mapped Plot
Aaron Marcuse-Kubitza
06:33 PM Revision 6560: mappings/VegCore-VegBIEN.csv: Mapped verbatimCoordinates
Aaron Marcuse-Kubitza
06:27 PM Revision 6559: mappings/Veg+-VegCore.csv: Removed type->dcterms:type automapping because this term can have many different meanings
Aaron Marcuse-Kubitza
06:07 PM Revision 6558: inputs/NVS/: Renamed Organism to AggregateOccurrence because this actually contains aggregated samplings
Aaron Marcuse-Kubitza
06:04 PM Revision 6557: inputs/NVS/StemObservation/map.csv: Mapped Item ID, Item Obs ID
Aaron Marcuse-Kubitza
05:58 PM Revision 6556: Added inputs/NVS/StemObservation/
Aaron Marcuse-Kubitza
05:52 PM Revision 6555: Added inputs/NVS/TaxonOccurrence/
Aaron Marcuse-Kubitza
05:51 PM Revision 6554: Added inputs/NVS/map.csv (global mappings)
Aaron Marcuse-Kubitza
05:47 PM Revision 6553: inputs/NVS/Project/map.csv: Remapped Project Abbreviation to projectName instead of Name, because Project Abbreviation is what's used throughout the tables to link to the project
Aaron Marcuse-Kubitza
05:34 PM Revision 6552: inputs/NVS/Plot/map.csv: Mapped Physiography
Aaron Marcuse-Kubitza
05:31 PM Revision 6551: inputs/NVS/Plot/map.csv: Mapped Area
Aaron Marcuse-Kubitza
05:29 PM Revision 6550: inputs/NVS/Plot/map.csv: Altitude: Provided rationale for units determination
Aaron Marcuse-Kubitza
05:25 PM Revision 6549: Added inputs/NVS/Organism/
Aaron Marcuse-Kubitza
05:06 PM Revision 6548: Added inputs/NVS/Project/
Aaron Marcuse-Kubitza
05:05 PM Revision 6547: mappings/VegCore-VegBIEN.csv: Mapped projectStartDate, projectEndDate
Aaron Marcuse-Kubitza
05:02 PM Revision 6546: mappings/VegCore.csv: Added projectStartDate, projectEndDate
Aaron Marcuse-Kubitza
05:02 PM Revision 6545: mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
Aaron Marcuse-Kubitza
04:37 PM Revision 6544: root Makefile: apt-get: Use --yes to allow unattended installations
Aaron Marcuse-Kubitza
03:58 PM Revision 6543: schemas/vegbien.sql: analytical_*: Renamed plotName to locationName to match the new VegCore term name
Aaron Marcuse-Kubitza
03:51 PM Revision 6542: mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
Aaron Marcuse-Kubitza
03:30 PM Revision 6541: mappings/VegCore.csv: Added subInstitutionCode
Aaron Marcuse-Kubitza
03:25 PM Revision 6540: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use project.sourceaccessioncode instead of locationevent.project_id for the projectID
Aaron Marcuse-Kubitza
03:21 PM Revision 6539: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's obsstartdate instead when the subevent does not provide it
Aaron Marcuse-Kubitza
03:19 PM Revision 6538: schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's project and method instead when the subevent does not provide them, because they are often attached to it instead
Aaron Marcuse-Kubitza
03:07 PM Revision 6537: schemas/vegbien.sql: analytical_stem_view: geolocation info: Fixed bug where need to use the parent location instead when provided, because lat/long and placenames are attached to it instead of the subplot's location
Aaron Marcuse-Kubitza
02:47 PM Revision 6536: backups/Makefile: %.md5: Fixed bug where md5sum does not have a -q option like md5
Aaron Marcuse-Kubitza
02:43 PM Revision 6535: backups/Makefile: %.md5: Fixed bug where need to use md5sum instead of md5 on Linux
Aaron Marcuse-Kubitza
02:39 PM Revision 6534: schemas/vegbien.sql: analytical_stem_view: Filter out non-current taxondeterminations (occurrences with no taxondetermination are preserved)
Aaron Marcuse-Kubitza
02:10 PM Revision 6533: schemas/vegbien.sql: Removed no longer needed darwin_core table. Use analytical_stem instead, which is now identical.
Aaron Marcuse-Kubitza
02:02 PM Revision 6532: schemas/vegbien.sql: sync_analytical_*_to_view(): Creating analytical_* table: Fixed bug where need LIMIT 0 so that it can be used on a full DB, which will have data in the tables used by analytical_stem_view
Aaron Marcuse-Kubitza
01:40 PM Revision 6531: schemas/vegbien.sql: Merged darwin_core into analytical_stem
Aaron Marcuse-Kubitza
01:21 PM Revision 6530: schemas/vegbien.sql: darwin_core_view, analytical_stem_view: Updated now that newWorldCountries.isoCode is a text field
Aaron Marcuse-Kubitza
12:35 PM Revision 6529: README.TXT: Data import: backups: Step to copy backups to jupiter: Added full path to aaronmk/ (/data/dev/aaronmk)
Aaron Marcuse-Kubitza
12:00 PM Revision 6528: inputs/newWorld/geoscrub.schema.~.changes.sql: Reversed order of adding unique constraints and changing types
Aaron Marcuse-Kubitza
11:57 AM Revision 6527: inputs/newWorld/geoscrub.schema.~.changes.sql: Changed isoCode type to text. Added unique constraint on isoCode.
Aaron Marcuse-Kubitza
11:06 AM Revision 6526: backups/Makefile: Added md5s target to generate .md5 files for all backups
Aaron Marcuse-Kubitza
11:05 AM Revision 6525: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza
10:48 AM Revision 6524: backups/Makefile: %.md5: Run with `nice -n +5` to avoid slowing down the UI
Aaron Marcuse-Kubitza
10:46 AM Revision 6523: backups/: svn:ignore: Added *.md5. Removed no longer applicable *.log.
Aaron Marcuse-Kubitza
10:42 AM Revision 6522: backups/Makefile: Changed paths to be relative to the Makefile rather than the current directory, so this Makefile can be used in other directories as well (such as jupiter:/aaronmk/VegBIEN.backups/)
Aaron Marcuse-Kubitza
10:34 AM Revision 6521: backups/Makefile: %.backup: Also create MD5 of backup
Aaron Marcuse-Kubitza
10:31 AM Revision 6520: backups/Makefile: Added %.md5 target to create checksums of each backup
Aaron Marcuse-Kubitza
10:17 AM Revision 6519: README.TXT: Data import: backups: Added step to copy backups to jupiter in /aaronmk/VegBIEN.backups/ . The jupiter folder, which has several TB of space available, will replace local backup drives as the location for archived backups.
Aaron Marcuse-Kubitza
10:00 AM Revision 6518: README.TXT: Data import: Removed additional backup of just the public schema, which is not needed because the public schema is included in the full DB backup. The additional public schema backup increased the total backup size by 60-70%, so this will help conserve limited disk space on vegbiendev as well as on local archives of the backups.
Aaron Marcuse-Kubitza
09:52 AM Revision 6517: README.TXT: Backups: Full DB: Updated steps to match Data import steps, which add the date to the backup filename when it's created rather than afterwards
Aaron Marcuse-Kubitza
09:42 AM Revision 6516: README.TXT: Backups: Archived imports: Back up: Added instructions for archiving the last import before backing it up
Aaron Marcuse-Kubitza
09:10 AM Revision 6515: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
09:08 AM Revision 6514: schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
Aaron Marcuse-Kubitza
09:07 AM Revision 6513: schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
Aaron Marcuse-Kubitza
09:00 AM Revision 6512: schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
Aaron Marcuse-Kubitza
08:55 AM Revision 6511: schemas/vegbien.sql: sync_analytical_*_to_view(): Added NOT NULL constraints
Aaron Marcuse-Kubitza

11/30/2012

05:20 PM Revision 6510: make_analytical_db: Added step to create darwin_core materialized view
Aaron Marcuse-Kubitza
05:09 PM Revision 6509: inputs/*/Source/map.csv for non-herbaria: Mapped sampleType
Aaron Marcuse-Kubitza
05:02 PM Revision 6508: inputs/.herbaria/herbaria/map.csv: Set sampleType to "specimen"
Aaron Marcuse-Kubitza
05:02 PM Revision 6507: mappings/VegCore-VegBIEN.csv: Mapped sampleType
Aaron Marcuse-Kubitza
05:00 PM Revision 6506: mappings/VegCore.csv: Added sampleType
Aaron Marcuse-Kubitza
04:57 PM Revision 6505: schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
Aaron Marcuse-Kubitza
04:55 PM Revision 6504: schemas/vegbien.sql: Added sampletype enum
Aaron Marcuse-Kubitza
04:46 PM Revision 6503: root Makefile: $(postgresReload-*): Confirm the operation before continuing, since it involves changing PostgreSQL config files in nontrivial ways. Added instructions for setting kernel.shmmax to at least 4GB minus 1 byte on Linux, to work with the shared_buffers setting in postgresql.conf.
Aaron Marcuse-Kubitza
04:03 PM Revision 6502: schemas/postgresql.conf: shared_buffers: Documented that it must be less than ~95% of SHMMAX
Aaron Marcuse-Kubitza
03:58 PM Revision 6501: schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where need to use party.fullname instead of name components because the name is now mapped to fullname
Aaron Marcuse-Kubitza
03:28 PM Revision 6500: schemas/vegbien.sql: analytical_stem_view, darwin_core_view: dateCollected: Use the parent plot event's obsstartdate when the subplot event does not have its own obsstartdate
Aaron Marcuse-Kubitza
01:56 PM Revision 6499: schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date or non-current taxondeterminations
Aaron Marcuse-Kubitza
01:54 PM Revision 6498: schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date
Aaron Marcuse-Kubitza
01:28 PM Revision 6497: schemas/vegbien.sql: Added darwin_core_view
Aaron Marcuse-Kubitza
12:56 PM Revision 6496: schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where need to use party.fullname instead of name components because the name is now mapped to fullname
Aaron Marcuse-Kubitza
12:40 PM Revision 6495: schemas/vegbien.sql: sync_analytical_*_to_view(): Added CREATE INDEX statements
Aaron Marcuse-Kubitza
12:31 PM Revision 6494: README.TXT: Data import: Added steps to publish analytical DB on nimoy.bien_web
Aaron Marcuse-Kubitza
10:46 AM Revision 6493: schemas/vegbien.sql: analytical_stem_view: Changed JOINs to LEFT JOINs to include occurrences without taxondeterminations
Aaron Marcuse-Kubitza
10:21 AM Revision 6492: export_analytical_db: Use 'NULL' as the NULL value instead of \N, because MySQL has problems with \N
Aaron Marcuse-Kubitza
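
A minimal sketch of writing a literal NULL marker into CSV output, in the spirit of revision 6492. The function name and sample rows are hypothetical, and export_analytical_db's actual mechanism is not shown in this log. How the loader on the MySQL side interprets the marker is outside the sketch.

```python
import csv
import io

NULL_MARKER = "NULL"  # marker named in the entry above

def export_rows(rows, out):
    """Write rows as CSV, replacing None with the NULL marker."""
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow([NULL_MARKER if v is None else v for v in row])

buf = io.StringIO()
export_rows([("Quercus alba", None), (None, 12.5)], buf)
```

One trade-off of a textual marker is that a genuine string value of "NULL" cannot be told apart from a missing value.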
09:57 AM Revision 6491: publish_analytical_db: Load to bien3_adb instead of bien_web
Aaron Marcuse-Kubitza

11/29/2012

05:41 PM Revision 6490: README.TXT: Data import: Added step to export analytical DB
Aaron Marcuse-Kubitza
01:11 PM Revision 6489: root Makefile: $(postgres-Linux): Fixed bug where need $(asAdmin) before commands to rename existing *.conf
Aaron Marcuse-Kubitza
01:01 PM Revision 6488: root Makefile: $(postgres-Linux): Also install postgresql-contrib, which contains the hstore extension
Aaron Marcuse-Kubitza

11/28/2012

06:18 PM Revision 6487: Added inputs/NVS/
Aaron Marcuse-Kubitza
06:04 PM Revision 6486: inputs/CVS/Organism/map.csv: Mapped accordingTo to "Weakley 2006"
Aaron Marcuse-Kubitza
06:02 PM Revision 6485: inputs/NY/Specimen/map.csv: Omit UniqueNYInternalRecordNumber to avoid confusion since this is an internal-only ID. This makes InstitutionCode+CollectionCode+CatalogNumber the globally unique identifier instead.
Aaron Marcuse-Kubitza
06:00 PM Revision 6484: README.TXT: Added Datasource refreshing section with instructions for refreshing VegBank
Aaron Marcuse-Kubitza
05:57 PM Revision 6483: schemas/vegbien.sql: Renamed taxonconcept.concept_source_id back to concept_reference_id
Aaron Marcuse-Kubitza
05:52 PM Revision 6482: schemas/vegbien.sql: Renamed soilobs to soilsample per working group discussion
Aaron Marcuse-Kubitza
05:27 PM Revision 6481: input.Makefile: SVN: add: verify: Fixed bug where need to use $ prefix before string to parse newline
Aaron Marcuse-Kubitza
05:27 PM Revision 6480: input.Makefile: SVN: add: verify: Fixed bug where need to use $ prefix before string to parse newline
Aaron Marcuse-Kubitza
05:25 PM Revision 6479: inputs/NY/verify/: svn:ignore .csv files
Aaron Marcuse-Kubitza
05:25 PM Revision 6478: input.Makefile: SVN: add: Also svn:ignore .csv files
Aaron Marcuse-Kubitza
02:47 PM Revision 6477: export_analytical_db: Export NULL as \N to work with MySQL
Aaron Marcuse-Kubitza
01:22 PM Revision 6476: schemas/vegbien.sql: analytical_*: Added index on NOT NULL columns, starting with institutionCode
Aaron Marcuse-Kubitza
01:19 PM Revision 6475: schemas/vegbien.sql: analytical_*: Removed primary keys and NOT NULL constraints on columns that sometimes have NULL values
Aaron Marcuse-Kubitza
01:08 PM Revision 6474: publish_analytical_db: Added CSV dialect information
Aaron Marcuse-Kubitza
12:42 PM Revision 6473: root Makefile: PostgreSQL: $(postgresReload-*): Rename existing *.conf to *.conf.old
Aaron Marcuse-Kubitza

11/27/2012

06:44 PM Revision 6472: publish_analytical_db: Use LOAD DATA *LOCAL* INFILE instead of LOAD DATA INFILE to avoid needing FILE permissions on bien_web
Aaron Marcuse-Kubitza
01:17 PM Revision 6471: Added publish_analytical_db
Aaron Marcuse-Kubitza
12:43 PM Revision 6470: export_analytical_db: Append the public schema version to the CSV filename
Aaron Marcuse-Kubitza
12:27 PM Revision 6469: backups/Makefile: $(rsyncBackups): Added *.csv
Aaron Marcuse-Kubitza

11/26/2012

06:12 PM Revision 6468: Added export_analytical_db
Aaron Marcuse-Kubitza
06:10 PM Revision 6467: backups/: Ignore _* and *.csv
Aaron Marcuse-Kubitza
01:35 PM Revision 6466: make_analytical_db: mk_analytical_table(): Use explicit schema references everywhere. This fixes a bug where the TRUNCATE/INSERT steps on the public schema's table would reference the analytical_db view instead because they were not schema-scoped.
Aaron Marcuse-Kubitza
01:33 PM Revision 6465: make_analytical_db: mk_analytical_table(): Factored table references in different schemas out into vars
Aaron Marcuse-Kubitza

11/25/2012

09:31 PM Revision 6464: schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
Aaron Marcuse-Kubitza
09:13 PM Revision 6463: Regenerated vegbien.ERD exports
Aaron Marcuse-Kubitza
08:52 PM Revision 6462: make_analytical_db: Moved set -x () around just psql_verbose_vegbien so embedded $() expressions wouldn't also be in set -x (verbose) mode
Aaron Marcuse-Kubitza
08:49 PM Revision 6461: make_analytical_db: Fixed bug where need to use bash instead of sh because vegbien_dest requires it
Aaron Marcuse-Kubitza
08:37 PM Revision 6460: make_analytical_db: Factored analytical_* table creation code out into mk_analytical_table() function
Aaron Marcuse-Kubitza
08:28 PM Revision 6459: make_analytical_db: Create analytical_db views pointing to the analytical_* versions in the public schema
Aaron Marcuse-Kubitza
08:21 PM Revision 6458: vegbien_dest: $schemas: Removed analytical_db because views that will be added to it were shadowing public schema tables with the same names during population of those tables in make_analytical_db
Aaron Marcuse-Kubitza
07:47 PM Revision 6457: vegbien_dest: Export $public, to make sure it's available to any invoked scripts as an env var
Aaron Marcuse-Kubitza
07:45 PM Revision 6456: vegbien_dest: $schemas: Added analytical_db
Aaron Marcuse-Kubitza
07:38 PM Revision 6455: inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so earlier imports had been silently truncated. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.
Aaron Marcuse-Kubitza
07:20 PM Revision 6454: bin/map: in_is_db: by_col: Clearing errors table: Skip this if the table has been set to None because it didn't exist (and thus was a metadata-only map spreadsheet)
Aaron Marcuse-Kubitza
06:54 PM Revision 6453: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the specific_epithet from the accepted_taxonverbatim rather than the parsed_taxonverbatim
Aaron Marcuse-Kubitza
06:45 PM Revision 6452: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Include the family any time the genus is not specified, instead of just when accepted_taxonlabel.rank = 'family'. These should have the same effect since TNRS includes the rank, but using COALESCE() is clearer.
Aaron Marcuse-Kubitza
06:41 PM Revision 6451: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to also include morphospecies when just the family is specified
Aaron Marcuse-Kubitza
06:35 PM Revision 6450: schemas/vegbien.sql: analytical_stem_view: Fixed bug where location.authorlocationcode needed to be used as the plotName when location.sourceaccessioncode was not provided, to ensure that plotName would be NOT NULL
Aaron Marcuse-Kubitza
06:20 PM Revision 6449: inputs/FIA/import_order.txt: Fixed bug where FIA_COND_unique needed to be explicitly included in import_order.txt now that we're using import_order.txt to import the Source metadata table before the data tables
Aaron Marcuse-Kubitza
06:15 PM Revision 6448: inputs/import.stats.xls: Updated import times
Aaron Marcuse-Kubitza

11/24/2012

03:07 PM Revision 6447: root Makefile: PostgreSQL: $(postgresReload-Linux): Try chmoding both as your user and as the bien user
Aaron Marcuse-Kubitza
02:46 PM Revision 6446: input.Makefile: Testing: $(runTest): Ignore failed diffs when the test is compared to another test's output (e.g. in by_col mode)
Aaron Marcuse-Kubitza
02:41 PM Revision 6445: bin/map: in_is_db: If table does not exist, set table to None so that db_xml.put_table() doesn't try to access it. This fixes a bug in metadata-only map spreadsheets under column-based import.
Aaron Marcuse-Kubitza
02:40 PM Revision 6444: db_xml.py: put_table(): Support None in_table by calling put() directly
Aaron Marcuse-Kubitza
02:29 PM Revision 6443: Removed no longer used geoscrub.*.sql. Use geoscrub_output instead.
Aaron Marcuse-Kubitza
02:27 PM Revision 6442: Removed no longer used geoscrub_cleaned_unique. Use geoscrub_output instead.
Aaron Marcuse-Kubitza
02:25 PM Revision 6441: Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
Aaron Marcuse-Kubitza
02:25 PM Revision 6440: Removed no longer used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
Aaron Marcuse-Kubitza
02:23 PM Revision 6439: schemas/vegbien.sql: analytical_stem_view: cultivated: Removed BIEN2's geoscrub_cultivated, which has now been replaced by the primary corresponding scripts (and never had particularly many matches to the locations in any case)
Aaron Marcuse-Kubitza
02:14 PM Revision 6438: schemas/vegbien.sql: analytical_stem_view: cultivated: Use OR instead of _or() to combine cultivated_family_locations.country IS NOT NULL with the other values, because this field's false value should not be used in place of NULL if all the other values are NULL, as it would be with _or(). (cultivated_family_locations.country IS NOT NULL can indicate presence, but not absence, of cultivated status.)
Aaron Marcuse-Kubitza
02:06 PM Revision 6437: schemas/functions.sql, vegbien.sql: _and(), _or(): Added comment comparing the function and the corresponding logical operator
Aaron Marcuse-Kubitza
01:50 PM Revision 6436: schemas/vegbien.sql: public: Added _or(), for use by analytical_stem_view
Aaron Marcuse-Kubitza
01:48 PM Revision 6435: schemas/vegbien.sql: analytical_stem_view: cultivated: Also set if family/country combination found in cultivated_family_locations
Aaron Marcuse-Kubitza
01:39 PM Revision 6434: schemas/vegbien.sql: cultivated_family_locations: Added data from nimoy:/home/boyle/bien2/geoscrub/cultivated/cult_by_taxon/flag_by_taxa.inc
Aaron Marcuse-Kubitza
01:33 PM Revision 6433: schemas/vegbien.sql: Added cultivated_family_locations to store locations where various taxon families are considered cultivated
Aaron Marcuse-Kubitza
01:24 PM Revision 6432: mappings/VegCore-VegBIEN.csv: Mapped locality description fields to location.iscultivated using _locationnarrative_is_cultivated()
Aaron Marcuse-Kubitza
01:23 PM Revision 6431: xml_func.py: Simplifying functions: Added passthru entries for _and, _or
Aaron Marcuse-Kubitza
01:06 PM Revision 6430: schemas/vegbien.sql: Added _locationnarrative_is_cultivated()
Aaron Marcuse-Kubitza
12:57 PM Revision 6429: lib/PostgreSQL-MySQL.csv: Change text to varchar(255) because text columns can't be used in indexes in MySQL
Aaron Marcuse-Kubitza
12:51 PM Revision 6428: lib/PostgreSQL-MySQL.csv: Resaved in Excel, which removed unnecessary quotes around fields
Aaron Marcuse-Kubitza
12:22 PM Revision 6427: schemas/vegbien.sql: analytical_aggregate: Added identifiedBy, which is no longer a scoping field (which would prevent scientificNameWithMorphospecies from being unique) now that there is only one taxondetermination for each taxonoccurrence
Aaron Marcuse-Kubitza
12:05 PM Revision 6426: schemas/vegbien.sql: analytical_stem_view: dateCollected: For plots data, use the locationevent obsstartdate instead of the collectiondate in order to group taxonoccurrences/stems from the same locationevent together
Aaron Marcuse-Kubitza
11:59 AM Revision 6425: schemas/vegbien.sql: analytical_* pkeys: Added dateCollected because the records are actually unique within the location*event*, not the location
Aaron Marcuse-Kubitza
11:57 AM Revision 6424: schemas/vegbien.sql: analytical_stem_view: Exclude records with no collectiondate or obsstartdate, which is required to uniquely identify a record
Aaron Marcuse-Kubitza
11:54 AM Revision 6423: analytical_stem_view: dateCollected: Use locationevent.obsstartdate when aggregateoccurrence.collectiondate is not provided
Aaron Marcuse-Kubitza
11:37 AM Revision 6422: schemas/vegbien.sql: analytical_stem_view: Include only the current taxondetermination for each taxonoccurrence, to avoid cross-joining taxondeterminations with stems and thus multiplying the number of rows for datasources that have multiple taxondeterminations per taxonoccurrence
Aaron Marcuse-Kubitza
11:33 AM Revision 6421: schemas/vegbien.sql: taxondetermination: Added AFTER trigger to set the current taxondetermination for the taxonoccurrence
Aaron Marcuse-Kubitza
11:11 AM Revision 6420: lib/PostgreSQL-MySQL.csv: Statements ending in ";": When matching any character, use .*? (with the (?s) flag) instead of [^;]* in order to allow embedded ; to be matched. This fixes a bug where a CREATE VIEW statement was not removed because it contained an embedded ; .
Aaron Marcuse-Kubitza
11:06 AM Revision 6419: schemas/vegbien.sql: taxondetermination: Added unique index to ensure that there is only one current determination for each taxonoccurrence
Aaron Marcuse-Kubitza
11:05 AM Revision 6418: lib/PostgreSQL-MySQL.csv: Remove indexes with WHERE clauses
Aaron Marcuse-Kubitza
10:34 AM Revision 6417: schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies, recordNumber. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on most of the tables which provide them, and LEFT JOINed tables have their identifying fields combined to create a NOT NULL value.
Aaron Marcuse-Kubitza
10:27 AM Revision 6416: schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
Aaron Marcuse-Kubitza
10:23 AM Revision 6415: lib/PostgreSQL-MySQL.csv: Only match a statement-terminating ; when it's at the end of a line
Aaron Marcuse-Kubitza
10:02 AM Revision 6414: schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on the tables which provide them.
Aaron Marcuse-Kubitza
09:21 AM Revision 6413: db_xml.py: put(): _setDefault(): Delay the evaluation of each col_default's value until the col_default is actually retrieved. This fixes a bug in the source table mappings where the explicit source entry was being created *after* the col_default source entry, causing the initial entry, which did not have the additional fields populated, to be used instead.
Aaron Marcuse-Kubitza
09:14 AM Revision 6412: dicts.py: Added WrapDict, a dict that runs a function on each value retrieved
Aaron Marcuse-Kubitza
08:59 AM Revision 6411: db_xml.py: put(): _setDefault(): Fixed bug where need to copy col_defaults before calling update() on it, to avoid modifying the input value (which may be reused by the caller, expecting it to be unmodified)
Aaron Marcuse-Kubitza
08:54 AM Revision 6410: db_xml.py: put(): col_defaults param: Fixed bug where need to use None as default value, because col_defaults will be modified by put() and the {} default value is a global instance
Aaron Marcuse-Kubitza
08:29 AM Revision 6409: mappings/VegCore-VegBIEN.csv: source table mappings: Set shortname to env var $source when it's not explicitly specified, because shortname is a required field of source
Aaron Marcuse-Kubitza
08:16 AM Revision 6408: db_xml.py: put(): Pass through the values of nodes which are text nodes
Aaron Marcuse-Kubitza
08:15 AM Revision 6407: db_xml.py: put(): put_(): Support _setDefault() values which are text nodes, by passing text strings through when put_() is run on all col_defaults entries
Aaron Marcuse-Kubitza
07:50 AM Revision 6406: db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
Aaron Marcuse-Kubitza
07:47 AM Revision 6405: dicts.py: DictProxy: Implemented __delitem__()
Aaron Marcuse-Kubitza
07:32 AM Revision 6404: bin/map: update_in_label(): Removed hardcoded source_id col_default, which is now set in mappings/VegCore-VegBIEN.csv's output root
Aaron Marcuse-Kubitza
07:29 AM Revision 6403: mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
Aaron Marcuse-Kubitza
07:25 AM Revision 6402: db_xml.py: put(): Added _setDefault() built-in function, which adds an entry to col_defaults
Aaron Marcuse-Kubitza
07:23 AM Revision 6401: xml_func.py: _env(): Fixed bug where need to retrieve actual string value of name param using xml_dom.NodeTextEntryIter instead of NodeEntryIter
Aaron Marcuse-Kubitza
07:20 AM Revision 6400: xml_func.py: _env(): Fixed bug where need to use xml_dom.replace_with_text() instead of xml_dom.replace() because replace() requires a DOM node
Aaron Marcuse-Kubitza
06:44 AM Revision 6399: bin/map: update_in_label(): Set $source env var to the in_label (datasource name), to make it available to _env()
Aaron Marcuse-Kubitza
06:43 AM Revision 6398: xml_func.py: Simplifying functions: Added _env()
Aaron Marcuse-Kubitza
06:05 AM Revision 6397: Added inputs/VegBank/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
06:00 AM Revision 6396: Added inputs/SpeciesLink/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:55 AM Revision 6395: Added inputs/SALVIAS*/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:47 AM Revision 6394: Added inputs/REMIB/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:41 AM Revision 6393: Added inputs/GBIF/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:34 AM Revision 6392: Added inputs/TEAM/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:33 AM Revision 6391: Placed inputs/TEAM/_src/Vegetation-Tree-and-Liana-Metadata-1.5.pdf under version control
Aaron Marcuse-Kubitza
05:27 AM Revision 6390: inputs/FIA/import_order.txt: Added Source, which needs to come before Organism
Aaron Marcuse-Kubitza
05:22 AM Revision 6389: Added inputs/Madidi/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:19 AM Revision 6388: Added inputs/FIA/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:14 AM Revision 6387: Added inputs/CVS/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:07 AM Revision 6386: Added inputs/CTFS/Source/, containing referenceType metadata
Aaron Marcuse-Kubitza
05:05 AM Revision 6385: bin/map: Support map spreadsheets containing only metadata mappings (with no corresponding staging table), by falling back to an empty table when the named table does not exist
Aaron Marcuse-Kubitza
04:19 AM Revision 6384: mappings/VegCore-VegBIEN.csv: institutionCode: Also map to the sourcename's matched source, which identifies whether the source is a herbarium
Aaron Marcuse-Kubitza
04:08 AM Revision 6383: schemas/vegbien.sql: source: Made shortname NOT NULL to ensure that all datasources have a globally-unique short name
Aaron Marcuse-Kubitza
03:33 AM Revision 6382: import_all: Added import of inputs/.herbaria/ before the main import
Aaron Marcuse-Kubitza
03:28 AM Revision 6381: Added inputs/.herbaria/
Aaron Marcuse-Kubitza
03:25 AM Revision 6380: input.Makefile: SVN: add: Also run %/add on all data subdirs
Aaron Marcuse-Kubitza
03:21 AM Revision 6379: input.Makefile: Existing maps discovery: Moved tables discovery to its own section, above SVN so it can be used by SVN
Aaron Marcuse-Kubitza
03:11 AM Revision 6378: mappings/VegCore.csv: referenceType: Fixed sort order
Aaron Marcuse-Kubitza
03:09 AM Revision 6377: mappings/VegCore-VegBIEN.csv: Mapped referenceType
Aaron Marcuse-Kubitza
03:06 AM Revision 6376: mappings/VegCore.csv: Added referenceType
Aaron Marcuse-Kubitza
02:10 AM Revision 6375: mappings/VegCore-VegBIEN.csv: institutionCode: Remap to source.shortname when specimen information is not provided, as is the case for geoscrub.herbaria on nimoy
Aaron Marcuse-Kubitza
01:47 AM Revision 6374: inputs/bien_web/observation/map.csv: Mapped observationID->occurrenceID
Aaron Marcuse-Kubitza
01:20 AM Revision 6373: README.TXT: Datasource setup: Add input data for each table present in the datasource: Added step to run `make inputs/<datasrc>/<table>/install` if the table is in a .sql export
Aaron Marcuse-Kubitza
01:17 AM Revision 6372: README.TXT: Datasource setup: MySQL inputs: Added step to install the export, which needs to happen before mapping individual tables
Aaron Marcuse-Kubitza
01:13 AM Revision 6371: README.TXT: Datasource setup: Add input data for each table present in the datasource: Replaced "CSV" with "CSV(s)" because there can be multiple CSV part files for one table
Aaron Marcuse-Kubitza
01:11 AM Revision 6370: README.TXT: Datasource setup: Add input data for each table present in the datasource: Don't add a CSV or create.sql file for tables that are in a .sql export
Aaron Marcuse-Kubitza
01:06 AM Revision 6369: README.TXT: Schema changes: Sync ERD with vegbien.sql schema: Changed instructions to just select tables with arrows next to them rather than all tables, because each table that's updated will have its lines reset and the number of lines that need to be fixed should be minimized
Aaron Marcuse-Kubitza
01:02 AM Revision 6368: README.TXT: Datasource setup: Accept the test cases: `make inputs/<datasrc>/test by_col=1`: Clarified that errors could indicate bugs in the *VegBIEN* unique constraints
Aaron Marcuse-Kubitza
12:59 AM Revision 6367: README.TXT: Data import: To remake analytical DB: Added explicit public schema setting since the analytical DB is often manually remade *after* the public schema has been renamed. Removed warnings that certain commands must be run after running make_analytical_db, because the "remake analytical DB" instructions no longer require this.
Aaron Marcuse-Kubitza
12:48 AM Revision 6366: README.TXT: Datasource setup: MySQL inputs: Added steps to export the database to a PostgreSQL-compatible .sql file, which can be directly used by the install process without the need to export each table as CSV
Aaron Marcuse-Kubitza
12:36 AM Revision 6365: README.TXT: Datasource setup: Choosing a table name: Documented that for .sql exports, you must use the name of the table in the DB export, not a suggested or custom name
Aaron Marcuse-Kubitza
12:34 AM Revision 6364: input.Makefile: Staging tables installation: $(dbExports): Also include the files that would be generated by running _MySQL/*.make and creating the corresponding PostgreSQL translations
Aaron Marcuse-Kubitza
12:18 AM Revision 6363: input.Makefile: Staging tables installation: Moved .sql export downloading and translation to separate Input data retrieval section
Aaron Marcuse-Kubitza
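
Revisions 6410 and 6411 above describe two related Python pitfalls: a `{}` default argument that is shared between calls, and an `update()` that modifies the caller's dict. The sketch below illustrates both under stated assumptions. `put_buggy()` and `put_fixed()` are hypothetical stand-ins written for this example and are not the actual `put()` in db_xml.py.

```python
def put_buggy(row, col_defaults={}):
    # Hypothetical stand-in, not db_xml.put(). The {} default is created
    # once, when the function is defined, so every call that omits
    # col_defaults shares (and mutates) the same dict.
    col_defaults.update(row)
    return col_defaults


def put_fixed(row, col_defaults=None):
    # Using None as the default gives each call its own fresh dict.
    if col_defaults is None:
        col_defaults = {}
    # Copy before update() so the caller's dict is left unmodified.
    col_defaults = dict(col_defaults)
    col_defaults.update(row)
    return col_defaults


# Buggy version: state from the first call leaks into the second.
first = put_buggy({"source_id": 1})
second = put_buggy({"shortname": "x"})
leaked = "source_id" in second

# Fixed version: the caller's dict can be reused afterwards.
caller_defaults = {"source_id": 1}
result = put_fixed({"shortname": "x"}, caller_defaults)
```

Here `leaked` ends up true because both buggy calls return the same shared dict, while `caller_defaults` is unchanged after `put_fixed()` returns.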