xml_func.py: Added is_scalar()
xml_func.py: process(): row-based mode: preserving complex funcs: Fixed bug where functions with no params would crash reduce() because it requires at least one value when no initial value is specified
Added scalar.py
Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data
inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
inputs/VegBank/stemlocation/map.csv: Also mapped stemlocation_id to individualID to create one plantobservation for each stemobservation
inputs/VegBank/stemlocation/map.csv: Remapped stemcount_id to aggregateOccurrenceID to match stemcount_id's mapping in stemcount_
inputs/VegBank/: Joined together taxonimportance and stemcount tables to create stemcount_, because stemcount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename)
Added inputs/VegBank/_archive
input.Makefile: Testing: Added `%/test: %/test.xml` to allow testing just a subdir
input.Makefile: General targets: Added `%/: %/map.csv` to allow remaking just a subdirectory
inputs/CVS/: Refreshed data with new export from Bob
inputs/CVS/cvs-archive-2012-12-04.schema.sql: Fixed types using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>
bin/map: Removed column names simplification, which was causing columns with the same alphanumeric characters but different punctuation to be simplified to the same name. Name simplification is now performed by the mapping mechanism itself, and can be overridden in the mappings.
Regenerated inputs/VegBank/new_terms.csv
Added inputs/NCU/_src/NCU_specimens_public_2012-12-10.zip.url
inputs/NCU/: Refreshed data with new export from Bob
Added inputs/NCU-NCSC/_archive
input.Makefile: SVN: add: Also add _archive/ subdir
publish_analytical_db: Time the import of the data
export_analytical_db: Also create a .md5 for the export
export_analytical_db: Run commands in the root svn dir
mappings/VegCore.csv: soil composition terms: Removed ppm units from the definition, since units are actually fraction or percent
README.TXT: Data import: Moved On local machine steps after On nimoy steps, because the On nimoy steps are more important
mappings/VegCore.csv: Comments: Added quotes around quotations from other sources
mappings/VegCore.csv: Definitions: Added quotes around quotations from other sources
Added backups/fix_perms
backups/Makefile: Synchronization: %/download: Also download any .md5 file for the file
README.TXT: Data import: On nimoy: Added instructions to verify the export's MD5 sum
README.TXT: Data import: On nimoy: Replaced step to manually upload the analytical_aggregate export with the command to download it from jupiter
README.TXT: Data import: On nimoy: Removed step to rename any existing analytical_aggregate table, since the import is now done directly into the versioned table
mappings/VegCore.csv: VegX terms without definitions in VegX: Added definitions from non-VegX sources, etc.
README.TXT: Data import: Added instructions to verify the backups' MD5 sums on jupiter
README.TXT: Data import: Removed step to copy backups to jupiter, because this is now done by `make backups/upload`
schemas/vegbien.sql: sync_*_to_view(): Also add `GRANT SELECT TO bien_read` on the view used to generate the table, in case the permission was lost when the view was modified
schemas/vegbien.sql: sync_*_to_view(): Added `GRANT SELECT TO bien_read`
schemas/vegbien.sql: analytical_*: Added back bien_read's SELECT permissions, which had gotten removed when the tables were re-synced to their views
schemas/vegbien.my.sql: Regenerated with expanded repl word matching
repl: :-prefixing of words to form vars: Fixed bug where : must be matched as a lookbehind assertion, not a capturing group, because the provided regexp itself or its replacement may reference capturing groups, which it expects to be numbered starting with 1
inputs/import.stats.xls: Updated import times
Regenerated inputs/NY/Specimen/new_terms.csv
inputs/JBM/Specimen/test.xml.ref: Updated inserted row count, which had gotten changed when a test was run on a non-empty database
mappings/VegCore.csv: height_ft: Added source to VegBank:stemHeight, which includes a description of the term
mappings/VegCore.csv: height_m: Added source to VegBank:stemHeight, which includes a description of the term
mappings/VegCore.csv: projectName: Added definition from VegX schema
mappings/VegCore.csv: project*Date: Re-sourced to VegBank:project.*Date, since VegX does not have an equivalent term
mappings/VegCore.csv: VegX terms: Added definitions from VegX schema, where provided
mappings/VegCore.csv: projectName: Added source to VegX:project.title
mappings/Makefile: .VegCore.csv.last_cleanup, .Veg+-VegCore.csv.last_cleanup: Also replace Veg+ terms in sources list, which are references to VegCore terms that have since been renamed
repl: text mode: Also match "vars" with the term prefixed by ":". Consider .- to be word characters. Only match a word when preceded by whitespace or CSV field start characters.
repl: column mode: Removed parsing and checking of column name, which prevents using repl for general-purpose regexp/word replacement
mappings/VegCore.csv: Definition: Moved closed list values to new Values column
mappings/VegCore.csv: Added Values column to store closed list values
mappings/VegCore.csv: geovalidation terms: Removed source to DwC:georeferenceVerificationStatus, because that is for georeferencing, not geovalidation
mappings/VegCore.csv, Veg+-VegCore.csv: obs*Date: Re-sourced to VegX:obs*Date
mappings/VegCore.csv: projectID: Re-sourced to plotObservation.projectID
dict2redmine: RedmineTableWriter: Fixed bug where embedded | characters need to be escaped, using new redmine_table_esc()
dict2redmine: Added redmine_table_esc()
dict2redmine: Added redmine_esc()
mappings/VegCore.csv: TCS terms: Added TCS comments from <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#TCS>
dict2redmine: redmine_add_links(): Include the [] in the link text, to avoid the need for redmine_pad(), etc.
dict2redmine: redmine_add_links(): Make the link bold so it stands out as a link
dict2redmine: redmine_add_links(): Use new redmine_pad()
dict2redmine: Added redmine_pad()
dict2redmine: redmine_add_links(): Use redmine_url() to create the internal link
dict2redmine: redmine_url(): Support internal links
dict2redmine: redmine_add_links(): Fixed bug where need to explicitly specify the source name as the link text
dict2redmine: RedmineDictWriter: Link citations to entry in sources list
mappings/VegCore.csv: Restored name of latLongDomainValid term, which had gotten replaced with coordinatePrecision
mappings/VegCore.csv: startDate, endDate: Changed comment to "a date range usually applies to the event"
mappings/VegCore.csv: Added Examples column to store data in TCS Examples column at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#TCS>
mappings/VegCore.csv: non-phylogenetic taxonomic terms: Added definitions from TCS schema
mappings/VegCore.csv: *forma, *variety: Fixed sources, which had been swapped between the two sets of terms
mappings/VegCore.csv: Special values: Moved comments to Comments column
dict2redmine: Fixed bug where all header fields need to be preserved because columns are now filtered out instead of removed in each row
dict2redmine: Put the definition before and outside of the fields table
mappings/VegCore.csv: Moved Definition values that are actually comments into separate Comments column
dict2redmine: RedmineDictWriter: Omit empty columns from the fields table
dict2redmine: Generate an outline instead of a table so each term will be indexed in the page's table of contents
schemas/vegbien.sql: coordinates: coordinates_unique: Removed md5() around verbatimcoordinates because functions within unique indexes (other than the standard COALESCE) are not yet supported by the import algorithm
exc.py: e_msg(): Emit a warning instead of an AssertionError if e.args[0] isn't a string, to assist in debugging malformed exceptions
mappings/VegCore.csv: sampleType: Re-sourced to bien_web.observationType
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the taxonomicname in accepted_taxonlabel instead of accepted_taxonverbatim, because taxonverbatim only contains fields provided by the data provider (in this case, TNRS), but TNRS does not provide the taxonomic name (taxon name+author), only the taxon name and author components separately
schemas/vegbien.sql: coordinates: coordinates_unique: Use md5() on verbatimcoordinates so that it doesn't cause the index row size to be exceeded. This should fix a bug in the HIBG import where long verbatimcoordinates values were causing the error 'OperationalError: index row size 2784 exceeds maximum 2712 for index "coordinates_unique"'.
backups/Makefile: Synchronization: Replaced download target, which downloads all backups, with %/download, which downloads just a specific backup, because you would generally only want to extract a single backup from the archive for reinstallation
backups/Makefile: Synchronization: Sync with jupiter instead of vegbiendev. This requires running `make backups/upload` on vegbiendev to archive the files, instead of `make backups/download` to download them to your local machine.
inputs/.geoscrub/geoscrub_output/map.csv: Removed no longer accurate comment that county is not yet used by VegBIEN
inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 2 ("Point is <=5km from putative GADM polygon, but still outside it") to true instead of false, because 5km is close enough to the polygon that the mismatch could result from shapefile simplification, boundary changes, or other factors that don't affect geovalidity
inputs/.geoscrub/geoscrub_output/map.csv: *validity: Remapped 0 ("Complete name provided, but couldn't be scrubbed to GADM") to NULL instead of false, because the absence of a name match does not mean the coordinates are invalid
inputs/.{NCBI,TNRS}/import_order.txt: Added Source
input.Makefile: SVN: add: Add a Source table to store datasource metadata. This adds a Source table to all herbaria which are listed in .herbaria, and therefore didn't previously need a Source table to indicate their referenceType and sampleType.
inputs/input.Makefile: SVN: add: verify/: Added *.xls to svn:ignore
inputs/.geoscrub/geoscrub_output/postprocess.sql: Added index on decimallatitude, decimallongitude
Added inputs/.geoscrub/geoscrub_output/postprocess.sql, which adds NOT NULL constraints on decimallatitude, decimallongitude
schemas/vegbien.sql: analytical_*: Changed type of boolean columns to integer so that they will be exported as 1/0 instead of t/f by export_analytical_db. This will enable MySQL's LOAD DATA INFILE to import the values correctly.
backups/Makefile: Checksums: %.md5/test: Only use md5sum's -v option on Mac, because it's not supported on Linux (there, verbose mode is the default)
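
Two of the fixes logged above (the xml_func.py reduce() crash and the repl lookbehind change) rest on standard Python behavior that is easy to reproduce. The sketch below is illustrative only and is not the code from xml_func.py or repl: the helper names `combine`, `wrap_with_group`, and `wrap_with_lookbehind`, and the sample pattern and replacement, are all hypothetical.

```python
import operator
import re
from functools import reduce


def combine(values):
    """Sum values with reduce(); the explicit initial value 0 makes an
    empty list safe (hypothetical helper, not the xml_func.py code)."""
    return reduce(operator.add, values, 0)


def wrap_with_group(user_pattern):
    """Buggy approach: match the ':' prefix with a capturing group.
    This shifts the numbering of the caller's own groups by one."""
    return r'(:)' + user_pattern


def wrap_with_lookbehind(user_pattern):
    """Fixed approach: assert the ':' prefix with a lookbehind, which
    consumes nothing and creates no group, so the caller's groups still
    start at 1."""
    return r'(?<=:)' + user_pattern


# A caller-supplied pattern and replacement that reference group 1.
user_pattern = r'(\w+)_id'
user_repl = r'\1ID'
text = ':plot_id'

# With the capturing group, \1 now refers to ':' rather than 'plot'.
buggy = re.sub(wrap_with_group(user_pattern), user_repl, text)

# With the lookbehind, \1 still refers to the caller's own group.
fixed = re.sub(wrap_with_lookbehind(user_pattern), user_repl, text)

print(buggy)  # :ID
print(fixed)  # :plotID

# reduce() with no initial value raises TypeError on an empty sequence.
try:
    reduce(operator.add, [])
except TypeError as e:
    print('TypeError:', e)

print(combine([]))         # 0
print(combine([1, 2, 3]))  # 6
```

Passing an initial value to reduce() is one way to avoid the crash; whether xml_func.py does this or special-cases the empty parameter list is not stated in the log.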