/trunk/inputs/CTFS/Subplot - Changes - BIEN 3 - NCEAS Projects

root/trunk/inputs/CTFS/Subplot @ 12519

svn:ignore: *

Revision	Date	Author	Comment
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. Instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11396	10/21/2013 07:14 PM	Aaron Marcuse-Kubitza	fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
10866	09/04/2013 11:06 PM	Aaron Marcuse-Kubitza	inputs///test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix
10419	07/25/2013 01:59 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10257	07/11/2013 12:09 PM	Aaron Marcuse-Kubitza	inputs///map.csv: added distinguishing #... suffix (e.g. UNUSED#institutionID) to the special terms OMIT, PRIVATE, UNUSED (VegCore.vegpath.org#Special-terms) to avoid creating a collision in the staging table renaming
10209	07/10/2013 02:32 AM	Aaron Marcuse-Kubitza	inputs///map.csv for CSV tables with a row_num column: added missing row_num entry, which is needed by the staging table column renaming to make the order of the map.csv columns match the order in the staging table
10203	07/10/2013 01:24 AM	Aaron Marcuse-Kubitza	bugfix: inputs/CTFS//VegBIEN.csv: regenerated from map.csv. They may have gotten out of date because they are marked as _no_import, even though they are in import_order.txt.
10091	06/27/2013 12:28 PM	Aaron Marcuse-Kubitza	added inputs///header.csv for CSV inputs, which are now generated by inputs/input.Makefile %/install
8801	05/02/2013 08:53 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: SVN: add, %/add: /logs: also svn:ignore .gz, used for compressed log files
8176	03/25/2013 09:01 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/.map.csv.last_cleanup: Run fix_line_endings after canon/translate to standardize Python's \r\n line endings back to \n. This prevents issues with mixed line endings because LibreOffice (and probably Excel) treat all cell-internal line endings as \n but row line endings as whatever the file had, while text editors like jEdit translate all line endings to whatever the autodetected line ending is. (This creates spurious line ending diffs when a map spreadsheet containing multiline cells is edited in a text editor.)
7790	02/28/2013 02:16 AM	Aaron Marcuse-Kubitza	inputs/CTFS/: Switched global _no_import to table-specific _no_imports to allow adding new tables that are imported
7464	02/05/2013 03:40 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: locationID->location.sourceaccessioncode: Removed restriction that this mapping can't occur if geovalidation information is present. The locationID is no longer mapped to the place.sourceaccessioncode, so this filter is not necessary.
7009	12/21/2012 12:07 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: locationID/locationName + subplot -> location.sourceaccessioncode mapping: Fixed bug where subplot was incorrectly being mapped to this field even when there was no location. (This field can only be populated if both location and subplot are specified.) Also only map locationID for this, to avoid inconsistencies where one table supplies locationID+subplot, while another table supplies locationName+subplot, but they both get mapped to the same field, preventing plots from being matched up with their observations when creating the analytical_stem.
6992	12/20/2012 02:26 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only use authorTaxonCode if there is no plant ID, because an individual plant gets its own taxonoccurrence and thus needs the taxonoccurrence's IDs to be unique to the plant, regardless of what the author designates as the taxonoccurrence code
6989	12/20/2012 01:23 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped authorTaxonCode
6406	11/24/2012 07:50 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
6403	11/24/2012 07:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
6294	11/19/2012 04:09 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped acceptedCounty, county to the matched place
6002	11/05/2012 08:48 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: subplot locationevent: Only populate parent locationevent's location unique IDs if a subplot #/subplotID is actually specified. (The lack of a location unique ID will cause the parent locationevent's location to be removed, as well as the parent locationevent itself if there is no parent locationevent unique ID.) This fixes a bug where top-level plots in datasources that provide a nullable subplot #/subplotID were incorrectly getting connected to parent locationevents.
5977	11/02/2012 05:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: subplots: Also complete the locationevent/location diamond (subplot event -> {subplot location, parent plot event} -> parent plot location) when an eventDate or range is specified, as this is also an identifying field for locationevent. This fixes a bug where subplots data without explicit plot events (such as SALVIAS and TEAM) was not being connected to the appropriate parent plot event as well as parent plot location. This should fix the SALVIAS verification # location events, which should include only parent plots' locationevents to correspond with # locations, which only includes parent plots' locations, and uses locationevent.parent_id being NULL to determine what is a parent plot event.
5905	11/01/2012 01:54 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Map locationID to place.placecode instead when geovalidation columns are provided
5773	10/25/2012 10:36 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location: Populate sourceaccessioncode with locationID + subplot when subplot is unique only within the parent plot, so that location always has a sourceaccessioncode to use as the plotCode in analytical_db_view
5176	10/02/2012 11:37 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode: Only populate if needed to distinguish the taxonoccurrence within a plot
4987	09/25/2012 07:22 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed unnecessary /_first/# suffix for multiple terms in the same _exists expression, because _exists() only checks whether its node is non-empty, and it does not matter how many child nodes it contains
4979	09/25/2012 04:52 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Prefix a * to every term that's not in Veg+ for easy identification of unmapped terms when editing map.csv. Note that canon will remove the * when it finds a matching Veg+ term.
4895	09/20/2012 10:14 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Added empty mappings for special values (OMIT, etc.), so that they don't show up in **/unmapped_terms.csv. Note that the VegBIEN.csvs only change because the "No join mapping" errors change to "No non-empty join mapping".
4833	09/19/2012 04:16 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode alternatives: Use _first instead of _alt because when one of these fields is present, it can be used directly even if it's sometimes NULL, without needing to spend a lot of time _alting together fields that won't be used. Datasources where the authortaxoncode is sometimes NULL usually have a separate sourceaccessioncode for the taxonoccurrence. (In the rare case that they don't, they should map a non-NULL field to recordNumber or tag to ensure that taxonoccurrences can be uniquely identified.)
4832	09/19/2012 04:07 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped tag to taxonoccurrence.authortaxoncode when the record is an organism, in case there is no other ID for the taxonoccurrence. This fixes a bug in FIA and TEAM data where all organisms in a plot used the same taxonoccurrence because taxonoccurrence was not properly constrained, causing the loss of individual taxondeterminations on each organism.
4753	09/17/2012 02:01 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added units suffix to all core VegBIEN fields that have units. It is the responsibility of the mappings to ensure that all units are properly translated.
4679	09/14/2012 05:59 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Changed output column header from Veg+ to VegCore because the names will be VegCore names after automapping. This is possible now that we're using new automapping scripts that do not require a particular column header.
4663	09/12/2012 05:13 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: $(newTerms): Fixed bug where header needed to be removed before running filter_out_ci because filter_out_ci only removes the header if it matches the vocabulary's header. Removing the header afterward can cause the first row to be removed instead if the header was already removed.
4656	09/12/2012 03:37 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Added Filter column to contain any suffix added after the term, so that the automapping mechanism does not have to deal with the filter expressions
4651	09/12/2012 02:18 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Removed no longer needed [Veg+] suffix in root, because the input column is no longer used by old-style map utilities such as union that needed this
4648	09/12/2012 01:57 PM	Aaron Marcuse-Kubitza	filter_out_ci: Filter header instead of passing it through, in order to properly support CSVs without a header, such as the unmapped_terms.csv and new_terms.csv files. For CSVs with a header, the header of the vocabulary should be removed before passing it to filter_out_ci.
4645	09/12/2012 01:30 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Removed no longer used %/src.csv, because it is no longer needed to generate map.full.csv from map.csv
4642	09/12/2012 01:02 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Removed no longer used %/map.full.csv
4640	09/12/2012 12:56 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/map.full.csv: Generate by copying map.csv, because the content of these files now differs only in the sort order of the names
4639	09/12/2012 12:53 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.
4638	09/12/2012 12:43 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.
4636	09/12/2012 12:14 PM	Aaron Marcuse-Kubitza	inputs///map.csv: Added back automapped mappings to map.csv, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>
4621	09/12/2012 07:56 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Added /_simplifyPath:[next=parent_id]/path to root so the returned subplot location will be its parent location if there is no subplot name or ID (indicating that that particular plot did not have subplots). Note that this also causes the parent_id forwarding effect to occur for all other tables containing parent_id, which will help prevent similar issues with subplot events, etc. This will hopefully fix the SALVIAS.plotObservations bug where some organisms did not have a subplot #, causing the subplot location to become NULL and causing the corresponding locationevent rows not to match the locationevent_unique_within_location index filter condition (which requires a parent_id), which caused multiple output table pkeys to be returned for those rows, violating the locationevent_pkeys temp table's primary key.
4617	09/11/2012 11:01 AM	Aaron Marcuse-Kubitza	Regenerated/modified inputs///src.csv to use the self-mapping format used by the new automapping mechanism
4596	09/11/2012 08:22 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Remove the CSV header from the terms lists so that multiple terms lists can easily be appended together
4594	09/11/2012 08:09 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Generate reports on new and unmapped terms in map.csv
4575	09/11/2012 03:06 AM	Aaron Marcuse-Kubitza	inputs/CTFS/Subplot/map.csv: Manually mapped QuadratID to subplot since it is unique only within Site, and thus can't be the subplotID. Omit QuadratName because QuadratID is used for the same purpose.
4504	09/07/2012 09:11 AM	Aaron Marcuse-Kubitza	intersect, union: Made case- and punctuation-insensitive. mappings/Veg+-VegBIEN.csv: Removed no longer needed duplicate entries for each first letter case, which must now be removed for case- and punctuation-insensitive intersect/union to work. Note that the SpeciesLink `svn diff` hides _alt entry 0, which contains one of the removed duplicate columns that appears in the diff.
4458	09/05/2012 06:23 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: if subplot: Also forward locationID and plotName to the location of the parent locationevent (in addition to the parent location of the location), in order to "complete the diamond" connecting subplot locationevent -> (parent plot locationevent, subplot location) -> parent plot location
4382	08/30/2012 11:35 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Remapped CTFS QuadratID to subplot rather than subplotID, because it's only unique within the parent plot, not globally unique, in CTFS
4340	08/29/2012 08:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Redirect eventID, fieldNumber (authoreventcode) to parent locationevent when subplot columns exist
4337	08/29/2012 08:13 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Also redirect locationID/plotName to parent location if subplotID column was provided
4336	08/29/2012 08:08 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Use _first to remove specimens-related alternatives for this field from consideration when plots-related alternatives exist. This avoids unintentionally using specimens-related columns for this field in plots data.
4333	08/29/2012 07:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Placed inside "if subplot" _if statement along with sourceaccessioncode to reduce the number of separate _if statements needing a condition mapping
4324	08/29/2012 06:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Moved "if subplot" _if statement around /location/parent_id and /location/sourceaccessioncode themselves, so that only one _if cond mapping for subplot is needed. Note that this is only possible because this _if statement uses _exists, allowing it to be fully evaluated by the XML template simplifying mechanism, which supports subtrees as arguments to _if.
4306	08/28/2012 07:54 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: fieldNumber (authoreventcode): Don't copy to location.authorlocationcode if an actual locationID was specified
4288	08/28/2012 06:40 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/Subplot/
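
The line-ending cleanup described in revision 8176 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the repository's actual fix_line_endings script: the function name `normalize_line_endings` and the sample rows are hypothetical. It relies on the fact that Python's `csv.writer` terminates rows with `\r\n` by default while leaving newlines inside quoted cells as `\n`, which produces the mixed line endings that the commit describes.

```python
import csv
import io


def normalize_line_endings(text: str) -> str:
    """Convert every \\r\\n pair to \\n (hypothetical stand-in for fix_line_endings)."""
    return text.replace("\r\n", "\n")


# Write a small CSV containing a multiline cell. newline="" keeps the
# buffer from translating line endings, so we see exactly what csv emits.
buf = io.StringIO(newline="")
writer = csv.writer(buf)  # default lineterminator is "\r\n"
writer.writerow(["term", "comment"])
writer.writerow(["QuadratID", "line one\nline two"])

raw = buf.getvalue()                  # rows end in \r\n, the cell keeps a bare \n
clean = normalize_line_endings(raw)   # every line ending is now \n

# Reading the cleaned text back yields the same rows, so the
# normalization changes only the line endings, not the data.
rows = list(csv.reader(io.StringIO(clean, newline="")))
```

Running the normalization after every step that rewrites the spreadsheet keeps the file's line endings uniform, so a later edit in a text editor should not introduce line-ending-only diffs.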
