Project

General

Profile

Statistics
| Revision:
  • svn:ignore: *

# Date Author Comment
11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11396 10/21/2013 07:14 PM Aaron Marcuse-Kubitza

fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error

10866 09/04/2013 11:06 PM Aaron Marcuse-Kubitza

inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix

10419 07/25/2013 01:59 PM Aaron Marcuse-Kubitza

inputs/CTFS/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource

10257 07/11/2013 12:09 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: added distinguishing #... suffix (e.g. UNUSED#institutionID) to the special terms OMIT, PRIVATE, UNUSED (VegCore.vegpath.org#Special-terms) to avoid creating a collision in the staging table renaming

10209 07/10/2013 02:32 AM Aaron Marcuse-Kubitza

inputs/*/*/map.csv for CSV tables with a row_num column: added missing row_num entry, which is needed by the staging table column renaming to make the order of the map.csv columns match the order in the staging table

10203 07/10/2013 01:24 AM Aaron Marcuse-Kubitza

bugfix: inputs/CTFS/*/VegBIEN.csv: regenerated from map.csv. they may have gotten out of date because they are marked as _no_import, even though they are in import_order.txt.

10091 06/27/2013 12:28 PM Aaron Marcuse-Kubitza

added inputs/*/*/header.csv for CSV inputs, which are now generated by inputs/input.Makefile %/install

8801 05/02/2013 08:53 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: SVN: add, %/add: */logs: also svn:ignore *.gz, used for compressed log files

8176 03/25/2013 09:01 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/.map.csv.last_cleanup: Run fix_line_endings after canon/translate to standardize Python's \r\n line endings back to \n. This prevents issues with mixed line endings because LibreOffice (and probably Excel) treat all cell-internal line endings as \n but row line endings as whatever the file had, while text editors like jEdit translate all line endings to whatever the autodetected line ending is. (This creates spurious line ending diffs when a map spreadsheet containing multiline cells is edited in a text editor.)

7790 02/28/2013 02:16 AM Aaron Marcuse-Kubitza

inputs/CTFS/: Switched global _no_import to table-specific _no_imports to allow adding new tables that are imported

7464 02/05/2013 03:40 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: locationID->location.sourceaccessioncode: Removed restriction that this mapping can't occur if geovalidation information is present. The locationID is no longer mapped to the place.sourceaccessioncode, so this filter is not necessary.

7009 12/21/2012 12:07 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: locationID/locationName + subplot -> location.sourceaccessioncode mapping: Fixed bug where subplot was incorrectly being mapped to this field even when there was no location*. (This field can only be populated if both location* and subplot are specified.) Also only map locationID for this, to avoid inconsistencies where one table supplies locationID+subplot, while another table supplies locationName+subplot, but they both get mapped to the same field, preventing plots from being matched up with their observations when creating the analytical_stem.

6992 12/20/2012 02:26 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only use authorTaxonCode if there is no plant ID, because an individual plant gets its own taxonoccurrence and thus needs the taxonoccurrence's IDs to be unique to the plant, regardless of what the author designates as the taxonoccurrence code

6989 12/20/2012 01:23 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped authorTaxonCode

6406 11/24/2012 07:50 AM Aaron Marcuse-Kubitza

db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names

6403 11/24/2012 07:29 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()

6294 11/19/2012 04:09 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped acceptedCounty, county to the matched place

6002 11/05/2012 08:48 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: subplot locationevent: Only populate parent locationevent's location unique IDs if a subplot #/subplotID is actually specified. (The lack of a location unique ID will cause the parent locationevent's location to be removed, as well as the parent locationevent itself if there is no parent locationevent unique ID.) This fixes a bug where top-level plots in datasources that provide a nullable subplot #/subplotID were incorrectly getting connected to parent locationevents.

5977 11/02/2012 05:18 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: subplots: Also complete the locationevent/location diamond (subplot event -> {subplot location, parent plot event} -> parent plot location) when an eventDate or range is specified, as this is also an identifying field for locationevent. This fixes a bug where subplots data without explicit plot events (such as SALVIAS and TEAM) was not being connected to the appropriate parent plot event as well as parent plot location. This should fix the SALVIAS verification # location events, which should include only parent plots' locationevents to correspond with # locations, which only includes parent plots' locations, and uses locationevent.parent_id being NULL to determine what is a parent plot event.

5905 11/01/2012 01:54 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Map locationID to place.placecode instead when geovalidation columns are provided

5773 10/25/2012 10:36 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: location: Populate sourceaccessioncode with locationID + subplot when subplot is unique only within the parent plot, so that location always has a sourceaccessioncode to use as the plotCode in analytical_db_view

5176 10/02/2012 11:37 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode: Only populate if needed to distinguish the taxonoccurrence within a plot

4987 09/25/2012 07:22 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Removed unnecessary /_first/# suffix for multiple terms in the same _exists expression, because _exists() only checks whether its node is non-empty, and it does not matter how many child nodes it contains

4979 09/25/2012 04:52 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Prefix a * to every term that's not in Veg+ for easy identification of unmapped terms when editing map.csv. Note that canon will remove the * when it finds a matching Veg+ term.

4895 09/20/2012 10:14 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Added empty mappings for special values (OMIT, etc.), so that they don't show up in **/unmapped_terms.csv. Note that the VegBIEN.csvs only change because the "No join mapping" errors change to "No non-empty join mapping".

4833 09/19/2012 04:16 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode alternatives: Use _first instead of _alt because when one of these fields is present, it can be used directly even if it's sometimes NULL, without needing to spend a lot of time _alting together fields that won't be used. Datasources where the authortaxoncode is sometimes NULL usually have a separate sourceaccessioncode for the taxonoccurrence. (In the rare case that they don't, they should map a non-NULL field to recordNumber or tag to ensure that taxonoccurrences can be uniquely identified.)

4832 09/19/2012 04:07 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped tag to taxonoccurrence.authortaxoncode when the record is an organism, in case there is no other ID for the taxonoccurrence. This fixes a bug in FIA and TEAM data where all organisms in a plot used the same taxonoccurrence because taxonoccurrence was not properly constrained, causing the loss of individual taxondeterminations on each organism.

4753 09/17/2012 02:01 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Added units suffix to all core VegBIEN fields that have units. It is the responsibility of the mappings to ensure that all units are properly translated.

4679 09/14/2012 05:59 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed output column header from Veg+ to VegCore because the names will be VegCore names after automapping. This is possible now that we're using new automapping scripts that do not require a particular column header.

4663 09/12/2012 05:13 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(newTerms): Fixed bug where header needed to be removed before running filter_out_ci because filter_out_ci only removes the header if it matches the vocabulary's header. Removing the header afterward can cause the first row to be removed instead if the header was already removed.

4656 09/12/2012 03:37 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Added Filter column to contain any suffix added after the term, so that the automapping mechanism does not have to deal with the filter expressions

4651 09/12/2012 02:18 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Removed no longer needed [Veg+] suffix in root, because the input column is no longer used by old-style map utilities such as union that needed this

4648 09/12/2012 01:57 PM Aaron Marcuse-Kubitza

filter_out_ci: Filter header instead of passing it through, in order to properly support CSVs without a header, such as the unmapped_terms.csv and new_terms.csv files. For CSVs with a header, the header of the vocabulary should be removed before passing it to filter_out_ci.

4645 09/12/2012 01:30 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Removed no longer used %/src.csv, because it is no longer needed to generate map.full.csv from map.csv

4642 09/12/2012 01:02 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Removed no longer used %/map.full.csv

4640 09/12/2012 12:56 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/map.full.csv: Generate by copying map.csv, because the content of these files now differs only in the sort order of the names

4639 09/12/2012 12:53 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings&gt;. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.

4638 09/12/2012 12:43 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings&gt;. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.

4636 09/12/2012 12:14 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Added back automapped mappings to map.csv, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>

4621 09/12/2012 07:56 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Added /_simplifyPath:[next=parent_id]/path to root so the returned subplot location will be its parent location if there is no subplot name or ID (indicating that that particular plot did not have subplots). Note that this also causes the parent_id forwarding effect to occur for all other tables containing parent_id, which will help prevent similar issues with subplot events, etc. This will hopefully fix the SALVIAS.plotObservations bug where some organisms did not have a subplot #, causing the subplot location to become NULL and causing the corresponding locationevent rows not to match the locationevent_unique_within_location index filter condition (which requires a parent_id), which caused multiple output table pkeys to be returned for those rows, violating the locationevent_pkeys temp table's primary key.

4617 09/11/2012 11:01 AM Aaron Marcuse-Kubitza

Regenerated/modified inputs/*/*/src.csv to use the self-mapping format used by the new automapping mechanism

4596 09/11/2012 08:22 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Remove the CSV header from the terms lists so that multiple terms lists can easily be appended together

4594 09/11/2012 08:09 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Generate reports on new and unmapped terms in map.csv

4575 09/11/2012 03:06 AM Aaron Marcuse-Kubitza

inputs/CTFS/Subplot/map.csv: Manually mapped QuadratID to subplot since it is unique only within Site, and thus can't be the subplotID. Omit QuadratName because QuadratID is used for the same purpose.

4504 09/07/2012 09:11 AM Aaron Marcuse-Kubitza

intersect, union: Made case- and punctuation-insensitive. mappings/Veg+-VegBIEN.csv: Removed no longer needed duplicate entries for each first letter case, which must now be removed for case- and punctuation-insensitive intersect/union to work. Note that the SpeciesLink `svn diff` hides _alt entry 0, which contains one of the removed duplicate columns that appears in the diff.

4458 09/05/2012 06:23 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: if subplot: Also forward locationID and plotName to the location of the parent locationevent (in addition to the parent location of the location), in order to "complete the diamond" connecting subplot locationevent -> (parent plot locationevent, subplot location) -> parent plot location

4382 08/30/2012 11:35 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Remapped CTFS QuadratID to subplot rather than subplotID, because it's only unique within the parent plot, not globally unique, in CTFS

4340 08/29/2012 08:38 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Redirect eventID, fieldNumber (authoreventcode) to parent locationevent when subplot columns exist

4337 08/29/2012 08:13 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Also redirect locationID/plotName to parent location if subplotID column was provided

4336 08/29/2012 08:08 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Use _first to remove specimens-related alternatives for this field from consideration when plots-related alternatives exist. This avoids unintentionally using specimens-related columns for this field in plots data.

4333 08/29/2012 07:38 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Placed inside "if subplot" _if statement along with sourceaccessioncode to reduce the number of separate _if statements needing a condition mapping

4324 08/29/2012 06:18 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Moved "if subplot" _if statement around /location/parent_id and /location/sourceaccessioncode themselves, so that only one _if cond mapping for subplot is needed. Note that this is only possible because this _if statement uses _exists, allowing it to be fully evaluated by the XML template simplifying mechanism, which supports subtrees as arguments to _if.

4306 08/28/2012 07:54 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: fieldNumber (authoreventcode): Don't copy to location.authorlocationcode if an actual locationID was specified

4288 08/28/2012 06:40 PM Aaron Marcuse-Kubitza

Added inputs/CTFS/Subplot/