/ - Changes - BIEN 3 - NCEAS Projects

root @ 4665

#	Date	Author	Comment
4665	09/12/2012 05:20 PM	Aaron Marcuse-Kubitza	mappings/: Removed no longer used Veg+-VegBIEN.csv and derived autogen Veg+.self.csv
4664	09/12/2012 05:16 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/unmapped_terms.csv: Use $(coreMap) instead of $(vocab) because the terms should already be translated to VegCore terms, rather than still being Veg+
4663	09/12/2012 05:13 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: $(newTerms): Fixed bug where header needed to be removed before running filter_out_ci because filter_out_ci only removes the header if it matches the vocabulary's header. Removing the header afterward can cause the first row to be removed instead if the header was already removed.
4662	09/12/2012 05:11 PM	Aaron Marcuse-Kubitza	cols: Support CSVs without a header, such as intermediates that become unmapped_terms.csv, new_terms.csv
4661	09/12/2012 04:37 PM	Aaron Marcuse-Kubitza	inputs/: Regenerated unmapped_terms.csv, new_terms.csv
4660	09/12/2012 04:25 PM	Aaron Marcuse-Kubitza	input.Makefile: %/.map.csv.last_cleanup: Removed no longer used prerequisite $(vocab)
4659	09/12/2012 04:24 PM	Aaron Marcuse-Kubitza	input.Makefile: %/.map.csv.last_cleanup: Canonicalize separately on $(coreMap) and $(dict), instead of requiring them to be combined in $(vocab)
4658	09/12/2012 04:20 PM	Aaron Marcuse-Kubitza	input.Makefile: Use mappings/VegCore-VegBIEN.csv instead of mappings/Veg+-VegBIEN.csv as the core map, because the automapper now takes care of Veg+ -> VegCore translation
4657	09/12/2012 04:14 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Moved filter suffixes to separate filter column to enable automapping to work on those mappings' terms, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Move-filter-suffixes-to-separate-filter-column>. Note that the only changes to VegBIEN.csvs are the (now automapped) names of terms in "No join mapping" comments.
4656	09/12/2012 03:37 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Added Filter column to contain any suffix added after the term, so that the automapping mechanism does not have to deal with the filter expressions
4655	09/12/2012 03:35 PM	Aaron Marcuse-Kubitza	Added cat_cols
4654	09/12/2012 03:34 PM	Aaron Marcuse-Kubitza	Added ins_col
4653	09/12/2012 03:13 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Reference fixed prerequisites by name instead of by position in the prerequisites list
4652	09/12/2012 02:28 PM	Aaron Marcuse-Kubitza	Removed no longer used intersect
4651	09/12/2012 02:18 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Removed no longer needed [Veg+] suffix in root, because the input column is no longer used by old-style map utilities such as union that needed this
4650	09/12/2012 02:07 PM	Aaron Marcuse-Kubitza	translate: Translate the column header instead of passing it through, in order to properly support CSVs without a header and to support renaming the header when the column's contents change to a different schema or vocabulary
4649	09/12/2012 02:04 PM	Aaron Marcuse-Kubitza	canon: Canonicalize the column header instead of passing it through, in order to properly support CSVs without a header
4648	09/12/2012 01:57 PM	Aaron Marcuse-Kubitza	filter_out_ci: Filter header instead of passing it through, in order to properly support CSVs without a header, such as the unmapped_terms.csv and new_terms.csv files. For CSVs with a header, the header of the vocabulary should be removed before passing it to filter_out_ci.
4647	09/12/2012 01:48 PM	Aaron Marcuse-Kubitza	autoremove: `svn rm`: Fixed bug where needed to add --force in case the file had already been modified before being autoremoved
4646	09/12/2012 01:32 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Removed no longer used $(createOnlyMaps)
4645	09/12/2012 01:30 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Removed no longer used %/src.csv, because it is no longer needed to generate map.full.csv from map.csv
4644	09/12/2012 01:21 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/map.csv: If it doesn't exist, generate directly using $(mkSrcMap) instead of by copying %/src.csv, in order to eventually avoid the need to create a separate src.csv at all. Note that this avoids the need to run make twice when the table is first created to properly bootstrap all maps.
4643	09/12/2012 01:09 PM	Aaron Marcuse-Kubitza	autoremove: Try `svn rm` first in case the file is in svn
4642	09/12/2012 01:02 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Removed no longer used %/map.full.csv
4641	09/12/2012 12:59 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/VegBIEN.csv: Use %/map.csv directly because %/map.full.csv is now a copy of it
4640	09/12/2012 12:56 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/map.full.csv: Generate by copying map.csv, because the content of these files now differs only in the sort order of the names
4639	09/12/2012 12:53 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.
4638	09/12/2012 12:43 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.
4637	09/12/2012 12:31 PM	Aaron Marcuse-Kubitza	join: passthru mode: Fixed bug where empty join mappings needed to have the output field of the right-hand row manually set to the output field of the left-hand row for maps.merge_mappings() to work properly
4636	09/12/2012 12:14 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv: Added back automapped mappings to map.csv, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>
4635	09/12/2012 12:07 PM	Aaron Marcuse-Kubitza	inputs/VegBank/taxonobservation_/map.csv: Updated with new renamings of colliding join columns
4634	09/12/2012 12:00 PM	Aaron Marcuse-Kubitza	join: When a join mapping exists but is empty, still include any additional columns from that mapping in the combined row
4633	09/12/2012 11:48 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/src.csv, inputs/XAL/Specimen/src.csv: Use input term as the initial Veg+ term, so the src.csv can be used with the Add back automapped mappings process at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>
4632	09/12/2012 11:31 AM	Aaron Marcuse-Kubitza	inputs/XAL/Specimen/src.csv, map.csv: Switched from using root prefixes to full column names, because the namespace mapping functionality can be handled much better by treating each namespace-qualified term as its own term rather than as a term and a prefix
4631	09/12/2012 11:22 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/src.csv, map.csv: Switched from using root prefixes to full column names, because the namespace mapping functionality can be handled much better by treating each namespace-qualified term as its own term rather than as a term and a prefix
4630	09/12/2012 11:02 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: Removed no longer needed duplicate entries for each first letter case, which cause duplicate output mappings now that join is case- and punctuation-insensitive. Note that the `svn diff` hides _alt entry 0, which contains one of the removed duplicate columns that appears in the diff.
4629	09/12/2012 10:27 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/src.csv, inputs/XAL/Specimen/src.csv: Added Comments column for consistency with autogenerated src.csv format
4628	09/12/2012 10:14 AM	Aaron Marcuse-Kubitza	join: Added new passthru mode which passes through terms with no input mapping or no join mapping
4627	09/12/2012 09:25 AM	Aaron Marcuse-Kubitza	inputs/: Added [Veg+] to via map roots to indicate that the datasource and Veg+ vocabularies are combinable. This is possible now that automapped entries are no longer subtracted when this is in the map root, so there is no concern of losing comments on subtracted automapped rows. Note that this change turns on old-style automapping for these datasources, causing SALVIAS plotMetadata to acquire additional mappings.
4626	09/12/2012 08:59 AM	Aaron Marcuse-Kubitza	canon, translate, filter_out_ci: Support vocabularies/dictionaries with additional columns in addition to the functional column(s) used by the program. These columns can contain comments, etc. This was not originally supported because Python 2's iterable unpacking only supports "an iterable with the same number of items as there are targets in the target list" (http://docs.python.org/reference/simple_stmts.html#assignment-statements). We now use numeric array indexes instead to get around this limitation, and for consistency with other map-manipulation scripts.
4625	09/12/2012 08:21 AM	Aaron Marcuse-Kubitza	Removed no longer used subtract (use filter_out_ci instead)
4624	09/12/2012 08:19 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Removed no longer needed subtraction of automapped entries, because information about unmapped and new terms is now available in unmapped_terms.csv and new_terms.csv
4623	09/12/2012 08:13 AM	Aaron Marcuse-Kubitza	README.TXT: Data import: `make backups/download`: Removed '&' because running the command in the background prevents rsync from providing a continuously updating progress indication (because a backgrounded process's stdout is not a TTY)
4622	09/12/2012 08:04 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed no longer needed /_simplifyPath:[next=parent_id]/path expressions in specific paths because parent_id forwarding is now set globally for all paths in the map root
4621	09/12/2012 07:56 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Added /_simplifyPath:[next=parent_id]/path to root so the returned subplot location will be its parent location if there is no subplot name or ID (indicating that that particular plot did not have subplots). Note that this also causes the parent_id forwarding effect to occur for all other tables containing parent_id, which will help prevent similar issues with subplot events, etc. This will hopefully fix the SALVIAS.plotObservations bug where some organisms did not have a subplot #, causing the subplot location to become NULL and causing the corresponding locationevent rows not to match the locationevent_unique_within_location index filter condition (which requires a parent_id), which caused multiple output table pkeys to be returned for those rows, violating the locationevent_pkeys temp table's primary key.
4620	09/12/2012 07:25 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: namedplace elements: _simplifyPath() calls: Removed no longer needed `require` arg, and removed no longer needed table suffix from `next` arg
4619	09/12/2012 07:02 AM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4618	09/11/2012 11:04 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: $(newTerms): Fixed bug where tail with positive offset needs -n flag
4617	09/11/2012 11:01 AM	Aaron Marcuse-Kubitza	Regenerated/modified inputs/*/*/src.csv to use the self-mapping format used by the new automapping mechanism
4616	09/11/2012 10:50 AM	Aaron Marcuse-Kubitza	src_map: Map source columns to themselves so that src.csv can be used directly with the new automapping mechanism
4615	09/11/2012 10:48 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: %/new_terms.csv: Remove terms which are also in %/unmapped_terms.csv, because terms are not considered new (i.e. potential Veg+ terms) until they have been mapped to an existing Veg+ term. Being unmapped has a higher priority than being new, because it affects the current datasource itself rather than the easier mapping of future datasources.
4614	09/11/2012 10:22 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: missing_mappings: Display unmapped_terms.csv, new_terms.csv after generating them, to preserve the behavior of the original missing_mappings
4613	09/11/2012 10:17 AM	Aaron Marcuse-Kubitza	root Makefile: Maps validation: Removed no longer used $(missingMappingsCmd)
4612	09/11/2012 10:17 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: Removed no longer used $(missingMappingsCmd)
4611	09/11/2012 10:16 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: Removed no longer needed missing_%_mappings targets, since unmapped_terms.csv and new_terms.csv now serve the same purpose in a more efficient way
4610	09/11/2012 10:14 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: `ifndef` for $(termsSubdirs): Fixed bug where needed to be termsSubdirs instead of missingMappingsCmd
4609	09/11/2012 10:02 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: Require $(termsSubdirs)
4608	09/11/2012 10:00 AM	Aaron Marcuse-Kubitza	Generated global unmapped_terms.csv, new_terms.csv
4607	09/11/2012 10:00 AM	Aaron Marcuse-Kubitza	root Makefile: Maps validation: Added $(termsSubdirs) to enable generation of global unmapped_terms.csv, new_terms.csv
4606	09/11/2012 09:59 AM	Aaron Marcuse-Kubitza	inputs/: Generated combined unmapped_terms.csv, new_terms.csv for all inputs
4605	09/11/2012 09:58 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: $(catTerms): Fixed bug where only existing $+ files (using $(+w)) could be included in the list (both to check and to use), because otherwise cat would raise an error or try to read stdin
4604	09/11/2012 09:56 AM	Aaron Marcuse-Kubitza	Existing maps discovery: Fixed bug where new unmapped_terms.csv, new_terms.csv needed to be included in $(anyMap)
4603	09/11/2012 09:52 AM	Aaron Marcuse-Kubitza	lib/common.Makefile: Added $(+w)
4602	09/11/2012 09:22 AM	Aaron Marcuse-Kubitza	lib/common.Makefile: Added $(no/) to remove trailing /
4601	09/11/2012 09:18 AM	Aaron Marcuse-Kubitza	Extracted %/unmapped_terms.csv, %/new_terms.csv as separate targets in the Maps validation section so they can be invoked even when %/.map.csv.last_cleanup is not a top-level target (in $(MAKECMDGOALS)). Continue to invoke them in %/.map.csv.last_cleanup by using $(selfMake).
4600	09/11/2012 08:56 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps validation: Set $(termsSubdirs) to enable unmapped_terms.csv, new_terms.csv generation
4599	09/11/2012 08:56 AM	Aaron Marcuse-Kubitza	lib/mappings.Makefile: Added unmapped_terms.csv, new_terms.csv which are generated by combining the correspondingly-named files in $(termsSubdirs)
4598	09/11/2012 08:42 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Autoremove empty terms lists to avoid clutter
4597	09/11/2012 08:40 AM	Aaron Marcuse-Kubitza	Added autoremove
4596	09/11/2012 08:22 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Remove the CSV header from the terms lists so that multiple terms lists can easily be appended together
4595	09/11/2012 08:16 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: unmapped_terms.csv, new_terms.csv: Factored out commands into $(newTerms)
4594	09/11/2012 08:09 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Generate reports on new and unmapped terms in map.csv
4593	09/11/2012 08:07 AM	Aaron Marcuse-Kubitza	Added filter_out_ci
4592	09/11/2012 07:26 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Translate map.csv using $(mappings)/$(via)-VegCore.csv
4591	09/11/2012 07:25 AM	Aaron Marcuse-Kubitza	Added translate
4590	09/11/2012 07:08 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Removed no longer used Comments column. Use mappings/Veg+.terms.csv to cite term definitions instead.
4589	09/11/2012 07:06 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: previousCatalogNumber: Removed no longer needed "According to" comment, because this is now documented in the mappings/Veg+.terms.csv entry. Note that the citation for any mapping is the overlap of the terms' definitions, and thus only the definitions need to be cited, not the mapping itself. (The definitions are provided in the links in mappings/Veg+.terms.csv.)
4588	09/11/2012 07:01 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: previousCatalogNumber: Added Source link to DwC history entry, which documents the definition of this term
4587	09/11/2012 06:43 AM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: %/.map.csv.last_cleanup: Canonicalize map.csv using $(mappings)/$(via).vocab.csv
4586	09/11/2012 06:40 AM	Aaron Marcuse-Kubitza	Added canon
4585	09/11/2012 06:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped min/max SlopeAspect/SlopeGradient. Note that this allows the min/maxSlopeAspect values to bypass the additional _compass filter that is applied to slopeAspect.
4584	09/11/2012 05:49 AM	Aaron Marcuse-Kubitza	Added mappings/Veg+.vocab.csv
4583	09/11/2012 04:41 AM	Aaron Marcuse-Kubitza	inputs/GBIF/Specimen/map.csv: Remapped Original fields to new verbatim taxonomic terms
4582	09/11/2012 04:31 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped min/max SlopeAspect/SlopeGradient. Note that this allows the min/maxSlopeAspect values to bypass the additional _compass filter that is applied to slopeAspect.
4581	09/11/2012 04:23 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added min/max SlopeAspect/SlopeGradient
4580	09/11/2012 04:13 AM	Aaron Marcuse-Kubitza	inputs/VegBank/plot_/map.csv: Omit reallatitude/reallongitude because private data should not be placed in a public database
4579	09/11/2012 04:10 AM	Aaron Marcuse-Kubitza	inputs/CVS/Organism/map.csv: Omit realLatitude/realLongitude because private data should not be placed in a public database. Keeping VegBIEN free of restricted-access data allows anyone to run arbitrary queries on the database, without needing an entire security mechanism/front end just to manage users' read-only access to the data (as VegBank has). Note that the private coordinates are still accessible in the staging tables, so they will need to be locked down in order to make VegBIEN secure to public access.
4578	09/11/2012 03:16 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Remapped QuadratID to subplotID because the standard definition of an ID term is an ID that's unique within the datasource, and it's just CTFS's usage that makes it unique only within the plot
4577	09/11/2012 03:13 AM	Aaron Marcuse-Kubitza	inputs/CTFS/StemObservation/map.csv: Manually mapped QuadratID to subplot since it is unique only within Site, and thus can't be the subplotID
4576	09/11/2012 03:09 AM	Aaron Marcuse-Kubitza	inputs/CTFS/SubplotObservation/map.csv: Manually mapped QuadratID to subplot since it is unique only within Site, and thus can't be the subplotID
4575	09/11/2012 03:06 AM	Aaron Marcuse-Kubitza	inputs/CTFS/Subplot/map.csv: Manually mapped QuadratID to subplot since it is unique only within Site, and thus can't be the subplotID. Omit QuadratName because QuadratID is used for the same purpose.
4574	09/11/2012 02:57 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Removed recordNumber/_alt and recordNumber redirection mappings so that Veg+-VegCore.csv contains only renamings, not business logic. Note that removing the global ordering of these fields does not affect the datasources which contain multiple recordNumber synonyms because they either have a custom ordering or one field is duplicated or unused.
4573	09/11/2012 02:49 AM	Aaron Marcuse-Kubitza	inputs/NY/Specimen/map.csv: Omit CollectorNumber because it is not used, so it does not need to be mapped
4572	09/11/2012 02:45 AM	Aaron Marcuse-Kubitza	inputs/ARIZ/Specimen/map.csv: Omit FieldNumber because it is identical to CollectorNumber, so it does not need to be mapped
4571	09/11/2012 02:19 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: Added manual CollectorNumber mapping which places it after recordNumber/fieldNumber, so that mappings/Veg+-VegCore.csv doesn't need to maintain a global ordering between these fields and just needs to indicate their equivalency
4570	09/11/2012 02:09 AM	Aaron Marcuse-Kubitza	mappings/: Removed no longer needed Veg+-VegCore.to_self.csv, because multiple levels of mappings are no longer needed to get to the VegCore term
4569	09/11/2012 02:07 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: DescriptionOfSite: Mapped directly to locality rather than to locationNarrative to avoid needing multiple levels of mappings to get to the VegCore term
4568	09/11/2012 01:56 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Removed scientificNameAuthorship/_alt and scientificNameAuthorship redirection mappings, which were only used by SpeciesLink but it now has the necessary _alts in its own map.csv
4567	09/11/2012 01:48 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Removed dateCollected/_alt and dateCollected redirection mappings, which were only needed when multiple dateCollected fields were being combined in Veg+-VegCore.csv
4566	09/11/2012 01:45 AM	Aaron Marcuse-Kubitza	mappings/: Moved year/month/dayCollected mappings from Veg+-VegCore.csv to VegCore-VegBIEN.csv so that Veg+-VegCore.csv contains only renamings, not business logic. Note that this allows the year/month/dayCollected values to bypass the additional _dateRangeStart filter that is applied to text dates. The priority of the plain dateCollected field is now higher than the year/month/dayCollected fields when both are specified, because the dateCollected field presumably contains verbatim text while the year/month/dayCollected fields contain parsed date parts.
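
Revision 4626 explains that canon, translate, and filter_out_ci switched from iterable unpacking to numeric indexes so that vocabularies/dictionaries may carry extra columns such as comments. The following is a minimal sketch of that difference, not the project's actual code: the function names, the `From,To,Comments` header, and the use of Python 3 are assumptions made for illustration (the term pairs are taken from revisions 4578 and 4569).

```python
import csv
import io

# Hypothetical dictionary CSV: two functional columns plus a Comments column.
# The header names are assumptions; the term pairs come from r4578 and r4569.
DICT_CSV = (
    "From,To,Comments\n"
    "QuadratID,subplotID,renaming only\n"
    "DescriptionOfSite,locality,\n"
)


def load_mapping_by_unpacking(text):
    """Unpacking approach: only works when every row has exactly two items."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    mapping = {}
    for row in reader:
        src, dest = row  # raises ValueError when the row has extra columns
        mapping[src] = dest
    return mapping


def load_mapping_by_index(text):
    """Index approach: reads the functional columns and ignores any others."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    mapping = {}
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        mapping[row[0]] = row[1]
    return mapping


if __name__ == "__main__":
    print(load_mapping_by_index(DICT_CSV))
    try:
        load_mapping_by_unpacking(DICT_CSV)
    except ValueError as e:
        print("unpacking failed:", e)
```

With the three-column input above, the index-based loader returns both mappings, while the unpacking loader fails on the first data row because the row has more items than there are targets.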
