# Date Author Comment
4684 09/14/2012 06:38 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed _merge to _join everywhere because _merge's (slower) duplicate elimination functionality is not needed (the combined columns do not both contain the same value, so they can simply be concatenated)

4683 09/14/2012 06:21 PM Aaron Marcuse-Kubitza

schemas/functions.sql: _label(): Accept params of any type, in order to support types other than text (which come from staging tables that are imported directly from a SQL export). This fixes a bug in SALVIAS.plotMetadata's column-based import.

4682 09/14/2012 06:17 PM Aaron Marcuse-Kubitza

schemas/functions.sql: _label(): Support NULL labels by not prepending a label

4681 09/14/2012 06:04 PM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Changed output column header from Veg+ to VegCore because this is more accurate. This is possible now that we're using new automapping scripts that do not require a particular column header. Note that this change now requires the map.csvs to use VegCore as their output column header, because otherwise the Veg+ header will get automapped to VegCore. (The header replacing is a feature to support changing the header when the schema of the column's terms changes.)

4680 09/14/2012 06:03 PM Aaron Marcuse-Kubitza

mappings/root.sh: Changed output column header from Veg+ to VegCore because this is more accurate following the initial automapping

4679 09/14/2012 05:59 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed output column header from Veg+ to VegCore because the names will be VegCore names after automapping. This is possible now that we're using new automapping scripts that do not require a particular column header.

4678 09/14/2012 05:53 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Copied the Change factor formula to all rows (it displays an empty string for rows that don't have both a row-based and a column-based import)

4677 09/14/2012 05:49 PM Aaron Marcuse-Kubitza

README.TXT: Data import: Added steps to record the import times in inputs/import.stats.xls

4676 09/14/2012 05:42 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4675 09/14/2012 05:40 PM Aaron Marcuse-Kubitza

Added import_times

4674 09/13/2012 02:40 PM Aaron Marcuse-Kubitza

mappings/root.sh: Removed no longer needed $in_root_suffix

4673 09/13/2012 02:39 PM Aaron Marcuse-Kubitza

src_map: Upgraded to match new map format by adding Filter column

4672 09/13/2012 02:38 PM Aaron Marcuse-Kubitza

input.Makefile: $(viaMaps): Fixed bug where could not wrap it in $(wildcard) because that would prevent map.csv from being created when a new datasource or new subdir is added

4671 09/12/2012 05:36 PM Aaron Marcuse-Kubitza

input.Makefile: $(viaMaps): Removed extra addition of */map.csv, which is already included because all $(tables) have or will get a map.csv

4670 09/12/2012 05:34 PM Aaron Marcuse-Kubitza

mappings/: Removed no longer used derived file Veg+.vocab.csv

4669 09/12/2012 05:33 PM Aaron Marcuse-Kubitza

input.Makefile: Removed no longer used $(vocab)

4668 09/12/2012 05:32 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: %/new_terms.csv: Filter out $(coreMap) and $(dict) successively instead of $(vocab), to avoid requiring intermediate mapping files not edited by the user

4667 09/12/2012 05:28 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(newTerms): Don't hardcode the caller's first filter_out_ci by prerequisite position; instead allow them to specify the command (including the var name) themselves

4666 09/12/2012 05:24 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(newTerms): For simplicity, subset the columns before running filter_out_ci

4665 09/12/2012 05:20 PM Aaron Marcuse-Kubitza

mappings/: Removed no longer used Veg+-VegBIEN.csv and derived autogen Veg+.self.csv

4664 09/12/2012 05:16 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/unmapped_terms.csv: Use $(coreMap) instead of $(vocab) because the terms should already be translated to VegCore terms, rather than still being Veg+

4663 09/12/2012 05:13 PM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(newTerms): Fixed bug where header needed to be removed before running filter_out_ci because filter_out_ci only removes the header if it matches the vocabulary's header. Removing the header afterward can cause the first row to be removed instead if the header was already removed.

4662 09/12/2012 05:11 PM Aaron Marcuse-Kubitza

cols: Support CSVs without a header, such as intermediates that become unmapped_terms.csv, new_terms.csv

4661 09/12/2012 04:37 PM Aaron Marcuse-Kubitza

inputs/: Regenerated unmapped_terms.csv, new_terms.csv

4660 09/12/2012 04:25 PM Aaron Marcuse-Kubitza

input.Makefile: %/.map.csv.last_cleanup: Removed no longer used prerequisite $(vocab)

4659 09/12/2012 04:24 PM Aaron Marcuse-Kubitza

input.Makefile: %/.map.csv.last_cleanup: Canonicalize separately on $(coreMap) and $(dict), instead of requiring them to be combined in $(vocab)

4658 09/12/2012 04:20 PM Aaron Marcuse-Kubitza

input.Makefile: Use mappings/VegCore-VegBIEN.csv instead of mappings/Veg+-VegBIEN.csv as the core map, because the automapper now takes care of Veg+ -> VegCore translation

4657 09/12/2012 04:14 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Moved filter suffixes to separate filter column to enable automapping to work on those mappings' terms, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Move-filter-suffixes-to-separate-filter-column>. Note that the only changes to VegBIEN.csvs are the (now automapped) names of terms in "No join mapping" comments.

4656 09/12/2012 03:37 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Added Filter column to contain any suffix added after the term, so that the automapping mechanism does not have to deal with the filter expressions

4655 09/12/2012 03:35 PM Aaron Marcuse-Kubitza

Added cat_cols

4654 09/12/2012 03:34 PM Aaron Marcuse-Kubitza

Added ins_col

4653 09/12/2012 03:13 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Reference fixed prerequisites by name instead of by position in the prerequisites list

4652 09/12/2012 02:28 PM Aaron Marcuse-Kubitza

Removed no longer used intersect

4651 09/12/2012 02:18 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Removed no longer needed [Veg+] suffix in root, because the input column is no longer used by old-style map utilities such as union that needed this

4650 09/12/2012 02:07 PM Aaron Marcuse-Kubitza

translate: Translate the column header instead of passing it through, in order to properly support CSVs without a header and to support renaming the header when the column's contents change to a different schema or vocabulary
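
The header handling described in this entry can be illustrated with a minimal Python sketch. It is not the actual translate script, whose interface is not shown in this log: the function name `translate_column`, the example terms, exact-match lookup, and passing unmatched values through unchanged are all assumptions made for illustration. The point is that the header row gets no special treatment, so a headerless CSV works unchanged and a header such as Veg+ is renamed whenever the dictionary contains an entry for it.

```python
def translate_column(dictionary, rows, col=0):
    """Replace each value in column `col` with its dictionary entry, if any.

    Every row is treated the same way, so a header row is translated like
    any other value. Unmatched values pass through (an assumption).
    """
    out = []
    for row in rows:
        new_row = list(row)
        new_row[col] = dictionary.get(row[col], row[col])
        out.append(new_row)
    return out

# Hypothetical dictionary: its own header row (Veg+ -> VegCore) is an ordinary entry.
dictionary = {"Veg+": "VegCore", "collector": "recordedBy"}

with_header = [["Veg+", "Comments"], ["collector", ""], ["plotCode", ""]]
without_header = [["collector", ""], ["plotCode", ""]]

translated_with_header = translate_column(dictionary, with_header)
translated_without_header = translate_column(dictionary, without_header)
```

In this sketch the header cell `Veg+` becomes `VegCore` only because the dictionary maps it, which mirrors the header renaming noted in revision 4681.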

4649 09/12/2012 02:04 PM Aaron Marcuse-Kubitza

canon: Canonicalize the column header instead of passing it through, in order to properly support CSVs without a header

4648 09/12/2012 01:57 PM Aaron Marcuse-Kubitza

filter_out_ci: Filter header instead of passing it through, in order to properly support CSVs without a header, such as the unmapped_terms.csv and new_terms.csv files. For CSVs with a header, the header of the vocabulary should be removed before passing it to filter_out_ci.
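
A rough Python sketch of the behavior described in this entry follows. It is an assumption-laden illustration, not the real filter_out_ci: the function signature, the in-memory lists, and the example terms are hypothetical, and only case-insensitive matching is modeled. It shows why the header is removed only when it also appears in the vocabulary, and why stripping the vocabulary's header first keeps the input's header.

```python
def filter_out_ci(vocab_terms, rows, col=0):
    """Return the rows whose value in column `col` is not in vocab_terms,
    comparing case-insensitively. The header row is filtered like any other row."""
    known = set(term.lower() for term in vocab_terms)
    return [row for row in rows if row[col].lower() not in known]

# Hypothetical vocabulary column, including its header as the first item.
vocab = ["VegCore", "locality", "elevation"]

# Headerless terms list (like the unmapped_terms.csv / new_terms.csv intermediates).
headerless = [["Locality"], ["plotCode"]]
# The same terms with a header row that matches the vocabulary's header.
with_header = [["VegCore"], ["Locality"], ["plotCode"]]

# With the vocabulary header present, a matching input header is filtered out too.
result_headerless = filter_out_ci(vocab, headerless)
result_header_dropped = filter_out_ci(vocab, with_header)

# With the vocabulary header removed first, the input header survives.
result_header_kept = filter_out_ci(vocab[1:], with_header)
```

Both of the first two calls leave only the `plotCode` row, while the last call keeps the `VegCore` header row followed by `plotCode`.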

4647 09/12/2012 01:48 PM Aaron Marcuse-Kubitza

autoremove: `svn rm`: Fixed bug where needed to add --force in case the file had already been modified before being autoremoved

4646 09/12/2012 01:32 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Removed no longer used $(createOnlyMaps)

4645 09/12/2012 01:30 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Removed no longer used %/src.csv, because it is no longer needed to generate map.full.csv from map.csv

4644 09/12/2012 01:21 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/map.csv: If it doesn't exist, generate directly using $(mkSrcMap) instead of by copying %/src.csv, in order to eventually avoid the need to create a separate src.csv at all. Note that this avoids the need to run make twice when the table is first created to properly bootstrap all maps.

4643 09/12/2012 01:09 PM Aaron Marcuse-Kubitza

autoremove: Try `svn rm` first in case the file is in svn

4642 09/12/2012 01:02 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Removed no longer used %/map.full.csv

4641 09/12/2012 12:59 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/VegBIEN.csv: Use %/map.csv directly because %/map.full.csv is now a copy of it

4640 09/12/2012 12:56 PM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/map.full.csv: Generate by copying map.csv, because the content of these files now differs only in the sort order of the names

4639 09/12/2012 12:53 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.

4638 09/12/2012 12:43 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.

4637 09/12/2012 12:31 PM Aaron Marcuse-Kubitza

join: passthru mode: Fixed bug where empty join mappings needed to have the output field of the right-hand row manually set to the output field of the left-hand row for maps.merge_mappings() to work properly

4636 09/12/2012 12:14 PM Aaron Marcuse-Kubitza

inputs/*/*/map.csv: Added back automapped mappings to map.csv, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>

4635 09/12/2012 12:07 PM Aaron Marcuse-Kubitza

inputs/VegBank/taxonobservation_/map.csv: Updated with new renamings of colliding join columns

4634 09/12/2012 12:00 PM Aaron Marcuse-Kubitza

join: When a join mapping exists but is empty, still include any additional columns from that mapping in the combined row

4633 09/12/2012 11:48 AM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/src.csv, inputs/XAL/Specimen/src.csv: Use input term as the initial Veg+ term, so the src.csv can be used with the Add back automapped mappings process at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>

4632 09/12/2012 11:31 AM Aaron Marcuse-Kubitza

inputs/XAL/Specimen/src.csv, map.csv: Switched from using root prefixes to full column names, because the namespace mapping functionality can be handled much better by treating each namespace-qualified term as its own term rather than as a term and a prefix

4631 09/12/2012 11:22 AM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/src.csv, map.csv: Switched from using root prefixes to full column names, because the namespace mapping functionality can be handled much better by treating each namespace-qualified term as its own term rather than as a term and a prefix

4630 09/12/2012 11:02 AM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/map.csv: Removed no longer needed duplicate entries for each first letter case, which cause duplicate output mappings now that join is case- and punctuation-insensitive. Note that the `svn diff` hides _alt entry 0, which contains one of the removed duplicate columns that appears in the diff.

4629 09/12/2012 10:27 AM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/src.csv, inputs/XAL/Specimen/src.csv: Added Comments column for consistency with autogenerated src.csv format

4628 09/12/2012 10:14 AM Aaron Marcuse-Kubitza

join: Added new passthru mode which passes through terms with no input mapping or no join mapping

4627 09/12/2012 09:25 AM Aaron Marcuse-Kubitza

inputs/: Added [Veg+] to via map roots to indicate that the datasource and Veg+ vocabularies are combinable. This is possible now that automapped entries are no longer subtracted when this is in the map root, so there is no concern of losing comments on subtracted automapped rows. Note that this change turns on old-style automapping for these datasources, causing SALVIAS plotMetadata to acquire additional mappings.

4626 09/12/2012 08:59 AM Aaron Marcuse-Kubitza

canon, translate, filter_out_ci: Support vocabularies/dictionaries that have extra columns beyond the functional column(s) used by the program. These columns can contain comments, etc. This was not originally supported because Python 2's iterable unpacking only supports "an iterable with the same number of items as there are targets in the target list" (http://docs.python.org/reference/simple_stmts.html#assignment-statements). We now use numeric array indexes instead to get around this limitation, and for consistency with other map-manipulation scripts.
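
The unpacking limitation mentioned in this entry is easy to reproduce. The sketch below is illustrative only: the row contents and both function names are made up, and it does not reproduce the scripts' actual code. It contrasts tuple-style unpacking, which fails once a row carries an extra comment column, with numeric indexing, which ignores the extra columns.

```python
# Hypothetical vocabulary rows: two functional columns plus an optional comment.
rows = [
    ["collector", "recordedBy"],
    ["catalogNo", "catalogNumber", "example comment"],
]

def lookup_by_unpacking(rows):
    """Builds a lookup table, but breaks as soon as a row has a third column."""
    table = {}
    for row in rows:
        src, dest = row  # raises ValueError on the 3-column row
        table[src] = dest
    return table

def lookup_by_index(rows):
    """Builds the same table while ignoring any columns after the first two."""
    table = {}
    for row in rows:
        table[row[0]] = row[1]
    return table

try:
    lookup_by_unpacking(rows)
    unpacking_failed = False
except ValueError:
    unpacking_failed = True

mapping = lookup_by_index(rows)
```

Indexing also keeps working if more comment-like columns are added later, which is presumably part of why it suits map files that grow columns over time.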

4625 09/12/2012 08:21 AM Aaron Marcuse-Kubitza

Removed no longer used subtract (use filter_out_ci instead)

4624 09/12/2012 08:19 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Removed no longer needed subtraction of automapped entries, because information about unmapped and new terms is now available in unmapped_terms.csv and new_terms.csv

4623 09/12/2012 08:13 AM Aaron Marcuse-Kubitza

README.TXT: Data import: `make backups/download`: Removed '&' because running the command in the background prevents rsync from providing a continuously updating progress indication (because a backgrounded process's stdout is not a TTY)

4622 09/12/2012 08:04 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Removed no longer needed /_simplifyPath:[next=parent_id]/path expressions in specific paths because parent_id forwarding is now set globally for all paths in the map root

4621 09/12/2012 07:56 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Added /_simplifyPath:[next=parent_id]/path to root so the returned subplot location will be its parent location if there is no subplot name or ID (indicating that that particular plot did not have subplots). Note that this also causes the parent_id forwarding effect to occur for all other tables containing parent_id, which will help prevent similar issues with subplot events, etc. This will hopefully fix the SALVIAS.plotObservations bug where some organisms did not have a subplot #, causing the subplot location to become NULL and causing the corresponding locationevent rows not to match the locationevent_unique_within_location index filter condition (which requires a parent_id), which caused multiple output table pkeys to be returned for those rows, violating the locationevent_pkeys temp table's primary key.

4620 09/12/2012 07:25 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: namedplace elements: _simplifyPath() calls: Removed no longer needed `require` arg, and removed no longer needed table suffix from `next` arg

4619 09/12/2012 07:02 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4618 09/11/2012 11:04 AM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: $(newTerms): Fixed bug where tail with positive offset needs -n flag

4617 09/11/2012 11:01 AM Aaron Marcuse-Kubitza

Regenerated/modified inputs/*/*/src.csv to use the self-mapping format used by the new automapping mechanism

4616 09/11/2012 10:50 AM Aaron Marcuse-Kubitza

src_map: Map source columns to themselves so that src.csv can be used directly with the new automapping mechanism

4615 09/11/2012 10:48 AM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: %/new_terms.csv: Remove terms which are also in %/unmapped_terms.csv, because terms are not considered new (i.e. potential Veg+ terms) until they have been mapped to an existing Veg+ term. Being unmapped has a higher priority than being new, because it affects the current datasource itself rather than the easier mapping of future datasources.

4614 09/11/2012 10:22 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: missing_mappings: Display unmapped_terms.csv, new_terms.csv after generating them, to preserve the behavior of the original missing_mappings

4613 09/11/2012 10:17 AM Aaron Marcuse-Kubitza

root Makefile: Maps validation: Removed no longer used $(missingMappingsCmd)

4612 09/11/2012 10:17 AM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: Removed no longer used $(missingMappingsCmd)

4611 09/11/2012 10:16 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: Removed no longer needed missing_%_mappings targets, since unmapped_terms.csv and new_terms.csv now serve the same purpose in a more efficient way

4610 09/11/2012 10:14 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: `ifndef` for $(termsSubdirs): Fixed bug where needed to be termsSubdirs instead of missingMappingsCmd

4609 09/11/2012 10:02 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: Require $(termsSubdirs)

4608 09/11/2012 10:00 AM Aaron Marcuse-Kubitza

Generated global unmapped_terms.csv, new_terms.csv

4607 09/11/2012 10:00 AM Aaron Marcuse-Kubitza

root Makefile: Maps validation: Added $(termsSubdirs) to enable generation of global unmapped_terms.csv, new_terms.csv

4606 09/11/2012 09:59 AM Aaron Marcuse-Kubitza

inputs/: Generated combined unmapped_terms.csv, new_terms.csv for all inputs

4605 09/11/2012 09:58 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: $(catTerms): Fixed bug where only existing $+ files (using $(+w)) could be included in the list (both to check and to use), because otherwise cat would raise an error or try to read stdin

4604 09/11/2012 09:56 AM Aaron Marcuse-Kubitza

Existing maps discovery: Fixed bug where new unmapped_terms.csv, new_terms.csv needed to be included in $(anyMap)

4603 09/11/2012 09:52 AM Aaron Marcuse-Kubitza

lib/common.Makefile: Added $(+w)

4602 09/11/2012 09:22 AM Aaron Marcuse-Kubitza

lib/common.Makefile: Added $(no/) to remove trailing /

4601 09/11/2012 09:18 AM Aaron Marcuse-Kubitza

Extracted %/unmapped_terms.csv, %/new_terms.csv as separate targets in the Maps validation section so they can be invoked even when %/.map.csv.last_cleanup is not a top-level target (in $(MAKECMDGOALS)). Continue to invoke them in %/.map.csv.last_cleanup by using $(selfMake).

4600 09/11/2012 08:56 AM Aaron Marcuse-Kubitza

input.Makefile: Maps validation: Set $(termsSubdirs) to enable unmapped_terms.csv, new_terms.csv generation

4599 09/11/2012 08:56 AM Aaron Marcuse-Kubitza

lib/mappings.Makefile: Added unmapped_terms.csv, new_terms.csv which are generated by combining the correspondingly-named files in $(termsSubdirs)

4598 09/11/2012 08:42 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Autoremove empty terms lists to avoid clutter

4597 09/11/2012 08:40 AM Aaron Marcuse-Kubitza

Added autoremove

4596 09/11/2012 08:22 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: $(newTerms): Remove the CSV header from the terms lists so that multiple terms lists can easily be appended together

4595 09/11/2012 08:16 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: unmapped_terms.csv, new_terms.csv: Factored out commands into $(newTerms)

4594 09/11/2012 08:09 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Generate reports on new and unmapped terms in map.csv

4593 09/11/2012 08:07 AM Aaron Marcuse-Kubitza

Added filter_out_ci

4592 09/11/2012 07:26 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Translate map.csv using $(mappings)/$(via)-VegCore.csv

4591 09/11/2012 07:25 AM Aaron Marcuse-Kubitza

Added translate

4590 09/11/2012 07:08 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Removed no longer used Comments column. Use mappings/Veg+.terms.csv to cite term definitions instead.

4589 09/11/2012 07:06 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: previousCatalogNumber: Removed no longer needed "According to" comment, because this is now documented in the mappings/Veg+.terms.csv entry. Note that the citation for any mapping is the overlap of the terms' definitions, and thus only the definitions need to be cited, not the mapping itself. (The definitions are provided in the links in mappings/Veg+.terms.csv.)

4588 09/11/2012 07:01 AM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: previousCatalogNumber: Added Source link to DwC history entry, which documents the definition of this term

4587 09/11/2012 06:43 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: %/.map.csv.last_cleanup: Canonicalize map.csv using $(mappings)/$(via).vocab.csv

4586 09/11/2012 06:40 AM Aaron Marcuse-Kubitza

Added canon

4585 09/11/2012 06:29 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped min/max SlopeAspect/SlopeGradient. Note that this allows the min/maxSlopeAspect values to bypass the additional _compass filter that is applied to slopeAspect.