/trunk/inputs/.NCBI/nodes/test.xml.ref - Changes - BIEN 3 - NCEAS Projects

root/trunk/inputs/.NCBI/nodes/test.xml.ref @ 12642

#	Date	Author	Comment
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11396	10/21/2013 07:14 PM	Aaron Marcuse-Kubitza	fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
11107	09/29/2013 08:58 PM	Aaron Marcuse-Kubitza	bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this significant change to the mappings did not change the semantics of the import of taxonoccurrences.
10866	09/04/2013 11:06 PM	Aaron Marcuse-Kubitza	inputs///test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix
10098	06/27/2013 09:54 PM	Aaron Marcuse-Kubitza	inputs/.NCBI/: added new-style import runscripts, which renamed the staging table columns to VegCore
7181	01/11/2013 06:08 AM	Aaron Marcuse-Kubitza	inputs/.NCBI/nodes/test.xml.ref: Restored inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
7162	01/11/2013 02:03 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed TNRS input taxonlabels meant to cross-link to taxonlabels added by the TNRS import, because TNRS taxondeterminations are now created instead
6406	11/24/2012 07:50 AM	Aaron Marcuse-Kubitza	db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
6403	11/24/2012 07:29 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
6035	11/06/2012 03:23 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Always map taxonNameOrEpithet to taxonomicname, now that it's globally unique at all ranks in the datasource that provides it (NCBI)
5834	10/30/2012 02:00 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel: taxonlabel_required_key constraint: Also allow taxonlabels with just a sourceaccessioncode, to support looking up parent taxonlabels using just their sourceaccessioncode (e.g. in NCBI)
5832	10/30/2012 01:20 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Don't create matched taxonlabel if taxonName was provided. This fixes a bug where an NCBI node was incorrectly pointing to a TNRS name, when the reference should only be the other way around. This may also fix the TNRS slowdown, if it was caused by circular matched_label_id references.
5767	10/25/2012 09:31 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonoccurrence: Added taxonoccurrence_required_key check constraint to ensure that all taxonoccurrences are properly identified, and empty taxonoccurrences are properly pruned. This fixes a bug where taxon-only and stem-only data did not properly prune the taxonoccurrence that would otherwise get created because it's included in the mappings.
5733	10/23/2012 08:42 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonName->taxonepithet: Use new _taxonomic_name_is_epithet() instead of _is_higher_taxon(), because it's more specific to the filtering task for this field
5731	10/23/2012 08:33 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonName->taxonomicname: Use new _has_taxonomic_name() instead of _is_higher_taxon(), because it's more specific to the filtering task for this field
5730	10/23/2012 08:30 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonName->taxonomicname: Use new _has_taxonomic_name() instead of _is_higher_taxon(), because it's more specific to the filtering task for this field
5727	10/23/2012 08:01 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: _is_higher_taxon() calls: Default to true if the rank can't be parsed to a taxonrank enum value
5703	10/23/2012 12:57 AM	Aaron Marcuse-Kubitza	inputs/.NCBI/: Renamed higher_taxa to nodes because it currently doesn't just contain the higher taxa
5688	10/19/2012 06:15 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonName: Place it in taxonomicname instead of taxonepithet for lower taxa, because the only datasource that currently provides this field (NCBI) actually provides the full taxonomicname instead of the epithet at the current rank for lower taxa. (taxonomicname is not applicable to higher taxa because their names are not guaranteed to be globally unique.) taxonName may need to be renamed and/or redefined to account for this ambiguity in NCBI's usage.
5686	10/19/2012 06:04 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Do not include the taxonName in the concatenated taxonomicname because it is NOT globally unique. The same name may be used at different taxonomic ranks and mean different things, and lower taxa may have the name appear in multiple genera or species, meaning different things.
5657	10/18/2012 04:21 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Link taxondetermination to taxonverbatim (which is a subclass of taxonlabel) instead of directly to taxonlabel. This will enable later having multiple taxonverbatims for one taxonlabel.
5656	10/18/2012 04:04 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel: Renamed identifyingtaxonomicname to taxonomicname because the taxonomicname provided by the datasource is now in taxonverbatim, so there is no name collision. Note that both of these fields store the same type of information, but taxonlabel's is autogenerated while taxonverbatim's is verbatim (and is only set if provided by the datasource).
5655	10/18/2012 03:57 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel: Moved non-scoping fields to new taxonverbatim subclass table, which contains the component parts of the taxonlabel
5651	10/18/2012 02:44 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel: Require either an identifyingtaxonomicname or a taxonepithet. The NCBI inserted row count decreases by one because this prunes off a taxonlabel created for a parent node which was not contained in the first two rows (remember that NCBI taxa are not in dependency order, so parents are often imported after children).
5646	10/18/2012 01:51 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Also create the identifyingtaxonomicname on the verbatim taxonlabel supplied by the datasource, in addition to on the TNRS input taxonlabel that the verbatim taxonlabel is matched up with
5644	10/18/2012 01:21 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Always generate the concatenated identifyingtaxonomicname, even for higher taxa, to ensure that this field is always populated. Note that this will cause names of higher taxa to be scrubbed by TNRS, but this is usually not a problem because such names either have no match or not a close enough match based on the name only. Naming conventions generally cause names at different ranks to be different, so that collisions with lower ranks should not be a problem.
5608	10/17/2012 04:12 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Renamed taxonconcept to taxonlabel per today's conference call, where it was decided that taxonconcept contained too many unrelated fields to be purely a taxon concept
5596	10/17/2012 12:43 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonconcept: Renamed taxonname to taxonepithet for clarity and to be consistent with TCS's use of "epithet" to denote what the taxonname was intended to be (http://www.tdwg.org/standards/117/download/#/UserGuidev_1.3.pdf)
5513	10/15/2012 10:08 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: taxonconcept.parent_id when explicit parent provided: Set taxonconcept.parent_id using new _taxonconcept_set_parent_id() after creating the child taxonconcept, so that the parent_id will point to the already-inserted parent taxonconcept instead of creating a new, empty parent taxonconcept. This creates a two-step import, where first the taxonconcepts are imported, and then the parent_ids are matched up. This is necessary for column-based import because all the parent taxonconcepts are imported in a separate iteration from the child taxonconcepts with only their sourceaccessioncode, so this iteration must occur after the child taxonconcept iteration in order to match up with fully-populated taxonconcepts. Row-based import, on the other hand, does not require _taxonconcept_set_parent_id() but does require the taxonconcepts to be provided in dependency order (parents first), which is unfortunately not the case for NCBI.
5491	10/12/2012 05:11 PM	Aaron Marcuse-Kubitza	Added inputs/.NCBI/. This uses many of the new schema and mappings features, such as taxonconcept.sourceaccessioncode and parentTaxonID
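
The two-step import described in r5513 (insert all taxonconcepts first, then match up parent_ids) can be illustrated with a minimal sketch. This is not the project's `_taxonconcept_set_parent_id()` implementation: the `TaxonConcept` class, the `two_step_import()` function, and the `(sourceaccessioncode, name, parent code)` row layout are hypothetical names chosen for illustration, and an in-memory dict stands in for the database table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical row layout: (sourceaccessioncode, name, parent sourceaccessioncode)
Row = Tuple[str, str, Optional[str]]


@dataclass
class TaxonConcept:
    """Hypothetical stand-in for a row of the taxonconcept table."""
    id: int
    sourceaccessioncode: str
    taxonepithet: str
    parent_id: Optional[int] = None


def two_step_import(rows: Iterable[Row]) -> Dict[str, TaxonConcept]:
    """Import rows that are NOT in dependency order (children may come first)."""
    rows = list(rows)
    by_code: Dict[str, TaxonConcept] = {}

    # Step 1: insert every taxonconcept fully populated, ignoring its parent.
    for next_id, (code, name, _parent_code) in enumerate(rows, start=1):
        by_code[code] = TaxonConcept(next_id, code, name)

    # Step 2: point each child at its already-inserted parent, looked up by
    # sourceaccessioncode, instead of creating a new, empty parent.
    for code, _name, parent_code in rows:
        parent = by_code.get(parent_code) if parent_code is not None else None
        if parent is not None:
            by_code[code].parent_id = parent.id

    return by_code


# The child row is listed before its parent, as can happen with NCBI nodes.
result = two_step_import([
    ("9606", "Homo sapiens", "9605"),
    ("9605", "Homo", None),
])
```

Because the parent lookup runs only after every row has been inserted, the order of the input rows does not matter. A single-pass, row-based import would instead need the parent rows to arrive first.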
