normalized VegCore to-dos

place.name: make it optional: there can be coordinate-only places with no name
taxon_scrub: rename to taxon_match
taxon_name.formal_name: rename to name_with_author
taxon_name.taxon_name: rename to name_no_author
taxon_match.parsed_taxon_assertion: rename to parsed_taxon
taxon_match.matched_taxon_concept: rename to matched_taxon
taxon_concept.accepted_taxon_concept: rename to accepted_taxon
rename all unique constraints to [table_abbr]_by_[field_abbrs] (removing _unique and abbreviating the table and fields)
- event_by_subject__date__participants -> evt_by_date (later obs_by_date)
- event_by_subject__name -> evt_by_name (later obs_by_name)
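The naming rule and the two renames above can be sketched as a small generator of rename statements. This is only an illustration: the helper names are hypothetical, the `__` separator between several field abbreviations is assumed to carry over from the old names, and the output assumes the constraints are table constraints with plain lowercase identifiers in PostgreSQL.

```python
def constraint_name(table_abbr, field_abbrs):
    """Build a name of the form [table_abbr]_by_[field_abbrs].

    Assumption: multiple field abbreviations are joined with "__",
    as in the existing constraint names.
    """
    return f"{table_abbr}_by_{'__'.join(field_abbrs)}"


def rename_sql(table, old_name, new_name):
    """Return a PostgreSQL statement that renames one table constraint."""
    return f"ALTER TABLE {table} RENAME CONSTRAINT {old_name} TO {new_name};"


# The two renames listed above; the abbreviations are chosen by hand,
# since they are not derived mechanically from the old field lists.
RENAMES = [
    ("event", "event_by_subject__date__participants",
     constraint_name("evt", ["date"])),
    ("event", "event_by_subject__name",
     constraint_name("evt", ["name"])),
]

statements = [rename_sql(*rename) for rename in RENAMES]
```

If a unique constraint is implemented as a standalone unique index instead, the index itself would need renaming, so the generated statement is only a starting point.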
evt_by_date: document that this constraint is used for eg. a plot sampling event, which uses place, time, and collectors to define an event and scope its collector_numbers
evt_by_name: document that this constraint is used for eg. differentiating identification and collection events for the same specimen
traceable.id_by_source: document that collisions will most often happen on this field, not id (which stores the natural key). id collisions are rare and usually indicate inter-datasource duplication.
individual_observation.specimenholder_institutions: rename to specimen_duplicate_institutions
source.url: rename to uri as not all sources are locatable on the internet
- this is instead a globally unique identifier for the record, which has a structure similar to a URL
automate the creation of the hyperlinked image map from the coordinates in the MySQL Workbench document
- this will avoid the need to manually update the positions of tables for the following changes, which will require moving tables around to make room
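A minimal sketch of the image map generation, assuming the table positions have already been extracted from the MySQL Workbench document as (table, left, top, width, height) tuples in image pixels. The extraction step, the function name, and the `#table` link targets are all hypothetical.

```python
from html import escape


def image_map(map_name, boxes, link_prefix="#"):
    """Build an HTML <map> with one rectangular <area> per table.

    boxes: iterable of (table, left, top, width, height) in image pixels.
    link_prefix: assumed prefix of each table's link target.
    """
    lines = [f'<map name="{escape(map_name, quote=True)}">']
    for table, left, top, width, height in boxes:
        # A rect area takes left,top,right,bottom, so convert from width/height.
        corners = (left, top, left + width, top + height)
        coords = ",".join(str(round(value)) for value in corners)
        href = escape(link_prefix + table, quote=True)
        label = escape(table, quote=True)
        lines.append(
            f'  <area shape="rect" coords="{coords}" href="{href}" '
            f'alt="{label}" title="{label}">'
        )
    lines.append("</map>")
    return "\n".join(lines)
```

Regenerating the map from the coordinates after each layout change would then replace the manual position updates.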
add taxon_path.species_binomial
stem: remove inheritance from individual (an individual is a grouping of stems, and thus is a fundamentally separate entity from a stem)
stem_observation: remove inheritance from individual_observation
individual: move all fields to stem (individuals only have an id_within_dataset to identify the grouping of stems)
- add required identifying_stem that points to the stem that has the individual's identifying tag or identifying_place
  - for SALVIAS, this requires figuring out for each individual, which stem has the same tag #s as the individual
individual: remove inheritance from reobservable (only a stem is identifiable to a physical location or tagged stem)
- stem should extend reobservable instead
taxon_concept HAS-MANY instead of IS-A taxon_name (ie. taxon_name HAS-AN optional taxon_concept)
taxon_concept: rename to taxon because this table is used to store any kind of taxonomic group (including TNRS results and higher taxa), not just formally-described taxon concepts
- according_to: rename to defined_by because this is more general
add taxon_concept table, which extends taxon
- stores only formally-described taxon concepts
- adds according_to, which stores the literature reference that contains the description (this is also populated in defined_by)
add observation table, which extends event
- applicable event subclasses should inherit from it instead
move event.subject to observation and make it required
- also move associated unique constraints, and pull forward inherited fields used in them
add event.type, needed to form a full event text ID
- use it in every event unique constraint that uses name
- subclasses that want to use this must set type to their table name
  - this prevents event's unique constraints from inadvertently being used when the subclass's unique constraints use name together with some other info
- note that this field is not used in obs_by_name, because for observations, the type is always observation
add taxon_observation_by_collector (tax_obs_by_coll()) unique constraint on columns sampling_event, primary_collector, collector_number
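The effect of this constraint can be sketched with an in-memory SQLite table standing in for the real PostgreSQL one. The column types, the NOT NULL settings, and the helper function are assumptions for illustration; only the constraint name and its three columns come from the item above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE taxon_observation (
        id INTEGER PRIMARY KEY,
        sampling_event INTEGER NOT NULL,     -- assumed type
        primary_collector INTEGER NOT NULL,  -- assumed type
        collector_number TEXT NOT NULL,      -- assumed type
        CONSTRAINT tax_obs_by_coll
            UNIQUE (sampling_event, primary_collector, collector_number)
    )
""")


def add_observation(sampling_event, primary_collector, collector_number):
    """Insert a row; return False if tax_obs_by_coll rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO taxon_observation "
                "(sampling_event, primary_collector, collector_number) "
                "VALUES (?, ?, ?)",
                (sampling_event, primary_collector, collector_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The same collector_number is accepted under a different sampling_event, which is the scoping behavior the constraint is meant to provide.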
add traceable.permalink, a URL which links directly to the traceable record itself
- this should be the phpPgAdmin URL, or an abbreviated redirect to it (eg. starting with vegbiendev.nceas.ucsb.edu)
add traceable.id_by_natural_key, autopopulated from id, which distinguishes any natural key from id_by_source
traceable: add optional source_record fkey to record
- use this instead of source in forming id_by_source
traceable.id_within_source: rename to source_record_section
traceable.source: rename to dataset and make it an fkey to dataset
- this also distinguishes it from dataset_source when this field is inherited by dataset
  - note that dataset.dataset (the dataset record's dataset) would be eg. Index Herbariorum, not the parent dataset (which is stored in dataset.parent)
- note that this change makes traceable mutually recursive with dataset. because dataset is required, populating the root dataset node requires deferring fkeys (using SET CONSTRAINTS ALL DEFERRED, which only affects fkeys declared DEFERRABLE) until traceable.dataset has been set to dataset.id.
- an entity that consists only of manually-entered data should point to a dataset which contains information about the person who populated it
  - the root dataset is for the database owner who populated the dataset metadata (eg. BIEN, which is both the root dataset and the party that populated it)
- dataset is autopopulated from source_record.attribution_dataset
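The root-node problem described in the note on mutual recursion can be sketched as follows. SQLite is used as a stand-in so the example is self-contained: its `DEFERRABLE INITIALLY DEFERRED` fkeys are checked at commit, which plays the role of SET CONSTRAINTS ... DEFERRED in PostgreSQL. Modeling inheritance as a dataset.id fkey to traceable.id, the column types, and the function names are assumptions, not the actual VegCore schema.

```python
import sqlite3


def make_db():
    """Create traceable and dataset with mutually recursive, deferred fkeys."""
    # isolation_level=None: no implicit transactions; BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE traceable (
            id INTEGER PRIMARY KEY,
            dataset INTEGER NOT NULL
                REFERENCES dataset (id) DEFERRABLE INITIALLY DEFERRED
        );
        CREATE TABLE dataset (
            -- assumed stand-in for "dataset extends traceable"
            id INTEGER PRIMARY KEY
                REFERENCES traceable (id) DEFERRABLE INITIALLY DEFERRED,
            name TEXT NOT NULL
        );
    """)
    return conn


def insert_root(conn, name):
    """Insert the root dataset, whose traceable row points at itself."""
    conn.execute("BEGIN")
    # This row references dataset 1, which does not exist yet; the deferred
    # fkey is only checked at COMMIT, after the dataset row is in place.
    conn.execute("INSERT INTO traceable (id, dataset) VALUES (1, 1)")
    conn.execute("INSERT INTO dataset (id, name) VALUES (1, ?)", (name,))
    conn.execute("COMMIT")
```

Every later dataset can be inserted normally, because the dataset it points to already exists by then.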
source: extend traceable
move source.uri to traceable and rename to source_uri
dataset.dataset_source: rename to just source now that traceable no longer has a conflicting source field
record.scoping_dataset: rename to id_scoping_dataset
record.attribution_dataset: document that it can be set to a subset of the id_scoping_dataset when finer-grained attribution is available
project: document that it is not a type of dataset, for the reasons described in r11221
project.dataset: make it required
add dataset_indexed, which extends traceable
project.dataset: document that this is the dataset which defined the project (a project is actually an event, which is dataset-independent; multiple datasets may refer to the same project)
project: extend dataset_indexed
- use name together with inherited dataset in the proj_by_name unique constraint
person: extend dataset_indexed
- use name together with inherited dataset in the pers_by_src unique constraint
place_visit: extend dataset_indexed to make locating the place_visits in a dataset easier
- dataset is autopopulated from project.dataset
taxon_observation: extend dataset_indexed to make locating the taxon_observations in a dataset easier
- dataset is autopopulated from place_visit.dataset
for inheritance hierarchies with multiple unique constraints, add an id_by_... field for each unique constraint which contains the associated natural key
- this shows in the ERD which natural keys are available
- it also allows querying on a specific natural key when several exist for the same record
add custom_place, which extends dataset_indexed to make locating the places in a dataset easier
- used for specimen coordinates as well as plots
- contains optional defining_project
- places can be part of projects in the same way that samplings of places can be part of projects
- pull inherited name field into custom_place so it can be used together with defining_project in the unique constraint
place.name: make it optional
- custom_places for specimen coordinates often do not have a name, just a numeric ID
plot: rename to sampling_area (plot/subplot)
- a sampling_area is an area defined solely for the purpose of aggregating taxon_occurrences (what some might call a plot)
- note that a sampling_area is not the only type of place which can directly contain taxon_occurrences, because named places (regions) can contain these as well
add outer_plot, which extends sampling_area and custom_place
- an outer_plot is the outermost (largest possible) sampling_area in which taxa were sampled
subplot: add comment that this is a plot subdivision, but is not considered a first-class plot
subplot: add outer_plot pointer
taxa_sampling_event: document that a shared place_visit ties together all the sampling_areas in the same outer_plot
sampling_area.boundary_WKT: rename to shape_def_WKT
subplace: rename to place_element
- can store anything located within a place, not just other places
- derived classes: individual; anything that is a point within a place
add plot_element table, which extends place_element
- derived classes: subplot
- parent: fkey to sampling_area (not outer_plot, as this can be used for things in subplots as well)
add site table, which stores a place defined by a locality description (i.e. directions to it)
- extends place w/ rank=site
- contains locality_desc
- an outer_plot is contained within a site, and is not a subclass of it
- note that the CTFS Site table should actually be called Plot, and is not equivalent to this definition of a site. the CTFS usage is particularly confusing because some plots are named after the entire site they are within (eg. the bci plot named after the Barro Colorado Island site), even though they actually refer to a plot (the only plot) within that site.
add sampling_scope table
- has req. place; opt. stratum, size_class, taxon inclusion, subsetting_info (hstore)
  - place doesn't have to be a sampling_area (plot/subplot), as eg. specimens are not located in a plot
taxa_sampling_event: add req. sampling_scope
place_visit: document that usually only an outer_plot has one of these, unless the subplots were treated like first-class plots with their own elevation, soil, etc.
taxa_sampling_event.place_visit: document that this is usually for an outer_plot, not a subplot
taxon_name: add morphospecies_suffix
- not in taxon_path because not globally unique
taxon_occurrence extends taxon_concept
taxon_determination can be applied to any taxon_concept
taxon_scrub extends taxon_determination
community, geological_context: add list tables
