normalized VegCore to-dos

  1. make the place name optional: there can be coordinate-only places with no name
  2. taxon_scrub: rename to taxon_match
  3. taxon_name.formal_name: rename to name_with_author
  4. taxon_name.taxon_name: rename to name_no_author
  5. taxon_match.parsed_taxon_assertion: rename to parsed_taxon
  6. taxon_match.matched_taxon_concept: rename to matched_taxon
  7. taxon_concept.accepted_taxon_concept: rename to accepted_taxon
  8. rename all unique constraints to [table_abbr]_by_[field_abbrs] (removing _unique and abbreviating the table and fields)
    • event_by_subject__date__participants -> evt_by_date (later obs_by_date)
    • event_by_subject__name -> evt_by_name (later obs_by_name)
  9. evt_by_date: document that this constraint is used for eg. a plot sampling event, which uses place, time, and collectors to define an event and scope its collector_numbers
  10. evt_by_name: document that this constraint is used for eg. differentiating identification and collection events for the same specimen
  11. traceable.id_by_source: document that collisions will most often happen on this field, not id (which stores the natural key). id collisions are rare and usually indicate inter-datasource duplication.
  12. individual_observation.specimenholder_institutions: rename to specimen_duplicate_institutions
  13. source.url: rename to uri as not all sources are locatable on the internet
  14. automate the creation of the hyperlinked image map from the coordinates in the MySQL Workbench document
    • this will avoid the need to manually update the positions of tables for the following changes, which will require moving tables around to make room
  15. add taxon_path.species_binomial
  16. stem: remove inheritance from individual (an individual is a grouping of stems, and thus is a fundamentally separate entity from a stem)
  17. stem_observation: remove inheritance from individual_observation
  18. individual: move all fields to stem (individuals only have an id_within_dataset to identify the grouping of stems)
    • add required identifying_stem that points to the stem that has the individual's identifying tag or identifying_place
      • for SALVIAS, this requires figuring out for each individual, which stem has the same tag #s as the individual
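
For the SALVIAS step above, one possible way to pick the identifying stem is to match tags. This is only a sketch: the `Stem` record, its `tag` field, and the function name are hypothetical and not part of the schema, and it assumes each tag is a single string that can be compared for equality.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stem:
    # Hypothetical minimal stem record; a real stem carries many more fields.
    id: int
    tag: Optional[str]


def find_identifying_stem(individual_tag: str,
                          stems: Sequence[Stem]) -> Optional[Stem]:
    """Return the one stem whose tag equals the individual's tag.

    Returns None when no stem matches or when several do, so that
    ambiguous individuals can be reviewed by hand rather than guessed.
    """
    matches = [s for s in stems if s.tag is not None and s.tag == individual_tag]
    return matches[0] if len(matches) == 1 else None
```

Returning None for ambiguous matches is a design choice of this sketch, made because identifying_stem is meant to be required and a wrong guess would be hard to spot later.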
  19. individual: remove inheritance from reobservable (only a stem is identifiable to a physical location or tagged stem)
    • stem should extend reobservable instead
  20. taxon_concept HAS-MANY instead of IS-A taxon_name (ie. taxon_name HAS-AN optional taxon_concept)
  21. taxon_concept: rename to taxon because this table is used to store any kind of taxonomic group (including TNRS results and higher taxa), not just formally-described taxon concepts
    • according_to: rename to defined_by because this is more general
  22. add taxon_concept table, which extends taxon
    • stores only formally-described taxon concepts
    • adds according_to, which stores the literature reference that contains the description (this is also populated in defined_by)
  23. add observation table, which extends event
    • applicable event subclasses should inherit from it instead
  24. move event.subject to observation and make it required
    • also move associated unique constraints, and pull forward inherited fields used in them
  25. add event.type, needed to form a full event text ID
    • use it in every event unique constraint that uses name
    • subclasses that want to use this must set type to their table name
      • this prevents event's unique constraints from inadvertently being used when the subclass's unique constraints use name together with some other info
    • note that this field is not used in obs_by_name, because for observations, the type is always observation
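
The effect of including type in a name-based unique constraint can be sketched as follows. This is an illustration only: it uses SQLite through Python's sqlite3 module as a stand-in for the real database, flattens the inheritance hierarchy into a single table, and the constraint name `evt_by_type_name` and the type value `subclass_a` are made up for the example.

```python
import sqlite3


def demo_event_type():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE event (
            id   INTEGER PRIMARY KEY,
            type TEXT NOT NULL,  -- a subclass would set this to its table name
            name TEXT NOT NULL,
            CONSTRAINT evt_by_type_name UNIQUE (type, name)  -- hypothetical name
        )
    """)
    conn.execute("INSERT INTO event (type, name) VALUES ('event', 'survey 1')")
    # Same name but a different type: the base constraint does not collide.
    conn.execute("INSERT INTO event (type, name) VALUES ('subclass_a', 'survey 1')")
    try:
        # Same type and same name: rejected by the unique constraint.
        conn.execute("INSERT INTO event (type, name) VALUES ('subclass_a', 'survey 1')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    row_count = conn.execute("SELECT count(*) FROM event").fetchone()[0]
    return row_count, duplicate_rejected
```

Because the subclass row carries its own type value, a plain event and a subclass row can share a name without tripping the base table's constraint.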
  26. add taxon_observation_by_collector (tax_obs_by_coll) unique constraint on columns sampling_event, primary_collector, collector_number
  27. add traceable.permalink, a URL which links directly to the traceable record itself
    • this should be the phpPgAdmin URL, or an abbreviated redirect to it (eg. starting with
  28. add traceable.id_by_natural_key, autopopulated from id, which distinguishes any natural key from id_by_source
  29. traceable: add optional source_record fkey to record
    • use this instead of source in forming id_by_source
  30. traceable.id_within_source: rename to source_record_section
  31. traceable.source: rename to dataset and make it an fkey to dataset
    • this also distinguishes it from dataset_source when this field is inherited by dataset
      • note that dataset.dataset (the dataset record's dataset) would be eg. Index Herbariorum, not the parent dataset (which is stored in dataset.parent)
    • note that this change makes traceable mutually recursive with dataset. because dataset is required, populating the root dataset node requires deferring fkeys (using SET CONSTRAINTS ALL DEFERRED, which only affects fkeys declared DEFERRABLE) until traceable.dataset has been set to
    • an entity that consists only of manually-entered data should point to a dataset which contains information about the person who populated it
      • the root dataset is for the database owner who populated the dataset metadata (eg. BIEN, which is both the root dataset and the entity that populated it)
    • dataset is autopopulated from source_record.attribution_dataset
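
The need to defer fkeys when populating the root dataset can be sketched as below. This is a simplified illustration, not the actual schema: it uses SQLite through Python's sqlite3 module, where deferral is declared on the constraint (DEFERRABLE INITIALLY DEFERRED) rather than requested with SET CONSTRAINTS, and it models the traceable/dataset relationship as two plain tables with assumed columns.

```python
import sqlite3


def populate_root_dataset(deferred: bool = True):
    """Insert a root dataset whose traceable row points back at itself."""
    fk_mode = "DEFERRABLE INITIALLY DEFERRED" if deferred else ""
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE traceable (
            id      INTEGER PRIMARY KEY,
            dataset INTEGER NOT NULL REFERENCES dataset (id) {fk_mode}
        );
        CREATE TABLE dataset (
            id   INTEGER PRIMARY KEY REFERENCES traceable (id) {fk_mode},
            name TEXT NOT NULL
        );
    """)
    conn.execute("BEGIN")
    try:
        # This row references a dataset that does not exist yet.
        conn.execute("INSERT INTO traceable (id, dataset) VALUES (1, 1)")
        conn.execute("INSERT INTO dataset (id, name) VALUES (1, 'root')")
        conn.execute("COMMIT")  # deferred fkeys are checked here
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        raise
    return conn.execute(
        "SELECT t.id, t.dataset, d.name "
        "FROM traceable t JOIN dataset d ON d.id = t.dataset"
    ).fetchall()
```

With deferred=True both rows are accepted because the checks run at commit time. With deferred=False the first insert raises sqlite3.IntegrityError, which is the situation the deferral is meant to avoid.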
  32. source: extend traceable
  33. move source.uri to traceable and rename to source_uri
  34. dataset.dataset_source: rename to just source now that traceable no longer has a conflicting source field
  35. record.scoping_dataset: rename to id_scoping_dataset
  36. record.attribution_dataset: document that it can be set to a subset of the id_scoping_dataset when finer-grained attribution is available
  37. project: document that it is not a type of dataset, for the reasons described in r11221
  38. project.dataset: make it required
  39. add dataset_indexed, which extends traceable
  40. project.dataset: document that this is the dataset which defined the project (a project is actually an event, which is dataset-independent; multiple datasets may refer to the same project)
  41. project: extend dataset_indexed
    • use name together with inherited dataset in the proj_by_name unique constraint
  42. person: extend dataset_indexed
    • use name together with inherited dataset in the pers_by_src unique constraint
  43. place_visit: extend dataset_indexed to make locating the place_visits in a dataset easier
    • dataset is autopopulated from project.dataset
  44. taxon_observation: extend dataset_indexed to make locating the taxon_observations in a dataset easier
    • dataset is autopopulated from place_visit.dataset
  45. for inheritance hierarchies with multiple unique constraints, add an id_by_... field for each unique constraint which contains the associated natural key
    • this shows in the ERD which natural keys are available
    • it also allows querying on a specific natural key when several exist for the same record
  46. add custom_place, which extends dataset_indexed to make locating the places in a dataset easier
    • used for specimen coordinates as well as plots
    • contains optional defining_project
    • places can be part of projects in the same way that samplings of places can be part of projects
    • pull inherited name field into custom_place so it can be used together with defining_project in the unique constraint
  47. custom_place.name: make it optional
    • custom_places for specimen coordinates often do not have a name, just a numeric ID
  48. plot: rename to sampling_area (plot/subplot)
    • a sampling_area is an area defined solely for the purpose of aggregating taxon_occurrences (what some might call a plot)
    • note that a sampling_area is not the only type of place which can directly contain taxon_occurrences, because named places (regions) can contain these as well
  49. add outer_plot, which extends sampling_area and custom_place
    • an outer_plot is the outermost (largest possible) sampling_area in which taxa were sampled
  50. subplot: add comment that this is a plot subdivision, but is not considered a first-class plot
  51. subplot: add outer_plot pointer
  52. taxa_sampling_event: document that a shared place_visit ties together all the sampling_areas in the same outer_plot
  53. sampling_area.boundary_WKT: rename to shape_def_WKT
  54. subplace: rename to place_element
    • can store anything located within a place, not just other places
    • derived classes: individual; anything that is a point within a place
  55. add plot_element table, which extends place_element
    • derived classes: subplot
    • parent: fkey to sampling_area (not outer_plot, as this can be used for things in subplots as well)
  56. add site table, which stores a place defined by a locality description (i.e. directions to it)
    • extends place w/ rank=site
    • contains locality_desc
    • an outer_plot is contained within a site, and is not a subclass of it
    • note that the CTFS Site table should actually be called Plot, and is not equivalent to this definition of a site. the CTFS usage is particularly confusing because some plots are named after the entire site they are within (eg. the bci plot named after the Barro Colorado Island site), even though they actually refer to a plot (the only plot) within that site.
  57. add sampling_scope table
    • has req. place; opt. stratum, size_class, taxon inclusion, subsetting_info (hstore)
      • place doesn't have to be a sampling_area (plot/subplot), as eg. specimens are not located in a plot
  58. taxa_sampling_event: add req. sampling_scope
  59. place_visit: document that usually only an outer_plot has one of these, unless the subplots were treated like first-class plots with their own elevation, soil, etc.
  60. taxa_sampling_event.place_visit: document that this is usually for an outer_plot, not a subplot
  61. taxon_name: add morphospecies_suffix
    • not in taxon_path because not globally unique
  62. taxon_occurrence extends taxon_concept
  63. taxon_determination can be applied to any taxon_concept
  64. taxon_scrub extends taxon_determination
  65. community, geological_context: add list tables