2014-06-06 separate conference call on data dictionary

Martha's notes

the order in which Aaron should proceed with defining terms is:

1) analytical_stem_view
2) viewFullOccurrence [mapping to analytical_stem_view done]
3) the tables from the normalized VegBIEN schema that are necessary to create the analytical_stem_view and viewFullOccurrence tables [see analytical_stem_view tables]

Regarding what we are referring to as the "viewFullOccurrence" table:
Aaron, you have the DDL from Brad for the BIEN3 viewFullOccurrence table. As guidance on the VegBIEN tables that are used to create it, look at the scripts for the BIEN2 bien_web.observation table.

For priority #3 above, Brian McGill provided the list of the VegBIEN tables necessary to create analytical_stem_view:

plot
sourcelist
specimenreplicate
collector [party]
aggregateoccurrence
taxonverbatim
party identifiedby [party]
taxondetermination
taxon_scrub
taxonoccurrence
plantobservation
stemobservation
party_collector [party]
taxonlabel
family_higher_plant_group
cultivated_family_locations
threatened_taxonlabel

He said to first document each table in terms scientists can understand. After that, proceed with documenting the columns in the tables.

decisions

  • OK to do a first pass filling in the definitions from memory (Mark)
    • "for right now, we just need that list of definitions for attributes" (Mark)
  • the primary purpose of the data dictionary is to help the scientists understand the terms
  • the fields that are most useful to the scientists are the highest priority (Mark)
  • "it is also necessary to document the database in toto, so that someone can intelligently use and extend it" (Mark)
  • put data dictionary in Google spreadsheet for easier editing by scientists

to do for Aaron

  1. create Google spreadsheet for analytical_stem_view data dictionary (see VegBIEN data dictionary spreadsheet)
    • include the following columns: column, type, definition or formula, comment, provenance (part of col name), normalized-VegBIEN equivalent (same as formula), approved by (for scientists' initials)
  2. fill in definitions using your understanding of the terms
    • start with fields that are most useful to the scientists
  3. indicate provenance for every attribute
  4. make column name links work in Chrome (explicit hyperlinks have been added in the Google spreadsheet)
  5. switch to definitions from the source data dictionaries
  6. add table prefix to every attribute
  7. "Define the attributes for analytical_stem_view and viewFullOccurrence, and then move on to describing the tables from VegBIEN that are necessary to create those tables, followed by defining the terms in those VegBIEN tables." (Martha)
    • "the tables from VegBIEN that Brian McGill listed also need to be documented" (Martha)
  8. "we ultimately need descriptions/definitions of ALL schema objects, including e.g. Tables as Brian McGill has already requested" (Mark)
  9. in phpPgAdmin, make embedded links clickable
  10. for renamed terms, indicate what type of renaming was performed (e.g. camelcased, etc.)
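
As a rough illustration of to-do items 1, 6 and 10, the sketch below builds an empty data dictionary template as CSV text (which could then be imported into a spreadsheet), prefixes each attribute with its table name, and classifies how a term was renamed. The column headers come from item 1 above. Everything else is an assumption made for illustration only: the function names, the renaming categories other than "camelcased", the `table.column` prefix format, and the `example_column` attribute are hypothetical and not taken from the VegBIEN schema or scripts.

```python
import csv
import io

# Spreadsheet columns as listed in to-do item 1.
COLUMNS = [
    "column",
    "type",
    "definition or formula",
    "comment",
    "provenance",
    "normalized-VegBIEN equivalent",
    "approved by",
]


def prefixed(table, column):
    """Add the table prefix to an attribute name (assumed format: table.column)."""
    return f"{table}.{column}"


def renaming_type(original, renamed):
    """Classify a renaming. The category labels are illustrative assumptions."""
    if original == renamed:
        return "unchanged"
    if original.lower() == renamed.lower():
        return "case changed"

    def squash(name):
        # Ignore case and underscores when comparing the two names.
        return name.replace("_", "").lower()

    if squash(original) == squash(renamed):
        if "_" in original and "_" not in renamed:
            return "camelcased"
        if "_" in renamed and "_" not in original:
            return "underscored"
    return "other"


def template_csv(table, columns):
    """Return CSV text with a header row and one blank-definition row per attribute.

    `columns` is a list of (name, type) pairs supplied by the caller.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    for name, sql_type in columns:
        # Definition, comment, provenance, equivalent and approval start empty.
        writer.writerow([prefixed(table, name), sql_type, "", "", "", "", ""])
    return buf.getvalue()


if __name__ == "__main__":
    # "example_column" is a hypothetical attribute, not a real VegBIEN column.
    print(template_csv("analytical_stem_view", [("example_column", "text")]))
    print(renaming_type("view_full_occurrence", "viewFullOccurrence"))
```

Starting every definition blank matches the decision above that a first pass is filled in from memory and later approved by the scientists.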