2012-11-29 prioritization UI group¶
- Brad's document uploaded to Plone and attached as BIEN User interface.docx
- cost, dependencies
UI¶
- tracking provenance, data providers (1)
- authentication (2)
- users/passwords
- federated authentication system
- content access control
- limit access to controlled datasets (3a)
- access to authenticated content
- logging IP addresses
- reporting details of data access to data owner (3b)
- point to text log, send e-mail, send digest e-mail
- setting that provider can control: opt-in/out
- itemize controlling content access
- limit access to controlled datasets (3a)
- control access of data by owner
- receive automated requests for data access
- receive invitations for co-authorship
- non-authenticated content access (3d)?
- providers want control of data
- so people will commit to providing info to BIEN
- make summaries of data access available to data owners
- how complex is notification that someone has requested your data
- e-mail link to approve person
- automated request generation
- picklists of datasets, users
- don't need to install policing mechanism
- what will be publicly visible
- embargo completely hidden data
- make maps available online after window expires (to MOBG?)
- maps are highly digested products, make available to everyone?
- push maps to Map of Life after window expires
- range maps, IUCN threat levels
- data loading
- who's the gatekeeper
- who to accept plot data from
Notification mechanism¶
- build in messaging within application
- craigslist: see only encrypted e-mail of person
- display user, need way to find user among list of many
- search on personal info?
- pulled back from exposing personal data
- permissions granted
- if asking for data, agree to reveal personal info
- people will not give data to anonymous user
        - when a person moves between institutions, track the changes
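The craigslist-style relay idea above can be sketched as an opaque alias per user, so a requester can e-mail a data owner without seeing the real address. All names here (`RELAY_DOMAIN`, `make_alias`) are illustrative assumptions, not part of any BIEN code:

```python
import hashlib
import hmac

# Hypothetical relay-alias sketch: the alias is stable per user but
# unguessable without the server-side key; the relay server would keep
# an alias -> real_email table to forward mail.
RELAY_DOMAIN = "relay.bien.example.org"
SECRET_KEY = b"server-side secret"

def make_alias(real_email: str) -> str:
    digest = hmac.new(SECRET_KEY, real_email.encode(), hashlib.sha256).hexdigest()[:12]
    return f"user-{digest}@{RELAY_DOMAIN}"

alias = make_alias("brad@example.edu")
```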
Data loading (5)¶
- can get data in, but more difficult to get data out
HTML mapping tool¶
- comprehensive UI tool
- or series of instructions w/ file format, mapping file
- then provide mapping to BIEN
    - who are our users?
- novice? differing levels of ability
- map against our term for field
- complex to build interface to cover all scenarios
- where to upload data?
- mediated by website, not person
- UI, series of templates?
- VegX mapping tool to map spreadsheet data
        - handcraft VegX? but that's not the intent of VegX
- plots that needed to be connected to previous records
- tree measurements at different times connected together
- match on tree tag, ID
- history of tags
- each tree gets unique number
- number is unique at scope of plot
- measure individual trunks
- connect remeasurements together
- locate individual tree, identified by subplot, ind. ID
- different fields to identify individual
- template scenario: correct set of unique identifiers
- file template based on published schema, with mapping instructions
- user maps data to template
- error reports on import
    - DataUP: California Digital Library got a grant
- validation, metadata plugin
- open spreadsheet, validate each column
        - like Google Refine: transforms data into a standard format
- DataONE node
- convert to CSV, stores it
- lightweight model
- transformations on data
- units
- save customized schema/mapping for re-use
- load spreadsheet -> do mappings, transformations
- access to mappings via repository
- managing mappings
- zip archive that gets extracted
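The remeasurement-linking discussion above (tree tags unique only within a plot) comes down to grouping records on a compound key. A minimal sketch, with field names as illustrative assumptions rather than BIEN schema terms:

```python
from collections import defaultdict

# A tree's tag is unique only at the scope of its plot, so the join key
# for connecting remeasurements is (plot, subplot, tag).
measurements = [
    {"plot": "P1", "subplot": "A", "tag": "17", "year": 2008, "dbh_cm": 12.1},
    {"plot": "P1", "subplot": "A", "tag": "17", "year": 2012, "dbh_cm": 13.4},
    {"plot": "P2", "subplot": "A", "tag": "17", "year": 2012, "dbh_cm": 30.0},
]

def link_remeasurements(rows):
    """Group measurement rows by the individual tree they belong to."""
    individuals = defaultdict(list)
    for row in rows:
        key = (row["plot"], row["subplot"], row["tag"])
        individuals[key].append(row)
    # Sort each individual's history by year so growth can be computed.
    for history in individuals.values():
        history.sort(key=lambda r: r["year"])
    return dict(individuals)

linked = link_remeasurements(measurements)
```

Note that tag "17" in P1 and tag "17" in P2 stay separate individuals.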
UI¶
- user logs in, exports file to CSV
- instructions for format, how to transform to CSVs
- downloadable mapping template
- user's fields stay the same
- most important to store the map.csvs
- tomorrow: review VegCore
- Nick, Susan, Aaron
- harmonize with VegCSV terms
- metadata terms
- strategy
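The map.csv workflow above can be sketched as a two-column file translating the provider's column names to the target schema's terms; the provider's fields stay the same and only the mapping is maintained. The term names below are placeholders, not actual VegCore terms:

```python
import csv
import io

# Hypothetical map.csv: provider's column name -> target schema term.
mapping_csv = """user_column,target_term
SpeciesName,scientificName
Lat,decimalLatitude
Lon,decimalLongitude
"""

# Provider's exported CSV, untouched.
data_csv = """SpeciesName,Lat,Lon
Quercus alba,35.1,-83.2
"""

mapping = {row["user_column"]: row["target_term"]
           for row in csv.DictReader(io.StringIO(mapping_csv))}

# Rename columns via the mapping; unmapped columns pass through as-is.
remapped = [{mapping.get(k, k): v for k, v in row.items()}
            for row in csv.DictReader(io.StringIO(data_csv))]
```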
Structural changes to BIEN¶
- data upload adjusted to allow partial updates
- do TNRS on each incoming dataset
- geovalidation, TNRS on live data
- errors that prevent loading
- duplication of processes, adding new records
- granular editing tools
- data management tool: complex job
- process is automated
- used to be adding records->rescrub
- some providers will start correcting at the source
- data upload (4a)
- data refresh (4c)
Error reporting (4b)¶
- how to report data back to user
- digest of error reports
- join to original rows
- original records, issues in data
- download error log/CSV table
- report back to user via log file?
- generate errors report
- would error report provide necessary info for provider
    - placename -> records containing it
- human-readable digest
- HTML display to read text error report
- e-mail link to report
- download error table, join to their table
- give data provider something to work from
- display list of errors, used as checklist
- status of upload: in import log
- exposed by website
- summary of data -> PDF
- HTML report or PDF summary
- user profile->folder with past reports
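The "download error table, join to their table" idea above amounts to keying each error by the provider's original row number, then joining back for a human-readable digest. Column names here are illustrative assumptions:

```python
import csv
import io

# Provider's original upload, with a row number column as the join key.
original = """row,scientificName,locality
1,Quercus alba,North Carolina
2,Qurcus alba,
"""

# Hypothetical error log produced during import, keyed by that row number.
errors = [
    {"row": "2", "field": "scientificName", "error": "no TNRS match"},
    {"row": "2", "field": "locality", "error": "missing value"},
]

rows = {r["row"]: r for r in csv.DictReader(io.StringIO(original))}

# Human-readable digest: each error shown next to the offending value.
report_lines = [
    f"row {e['row']} [{e['field']}]: {e['error']} "
    f"(value={rows[e['row']][e['field']]!r})"
    for e in errors
]
```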
Download tracking¶
- track each download as unique event
- details of all imports
- status of import
- success/failure
- human moderator needed?
- uploaded->staging table
- monitoring of upload status (initial validation, staging, core, complete)
- certain date, time->version schema
- management tool for admin
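The monitoring stages named above (initial validation, staging, core, complete) could be modeled as an ordered status, so the admin tool and website expose the same pipeline state. The class itself is an assumption; only the stage names come from the notes:

```python
from enum import Enum

class ImportStatus(Enum):
    INITIAL_VALIDATION = 1
    STAGING = 2
    CORE = 3
    COMPLETE = 4

def advance(status: ImportStatus) -> ImportStatus:
    """Move an import to the next pipeline stage (no-op once complete)."""
    if status is ImportStatus.COMPLETE:
        return status
    return ImportStatus(status.value + 1)
```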
Search/discovery¶
- query interfaces: API (5a), UI (HTML) (5b)
- lower priority to make available to public?
- Brian McGill's API
- SQL statement->API URL to perform SQL request
- UI, data people
- almost no query logic in UI
- UI just knows how to talk to API, not directly to DB
- level of separation
- separation of concerns
- augment API w/o breaking website
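The separation-of-concerns point above can be sketched as the UI only building API URLs and never touching the database. The endpoint and parameter name below are hypothetical, not Brian McGill's actual API:

```python
from urllib.parse import urlencode

# Hypothetical query endpoint: the SQL statement travels as a URL
# parameter, and only the API layer talks to the DB.
API_BASE = "https://bien.example.org/api/query"

def query_url(sql: str) -> str:
    return API_BASE + "?" + urlencode({"sql": sql})

url = query_url("SELECT scientificName FROM occurrence LIMIT 10")
```

Because the UI depends only on this URL contract, the API can be augmented without breaking the website.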
Backups¶
- need data on server
- expanded schema to support users, data access levels, profiles, user input
Schema changes¶
- authentication table
- user-driven uploads
- some points of pipeline need intervention
- stop at part of pipeline
- ontological soil schema
- metadata for plot
- storing soil data different for every plot schema
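A hedged sketch of the "authentication table" change, using SQLite for illustration; the column names and access levels are assumptions, not the BIEN schema:

```python
import sqlite3

# Minimal users table supporting authentication, institution tracking,
# and per-user data access levels. Illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (
    user_id      INTEGER PRIMARY KEY,
    email        TEXT UNIQUE NOT NULL,
    pw_hash      TEXT NOT NULL,  -- store a salted hash, never plaintext
    institution  TEXT,           -- updated as people move institutions
    access_level TEXT NOT NULL DEFAULT 'public'  -- e.g. public / controlled
);
""")
conn.execute("INSERT INTO app_user (email, pw_hash) VALUES (?, ?)",
             ("user@example.org", "<hash>"))
level = conn.execute("SELECT access_level FROM app_user").fetchone()[0]
```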
Traits¶
- DB of values
- like in taxonomy
- every revision, all info redone instead of storing raw observation data
- trait DB model to store actual measurements for recombining
- when info synthesized, lose info
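The trait-DB point above (synthesis loses information) suggests keeping every raw measurement and deriving summaries on demand, so a new revision never discards observations. A minimal sketch with illustrative field names:

```python
from statistics import mean

# Raw observations are the stored unit; synthesized values are recomputed.
raw_observations = [
    {"taxon": "Quercus alba", "trait": "SLA", "value": 11.2},
    {"taxon": "Quercus alba", "trait": "SLA", "value": 12.8},
]

def synthesize(observations, taxon, trait):
    """Recombine raw measurements into a summary value on demand."""
    values = [o["value"] for o in observations
              if o["taxon"] == taxon and o["trait"] == trait]
    return mean(values) if values else None
```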
To do¶
- override map spreadsheet name using dir name
- hierarchy of projects