Task #292
openVegBank metadata query mechanism
0%
Description
For data discovery of the VegBank schema.
Mike Lee's suggestion (e-mail of 2011-11-09):
I'm wondering if you all talked about mechanisms for transmitting metadata about plots between databases. I imagine someone being able to query VegBank or another database based on any number of parameters, including new data since date X, to return a document describing a summary of each plot we have matching that. This way you could keep updated on what's in VegBank and even maintain your own list of entities we have.
As opposed to full-scale data transfer, it would just enable the most important metadata to be transferred and help in data discovery. We have been contacted several times about how one might discover what's available in VegBank through an automated process. This would be a possible natural solution to this question. Does VegX have a component that addresses just metadata, or might a minimal implementation of VegX accomplish this?
If not, and if I understand RDF correctly (<idea:certainty>medium</idea:certainty>), RDF is well-suited to this task. I have attached a quick example of how one VegBank plot might be represented.
If anyone has feedback on this or has thoughts about the above questions, I'd be very interested in how to do this efficiently.
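The kind of query described in the e-mail (filter on any number of parameters, optionally "new since date X", and return only summary metadata per plot) could look roughly like the sketch below. It runs over in-memory records purely for illustration. The record fields, `SUMMARY_FIELDS`, and `summarize_plots` are hypothetical names, not part of VegBank or VegX.

```python
from datetime import date

# Hypothetical plot records; the field names are illustrative only
# and are not taken from the VegBank schema.
PLOTS = [
    {"plot_id": "plot-1", "state": "NC", "added": date(2011, 10, 1), "taxon_count": 42},
    {"plot_id": "plot-2", "state": "VA", "added": date(2011, 11, 5), "taxon_count": 17},
    {"plot_id": "plot-3", "state": "NC", "added": date(2011, 11, 8), "taxon_count": 30},
]

# Only these fields are returned, so the result is a metadata summary
# rather than a full data transfer.
SUMMARY_FIELDS = ("plot_id", "state", "added")


def summarize_plots(plots, since=None, **criteria):
    """Return summary metadata for plots that match every criterion.

    since    -- keep only plots added on or after this date (None = no limit)
    criteria -- exact-match filters on record fields, e.g. state="NC"
    """
    summaries = []
    for plot in plots:
        if since is not None and plot["added"] < since:
            continue
        if any(plot.get(key) != value for key, value in criteria.items()):
            continue
        summaries.append({field: plot[field] for field in SUMMARY_FIELDS})
    return summaries


# Plots in NC added since 2011-11-01.
new_nc_plots = summarize_plots(PLOTS, since=date(2011, 11, 1), state="NC")
```

A client could store the date of its last request and pass it as `since` on the next one to keep its own list of entities up to date.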
Aaron's suggestion:
DB metadata can be queried through the PostgreSQL system catalogs if the database is publicly accessible. For VegBank, this can be done by providing a publicly accessible, empty copy of the DB (schema only, no data).
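As a minimal sketch of such a metadata query, the function below lists tables and columns from `information_schema.columns`, the standard view that PostgreSQL provides over its system catalogs. It assumes `conn` is a DB-API connection from a driver that uses `%s` placeholders (psycopg2, for example). The function name and the default schema name are assumptions made for illustration.

```python
# Standard information_schema view; one row per column of each table.
COLUMNS_SQL = """
    SELECT table_name, column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = %s
    ORDER BY table_name, ordinal_position
"""


def describe_schema(conn, schema="public"):
    """Map each table name in `schema` to a list of (column, type) pairs.

    `conn` is assumed to be a DB-API connection whose driver accepts
    %s-style parameters; the schema name is passed as a bound parameter.
    """
    cur = conn.cursor()
    try:
        cur.execute(COLUMNS_SQL, (schema,))
        rows = cur.fetchall()
    finally:
        cur.close()

    tables = {}
    for table_name, column_name, data_type in rows:
        tables.setdefault(table_name, []).append((column_name, data_type))
    return tables
```

Because this reads only catalog information, it works the same against an empty copy of the database as against the full one.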
Summary statistics could be provided through a materialized view created in a publicly accessible schema.
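A rough sketch of the statements this would involve is given below. It assumes a PostgreSQL version with native `CREATE MATERIALIZED VIEW` (9.3 or later); on older versions the same effect needs a regular table that is refreshed periodically. The schema, view, and source table names are hypothetical placeholders, as is the single `count(*)` statistic.

```python
def summary_view_statements(schema="public_stats", view="plot_counts", source="plot"):
    """Build the SQL statements that publish a summary materialized view.

    All three names are hypothetical. They are interpolated into the SQL
    text, so anything that is not a plain identifier is rejected.
    """
    for name in (schema, view, source):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")

    return [
        # Separate schema so that only summary objects are exposed.
        f"CREATE SCHEMA IF NOT EXISTS {schema}",
        # The statistic itself; a real view would aggregate more columns.
        f"CREATE MATERIALIZED VIEW {schema}.{view} AS "
        f"SELECT count(*) AS plot_count FROM {source}",
        # Read-only access for every database role.
        f"GRANT USAGE ON SCHEMA {schema} TO PUBLIC",
        f"GRANT SELECT ON {schema}.{view} TO PUBLIC",
    ]


statements = summary_view_statements()
```

The statements can be executed in order through any PostgreSQL client. The view holds a snapshot, so it has to be refreshed (`REFRESH MATERIALIZED VIEW`) after new data is loaded.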