# Date Author Comment
3924 08/09/2012 02:45 PM Aaron Marcuse-Kubitza

join: Added map_1_core_only option that uses only columns 0 and 1 of map_1. This is useful for one-time refactoring joins where the Source column, mappings comments, etc. shouldn't be part of the datasource's via map (although they will be part of the autogenerated VegBIEN map)
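
A minimal sketch of what restricting a map to its core columns could look like, assuming a map is a list of rows whose columns 0 and 1 hold the input and output terms. `select_columns` and the sample row are hypothetical and are not taken from the actual join script.

```python
def select_columns(map_rows, core_only=False):
    """Return copies of the map rows, optionally keeping only columns 0 and 1.

    Hypothetical helper: illustrates the idea behind map_1_core_only,
    not the real join implementation.
    """
    if core_only:
        # Drop Comments, Source, etc. so they do not reach the joined map
        return [list(row[:2]) for row in map_rows]
    return [list(row) for row in map_rows]


# Illustrative row: input term, output term, comment, source
map_1 = [["eventDate", "collectiondate", "some comment", "some source"]]
core = select_columns(map_1, core_only=True)
```

With `core_only=True`, each row of the result keeps only the input and output terms.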

3923 08/09/2012 02:33 PM Aaron Marcuse-Kubitza

join: Use opts.env_usage() for usage message

3922 08/09/2012 02:04 PM Aaron Marcuse-Kubitza

mappings: Made VegCSV-VegBIEN.{plots,organisms,stems}.csv symlinks to VegCSV-VegBIEN.specimens.csv

3921 08/09/2012 01:46 PM Aaron Marcuse-Kubitza

mappings/Makefile: VegCSV-VegBIEN.specimens.csv: Commented out combining with DwC2-VegBIEN mappings, because merging DwC and VegX/VegCSV into one map is a lower priority than replacing all datasource VegX mappings with VegCSV (which does not require the merging but does require XPaths that don't collide, which is not yet the case)

3920 08/09/2012 01:40 PM Aaron Marcuse-Kubitza

lib/xml_func.py: _if(): Made then param optional, so that user can just map to the else branch as a shortcut for logically inverting the condition. (Note that a _not() XML function does not exist yet, so this is also a workaround.)
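
The shortcut described above can be sketched as follows. This is an illustration only: `if_` is a hypothetical stand-in and does not reproduce the signature or XML handling of the real `_if()` in lib/xml_func.py.

```python
def if_(cond, then=None, else_=None):
    """Return `then` when cond is truthy, otherwise `else_`.

    Hypothetical stand-in: with `then` optional, a caller can supply only
    the else branch, which acts like a logically inverted condition.
    """
    return then if cond else else_


# Only the else branch is mapped, so a value is produced when cond is false
flag_when_empty = if_("", else_="missing")
flag_when_present = if_("some value", else_="missing")
```

Here `flag_when_empty` is `"missing"` and `flag_when_present` is `None`, which is the effect a `_not()` function would otherwise provide.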

3919 08/09/2012 01:29 PM Aaron Marcuse-Kubitza

VegBIEN mappings: Wrapped dates in _date() and _dateRangeStart()/_dateRangeEnd(), to assist in importing date and date range values that PostgreSQL cannot parse. This will increase the import time, but hopefully also decrease the # of invalid values in the errors tables. (These functions can later be optimized to reduce the impact on import time.)

3918 08/09/2012 01:25 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_literals: is_function: Fixed bug where function call needed to be recreated in each iteration of the main loop, because the arguments to the function, which are based on mapping, may change as the result of error handling replacing invalid values with NULL
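
A minimal sketch of the pattern behind this fix, assuming a retry loop in which error handling replaces an invalid value with NULL (`None`) and then tries again. All names here (`InvalidValueError`, `call_with_retries`, `make_range`) are hypothetical and are not part of sql_io.py.

```python
class InvalidValueError(Exception):
    """Hypothetical error that names the column holding an invalid value."""

    def __init__(self, column):
        super().__init__("invalid value in column %r" % column)
        self.column = column


def call_with_retries(func, mapping, max_attempts=10):
    """Call func with args built from mapping, nulling invalid values."""
    mapping = dict(mapping)
    for _ in range(max_attempts):
        # Rebuild the call arguments on every iteration: the mapping may
        # have changed since the previous attempt.
        call_args = dict(mapping)
        try:
            return func(**call_args)
        except InvalidValueError as e:
            mapping[e.column] = None  # replace the invalid value with NULL
    raise RuntimeError("too many invalid values")


def make_range(start, end):
    """Toy function: accepts only all-digit strings or None."""
    for column, value in (("start", start), ("end", end)):
        if value is not None and not value.isdigit():
            raise InvalidValueError(column)
    return (start, end)


result = call_with_retries(make_range, {"start": "2012", "end": "n/a"})
```

If the arguments were built once before the loop, the second attempt would reuse the stale `"n/a"` value and fail again. Rebuilding them inside the loop lets the call succeed with `end` set to `None`.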

3917 08/09/2012 01:13 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_literals: Fixed bug where sql.select() that calls the function needed to be run recoverably, to auto-rollback errors. Made sql.select() cacheable because SQL functions are immutable, so it should be idempotent.

3916 08/09/2012 01:03 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Remapped taxonRemarks to taxondetermination.notes because http://rs.tdwg.org/dwc/terms/#taxonRemarks indicates that these notes are "about the taxon", not the specimen/plant in general

3915 08/09/2012 12:56 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Remapped eventDate to new aggregateoccurrence.collectiondate, which is a more accurate place than locationevent.obsstartdate/obsenddate because the date refers to a specific specimen. This also makes eventDate compatible with plots data.

3914 08/09/2012 12:44 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Moved sex user-defined mapping to plantobservation because it's a property of the plant rather than the specimen, and so that it can also apply to plots data

3913 08/09/2012 12:31 PM Aaron Marcuse-Kubitza

mappings: Remapped specimenreplicate.description to new aggregateoccurrence.notes because the notes don't necessarily refer specifically to the specimen, especially for plots data

3912 08/09/2012 12:31 PM Aaron Marcuse-Kubitza

mappings: Remapped specimenreplicate.description to new aggregateoccurrence.notes because the notes don't necessarily refer specifically to the specimen, especially for plots data

3911 08/09/2012 12:21 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: aggregateoccurrence: Added notes, to serve the purpose that specimenreplicate.description previously did. specimenreplicate.description is not appropriate for plots data, and often not appropriate even for specimens data, which uses fieldNotes as a general notes field rather than a description of the specimen.

3910 08/09/2012 12:07 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: aggregateoccurrence: Reordered linecover so it's near cover instead of at the end

3909 08/09/2012 12:02 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Moved collectiondate from specimenreplicate to aggregateoccurrence because it's actually the SALVIAS census_date, which is the date the plant was sampled, rather than the DwC eventDate, which is the date the specimen was collected

3908 08/09/2012 11:56 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Mapped specimenreplicate via plantobservation for consistency with plots data. (This change is required for VegCSV table merging to work properly.) This is also a more accurate way of representing the data, because a specimen in fact comes from a plant, and it's natural to place the plant-related data (measurements, etc.) in the plantobservation table.

3907 08/09/2012 11:42 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Mapped specimenreplicate via plantobservation for consistency with plots data. (This change is required for VegCSV table merging to work properly.) This is also a more accurate way of representing the data, because a specimen in fact comes from a plant, and it's natural to place the plant-related data (measurements, etc.) in the plantobservation table.

3906 08/09/2012 10:41 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Remapped stem notes to new stemNotes term, and mapped new organism notes VegX XPath to now-available DwC fieldNotes

3905 08/09/2012 10:30 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/maps/VegX.organisms.csv: Map organism notes to different place than stem notes, because these are separate fields

3904 08/09/2012 10:09 AM Aaron Marcuse-Kubitza

mappings/Makefile: VegCSV-VegBIEN.specimens.csv: Temporarily sort by input column rather than output column, to assist in finding terms that map to different places in the DwC- and VegX-VegBIEN mappings

3903 08/09/2012 10:02 AM Aaron Marcuse-Kubitza

mappings/Makefile: VegCSV-VegBIEN.specimens.csv: Use new all option to union, in order to manually review inputs which appear in both maps but map to different places

3902 08/09/2012 10:01 AM Aaron Marcuse-Kubitza

union: Added full flag to turn off merging mappings that are in both maps, in order to review inputs which appear in both maps but map to different places
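
A rough sketch of the difference the flag makes. It assumes that "merging" means keeping a single row per input that appears in both maps, which the log does not spell out. `union_maps` and the sample rows are hypothetical and do not reproduce the real union script.

```python
def union_maps(map_0, map_1, full=False):
    """Combine two maps given as lists of (input, output) rows.

    Assumption for illustration: without `full`, an input already present
    in map_0 hides the matching rows of map_1. With `full`, every row of
    both maps is kept so that conflicting mappings can be reviewed.
    """
    if full:
        return list(map_0) + list(map_1)
    seen = {row[0] for row in map_0}
    return list(map_0) + [row for row in map_1 if row[0] not in seen]


map_0 = [("eventDate", "place_a")]
map_1 = [("eventDate", "place_b"), ("locality", "place_c")]

merged = union_maps(map_0, map_1)
for_review = union_maps(map_0, map_1, full=True)
```

In `for_review`, both `eventDate` rows survive side by side, which is what makes inputs that map to different places easy to spot.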

3901 08/09/2012 09:57 AM Aaron Marcuse-Kubitza

mappings/Makefile: Merged .VegX-VegCSV.stems.csv.last_cleanup into .%.last_cleanup, since VegX-VegCSV.stems.csv now uses the same cleanup operations as the other non-derived maps. Note that this automatically creates a file in for_review for VegX-VegCSV.stems.csv, which is currently identical to it.

3900 08/09/2012 09:52 AM Aaron Marcuse-Kubitza

mappings/Makefile: .%.last_cleanup: Removed simplify_xpath because non-derived maps will now have VegX XPaths in their Source column URLs, which should not be modified

3899 08/09/2012 09:50 AM Aaron Marcuse-Kubitza

mappings/Makefile: VegX-VegCSV.stems.csv: Removed autogeneration command because once file has been generated, regeneration is no longer needed

3898 08/09/2012 09:42 AM Aaron Marcuse-Kubitza

mappings/Makefile: Fixed bug where VegX-VegCSV.stems.csv needed to be removed from $(vegcsvMaps) so it wouldn't be deleted on `make clean`

3897 08/09/2012 08:53 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Source: Put URLs in the order their terms appear in the VegCSV term name

3896 08/09/2012 08:38 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Comments: Changed "Table name" to "Table" to be concise

3895 08/09/2012 08:37 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX community fields

3894 08/09/2012 08:28 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX cover-related fields

3893 08/09/2012 08:26 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed authorPlantCode to the associated DwC term fieldNumber

3892 08/09/2012 08:04 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed locationNarrative to the associated DwC term locality

3891 08/09/2012 08:00 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed collectedDate to the associated DwC term eventDate

3890 08/09/2012 07:54 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Added plot prefix to eventStartDate/eventEndDate to distinguish it from the DwC eventDate, which is the date the specimen was collected

3889 08/09/2012 07:40 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Order within table: Updated order #s for salvias_plots terms that got changed to SALVIAS data dictionary terms

3888 08/09/2012 07:33 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed collector name parts to the associated DwC term recordedBy

3887 08/09/2012 07:11 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped SALVIAS voucher type

3886 08/08/2012 11:09 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped collector name parts

3885 08/08/2012 11:00 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Table names ("." prefixes) merged into name where possible, for consistency. Taxonomic elements with the "computer" prefix have not been merged because the field part should exactly match the corresponding DwC term.

3884 08/08/2012 10:53 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Order within table: If Source has multiple URLs, ensure each source has its own order

3883 08/08/2012 10:44 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Order within table: Separate orders of multiple elements with "," instead of ";", for consistency with the Source column

3882 08/08/2012 10:42 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed authorPlotCode terms to a variation of VegX's plotName, for standardization with VegX

3881 08/08/2012 10:37 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed uniqueIDs with table names to the table name + "ID", for standardization

3880 08/08/2012 10:26 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed terms with table names to DwC terms where possible

3879 08/08/2012 10:19 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Removed comments about alternate names, as these will be included in a separate "VegCSV-alt" mapping to "VegCSV-core" terms

3878 08/08/2012 10:17 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Clarified comments about the inclusion of the table name

3877 08/08/2012 10:12 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped plotObservation user-defined terms

3876 08/08/2012 09:59 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX plotObservation fields

3875 08/08/2012 09:40 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Corrected sources of DwC terms to point to the actual DwC term, where needed. eventDate parts: Added source for VegBank field used as named suffix.

3874 08/08/2012 09:35 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Corrected sources of VegX names to point to the actual VegX field name, where needed

3873 08/08/2012 09:28 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped SALVIAS stem tags

3872 08/08/2012 09:22 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Corrected parent plot-only mappings by prefixing "parentPlot."

3871 08/08/2012 09:18 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX //plot/plotName

3870 08/08/2012 09:14 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX //plot/plotUniqueIdentifier

3869 08/08/2012 09:00 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Source SALVIAS terms from the SALVIAS data dictionary when possible, to provide an automatic link to the description of the term. Having these direct links will also assist in creating a data dictionary for VegCSV and eventually VegBIEN (using mappings/VegCSV-VegBIEN.specimens.csv). Note that many SALVIAS terms exist only in the live database, as they are not part of the export format documented in the data dictionary.

3868 08/08/2012 08:31 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Source VegBank terms directly from the appropriate VegBank data dictionary page, to provide an automatic link to the description of the term. Having these direct links will also assist in creating a data dictionary for VegCSV and eventually VegBIEN (using mappings/VegCSV-VegBIEN.specimens.csv).

3867 08/08/2012 08:18 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX relativePlotPosition terms

3866 08/08/2012 08:02 PM Aaron Marcuse-Kubitza

maps with Order column: Renamed Order column to Order within table for clarity

3865 08/08/2012 08:00 PM Aaron Marcuse-Kubitza

maps with Order column: Renamed Order column to Order within table for clarity

3864 08/08/2012 07:57 PM Aaron Marcuse-Kubitza

maps with Source column: Added original column name to source URLs, so that source name is completely specified. For official DwC terms, this also allows linking directly to the term. Fixed nimoy phpMyAdmin links so that going to the link in a browser would take you straight there after login.

3863 08/08/2012 06:53 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Corrected SALVIAS stem diameter terms to place original name (before expansion for clarity) in the Comments column instead of appending it to the source URL, because the source URL should point just to the table the term is in. The actual term is identified directly by its order # and indirectly by the name of the VegCSV term, which should be similar (if not, the original term should be listed in the comments).

3862 08/08/2012 06:46 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped SALVIAS stem diameter terms

3861 08/08/2012 06:35 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX project terms

3860 08/08/2012 06:29 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: VegX plot terms: Added order

3859 08/08/2012 06:25 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped non-user-defined height XPath

3858 08/08/2012 06:23 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed source of height to VegX, because there is a VegX height field

3857 08/08/2012 06:20 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX plot terms except unique keys

3856 08/08/2012 06:11 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped remaining sourceAccessionCode user-defined terms to <VegX-table>.uniqueID

3855 08/08/2012 06:06 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Corrected sources of VegX names to point to the appropriate element in veg.xsd, rather than the appropriate type, because the names we used actually came from veg.xsd's top-level elements rather than from the type names

3854 08/08/2012 05:57 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed plantObservation.sourceAccessionCode to individualOrganismObservation.uniqueID, to be consistent with VegX names. (*source*AccessionCode only applies to an aggregate DB that preserves info from its inputs. accessionCode made less sense, because this field is for the datasource's primary key, which it may or may not consider an accession code.)

3853 08/08/2012 05:39 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped aggregateOrganismObservation terms

3852 08/08/2012 05:36 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Changed base back to baseSaturation to distinguish this pH-related concept from other meanings of base, and to match VegBank

3851 08/08/2012 05:26 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Removed no longer applicable comments, which were from the very first NY/SALVIAS->VegX/VegBank mapping and had been preserved by the map spreadsheet transformation scripts. Note that many comments have been left, because they either provide explanatory information or because we never reached a decision on the questions posed (such as many of Brad's "OMIT" comments).

3850 08/08/2012 05:18 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Removed no longer applicable comments, which were from the very first NY/SALVIAS->VegX/VegBank mapping and had been preserved by the map spreadsheet transformation scripts

3849 08/08/2012 05:15 PM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped individualOrganismObservation user-defined terms

3848 08/08/2012 04:09 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

3847 08/08/2012 04:02 PM Aaron Marcuse-Kubitza

schemas/vegbien.ERD.mwb: Added link to VegBIEN schema wiki page

3846 08/08/2012 03:46 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3845 08/08/2012 03:40 PM Aaron Marcuse-Kubitza

README.TXT: After a new import: Added steps to check inputs' error counts and only continue with deleting previous imports, etc. if there were few or no errors. Added step to record the import times.

3844 08/07/2012 09:45 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegBank and SALVIAS abioticObservation terms

3843 08/07/2012 09:08 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Resolved ambiguous terms that appeared twice on the output side

3842 08/07/2012 08:52 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped VegX abioticObservation terms

3841 08/07/2012 08:36 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Mapped standard DwC terms

3840 08/07/2012 08:13 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv, DwC1-DwC2.specimens.csv: Sources: Replaced DwC with http://rs.tdwg.org/dwc/terms/, because DwC terms can come from many places but the DwC source referred specifically to this web page

3839 08/07/2012 08:06 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Corrected mapping for previousCatalogNumber

3838 08/07/2012 08:00 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of datasources' custom terms

3837 08/07/2012 07:51 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of DwC 1.2 (http://digir.net/schema/conceptual/darwin/2003/1.0/darwin2.xsd), aka DwC Classic, terms

3836 08/07/2012 07:43 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of custom NY staging table terms in nimoy.bien2_staging.nybg_raw

3835 08/07/2012 07:27 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of DwC 1.21 (http://digir.net/schema/conceptual/darwin/manis/1.21/darwin2.xsd) terms

3834 08/07/2012 07:02 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv, DwC1-DwC2.specimens.csv: Sources: Replaced DwC with http://rs.tdwg.org/dwc/terms/, because DwC terms can come from many places but the DwC source referred specifically to this web page

3833 08/07/2012 06:51 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of remappings of DwC terms with /_alt added

3832 08/07/2012 06:46 AM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Added source of DwC terms with namespace removed

3831 08/07/2012 06:32 AM Aaron Marcuse-Kubitza

mappings/VegX-VegCSV.stems.csv: Added "computer." before taxonomic terms whose VegX mapping used the "computer" role. (This is useful for datasources that supply separate determinations in the same row, such as SALVIAS.)

3830 08/07/2012 06:23 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Added Source column containing "DwC" for every field with an entry in the Order column, so that the source of the term can be tracked once we start combining DwC and VegCSV

3829 08/07/2012 06:07 AM Aaron Marcuse-Kubitza

inputs/SALVIAS*/maps/VegX.organisms.csv: Fixed missing join mappings for stemobservation-related fields

3828 08/07/2012 05:56 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Repopulated Order values for the few rows that had lost it in the process of copying and pasting mappings

3827 08/07/2012 05:49 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Added Source column containing "DwC" for every field with an entry in the Order column, so that the source of the term can be tracked once we start combining DwC and VegCSV

3826 08/07/2012 05:38 AM Aaron Marcuse-Kubitza

mappings/Makefile: VegX-VegCSV.stems.csv: Clean up when edited using sort_map

3825 08/07/2012 05:27 AM Aaron Marcuse-Kubitza

Added mappings/VegCSV-VegBIEN.specimens.csv, which is generated from VegX-VegCSV.stems.csv