# Date Author Comment
1207 03/01/2012 05:34 PM Aaron Marcuse-Kubitza

DwC mappings: Fixed user-defined field mappings according to Brad Boyle's changes

1206 03/01/2012 05:33 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector

1205 02/28/2012 07:41 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1204 02/28/2012 07:39 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector

1203 02/28/2012 07:36 PM Aaron Marcuse-Kubitza

VegBIEN: Moved taxonoccurrence.verbatimcollectorname to specimenreplicate and aggregateoccurrence so that it can be used in specimenreplicate duplicate elimination

1202 02/28/2012 07:21 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Notes mapping: Removed extraneous /_merge/1

1201 02/28/2012 05:51 PM Aaron Marcuse-Kubitza

input.Makefile: svn_props: Removed no longer needed items from input dir svn:ignore

1200 02/28/2012 05:49 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Fixed bug for inputs without a .ref where $(wildcard) wouldn't recheck the file after verify/%.out is run, so the verify output wasn't printed

1199 02/28/2012 05:45 PM Aaron Marcuse-Kubitza

input.Makefile: Moved verify files into separate subdir

1198 02/28/2012 04:30 PM Aaron Marcuse-Kubitza

bin/map: Changed root label data format convention to datasrc[data_format] so datasource names containing hyphens would not have the part after the - treated as the data format
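A minimal sketch of how a root label in the datasrc[data_format] convention could be split, assuming the bracketed part is optional. The function name parse_root_label and the sample labels are hypothetical and are not taken from bin/map:

```python
import re


def parse_root_label(label):
    """Split 'datasrc[data_format]' into (datasrc, data_format).

    data_format is None when the label has no bracketed part. Hyphens in
    the datasource name are kept, since only the brackets are significant.
    """
    match = re.fullmatch(r'([^\[\]]+)(?:\[([^\[\]]*)\])?', label)
    if match is None:
        raise ValueError('Malformed root label: %r' % label)
    return match.group(1), match.group(2)


# Hypothetical labels, for illustration only
with_format = parse_root_label('NYBG-CSV[DwC1]')
without_format = parse_root_label('SALVIAS')
```

With a bracket-based convention, a hyphenated name such as the hypothetical NYBG-CSV stays intact as the datasource name.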

1197 02/28/2012 04:25 PM Aaron Marcuse-Kubitza

inputs maps: Changed input root labels to match dir names since verify expects these to be the same

1196 02/28/2012 04:22 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Fixed bug where datasource name was not set for non-DB inputs

1195 02/28/2012 04:18 PM Aaron Marcuse-Kubitza

input.Makefile: Removed no longer needed default verify action for dirs with no verify.ref's

1194 02/28/2012 04:15 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Made verifications table-specific

1193 02/28/2012 03:27 PM Aaron Marcuse-Kubitza

input.Makefile: import: Merged import and import-all because they do the same thing

1192 02/28/2012 03:26 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Started rearranging to allow different verifies for each table

1191 02/28/2012 03:19 PM Aaron Marcuse-Kubitza

Moved verify.sql to mappings since it's mapping-related

1190 02/28/2012 02:31 PM Aaron Marcuse-Kubitza

input.Makefile: Changed option nolog to log so that options aren't specified in the negative

1189 02/28/2012 01:43 PM Aaron Marcuse-Kubitza

input.Makefile: svn ignore .trace files

1188 02/28/2012 01:41 PM Aaron Marcuse-Kubitza

input.Makefile: Profile imports into a .trace file unless env var profile=""

1187 02/28/2012 01:28 PM Aaron Marcuse-Kubitza

xml_func.py: _alt: On empty input, return None instead of raising SyntaxException because empty input should be OK

1186 02/27/2012 05:37 PM Aaron Marcuse-Kubitza

xml_func.py: _alt: Fixed bug where not specifying any item would crash the program instead of raising a SyntaxException

1185 02/27/2012 05:33 PM Aaron Marcuse-Kubitza

Factored verify.sql out into schemas dir

1184 02/27/2012 05:26 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Print diff in two columns if verbose=1

1183 02/27/2012 05:03 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify.sql: When filtering by datasource name, use an AND clause in the JOIN party's ON condition instead of a separate WHERE statement, so that the datasource filtering code is all on the same line

1182 02/27/2012 04:58 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify.sql: Use new :datasource variable instead of literal 'SALVIAS'

1181 02/27/2012 04:58 PM Aaron Marcuse-Kubitza

input.Makefile: Provide the verify.sql script a :datasource variable set to the datasource name (in quotes)

1180 02/27/2012 04:39 PM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD

1179 02/27/2012 03:55 PM Aaron Marcuse-Kubitza

bin/map: DB, CSV inputs: Use column indexes instead of column names to look up each field (optimization to avoid repeated dict lookups of the same key)

1178 02/27/2012 03:47 PM Aaron Marcuse-Kubitza

util.py: ListDict: str(): Print each entry on its own line, in the order the keys were provided

1177 02/27/2012 03:37 PM Aaron Marcuse-Kubitza

NYBG-DwC maps: Filter out MinimumElevation = "."

1176 02/27/2012 03:37 PM Aaron Marcuse-Kubitza

xml_dom.py: NodeTextEntryIter: Filter out empty entries (instead of producing an entry with an explicit None value, which causes problems with XML funcs that can't handle Nones)

1175 02/27/2012 03:34 PM Aaron Marcuse-Kubitza

NYBG-DwC maps: Map to input fields with XML func appended whenever possible (DwC1->DwC2 translation is done by DwC-VegBIEN.specimens.csv)

1174 02/27/2012 02:57 PM Aaron Marcuse-Kubitza

vegbien.sql: Renamed methodtaxonclass.description to methodtaxonclass.taxonclass and changed it to a closed list (enum taxonclass). method.description can still be used for freeform taxonclass inclusions/exclusions.

1173 02/27/2012 02:47 PM Aaron Marcuse-Kubitza

DwC1-DwC2.specimens.csv: Removed no longer needed /_alt/2 XML func from date mappings (you will only ever map either the full date or the year/month/day)

1172 02/27/2012 02:43 PM Aaron Marcuse-Kubitza

DwC mappings: Moved DwC1's CoordinatePrecision /_noCV/value XML func suffix to DwC2-VegBIEN.specimens.csv

1171 02/27/2012 02:38 PM Aaron Marcuse-Kubitza

mappings: Removed mappings for XML func suffixes of a path because they are now automatically created heuristically by join

1170 02/27/2012 02:37 PM Aaron Marcuse-Kubitza

join: Added heuristic search for a match on a parent path, so that every XML func suffix of a path doesn't need its own mapping
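The parent-path heuristic described in this entry might look roughly like the sketch below. It assumes paths are '/'-separated strings and that the stripped suffix is re-appended to the matched target; the function name and the sample mapping are hypothetical, and the real join script may work differently:

```python
def lookup_with_parent_fallback(mappings, path):
    """Look up path; on a miss, retry with successively shorter parent paths.

    The suffix stripped along the way is re-appended to the matched target
    (an assumption about how the suffix is carried over). Returns None if
    neither the path nor any of its parents has a mapping.
    """
    suffix = ''
    while True:
        if path in mappings:
            return mappings[path] + suffix
        parent, sep, last = path.rpartition('/')
        if not sep:
            return None  # no more parents to try
        suffix = sep + last + suffix
        path = parent


# Hypothetical mapping, for illustration only
mappings = {'CoordinatePrecision': 'coordinateUncertainty'}
hit = lookup_with_parent_fallback(mappings, 'CoordinatePrecision/_noCV/value')
miss = lookup_with_parent_fallback(mappings, 'Elevation/_alt/2')
```

Under these assumptions, only the base path needs an explicit mapping, and each XML func suffix is resolved through its parent.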

1169 02/27/2012 02:03 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1168 02/27/2012 02:01 PM Aaron Marcuse-Kubitza

vegbien.sql: Added method.pointsperline. Rearranged ERD after removing role fkeys.

1167 02/27/2012 02:00 PM Aaron Marcuse-Kubitza

filter_ERD.csv: Remove role fkeys

1166 02/27/2012 01:45 PM Aaron Marcuse-Kubitza

vegbien.sql: aggregateoccurrence: Added linecover

1165 02/27/2012 01:37 PM Aaron Marcuse-Kubitza

vegbien.sql: methodtaxonclass: Added description comment with list of values (which may become a closed list)

1164 02/27/2012 01:10 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1163 02/27/2012 01:02 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed lengthunits to m in all comments

1162 02/27/2012 12:56 PM Aaron Marcuse-Kubitza

vegbien.sql: method: Added subplotspacing and subplotmethod_id

1161 02/27/2012 12:36 PM Aaron Marcuse-Kubitza

vegbien.sql: method: Removed lengthunits and instead require all length- or area-related measurements throughout VegBIEN to be converted to SI base units, e.g. cm -> m, ha -> m^2. Adjusted ERD to avoid some densely packed lines.

1160 02/27/2012 12:17 PM Aaron Marcuse-Kubitza

vegbien.sql: methodtaxonclass: Added description field for taxon classes that don't fit well into a plantconcept. Made at least one of plantconcept_id or description required. Added unique constraint.

1159 02/27/2012 12:07 PM Aaron Marcuse-Kubitza

SALVIAS verifications: Use count(DISTINCT) instead of nested SELECT DISTINCT

1158 02/27/2012 12:05 PM Aaron Marcuse-Kubitza

VegBIEN verifications: Select only the records for the datasource being verified

1157 02/27/2012 11:46 AM Aaron Marcuse-Kubitza

SALVIAS verifications: Fixed to exclude subplots from locations/location events and uniqify locations based on coords

1156 02/27/2012 11:25 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify.sql: Updated for schema changes

1155 02/27/2012 10:24 AM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1154 02/27/2012 10:22 AM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD. (I think this will need to be manually re-marked whenever either of those tables is updated.)

1153 02/27/2012 10:18 AM Aaron Marcuse-Kubitza

vegbien.sql: Removed methodgrowthform and growthform, since growthforms can be accommodated by plantconcept in a similar way as higher-order taxonomic ranks

1152 02/27/2012 10:09 AM Aaron Marcuse-Kubitza

vegbien.sql: methodgrowthform, methodtaxonclass: Removed "included" default value so it's always obvious whether the author intended the classes to be inclusions or exclusions

1151 02/27/2012 10:04 AM Aaron Marcuse-Kubitza

vegbien.sql: aggregateoccurrence: Removed unneeded fields. Added aggregateoccurrence->coverindex fkey.

1150 02/27/2012 09:54 AM Aaron Marcuse-Kubitza

vegbien.sql: Added constraint to enforce 1:1 aggregateoccurrence:plantobservation relationship

1149 02/25/2012 08:16 PM Aaron Marcuse-Kubitza

vegbien.sql: Added plantname unique constraint

1148 02/25/2012 08:01 PM Aaron Marcuse-Kubitza

bin/map: Use new util.ListDict and util.WrapIter to simplify getting rows by column name instead of index, and to enable a row to be printed with its column names in error messages

1147 02/25/2012 08:00 PM Aaron Marcuse-Kubitza

util.py: Added WrapIter to wrap an iterator and ListDict to view a list as a dict
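As an illustration of "viewing a list as a dict", here is a minimal sketch. It is not the util.py code: the constructor signature and attribute names are assumptions, and the str() format follows the behavior described in the later r1178 entry only loosely:

```python
class ListDict:
    """Read-only dict-style view of a list (illustrative sketch only)."""

    def __init__(self, list_, keys):
        self.list_ = list_
        self.keys = keys
        # Map each key to its position so lookups by name hit the list directly
        self.key_idxs = {key: idx for idx, key in enumerate(keys)}

    def __getitem__(self, key):
        return self.list_[self.key_idxs[key]]

    def __str__(self):
        # One entry per line, in the order the keys were provided
        return '\n'.join('%s: %r' % (key, self[key]) for key in self.keys)


# Hypothetical row and column names, for illustration only
row = ListDict(['A1', 'Quercus'], ['plot', 'species'])
```

A view like this lets a row be printed together with its column names, which is the use in error messages that the r1148 entry mentions.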

1146 02/25/2012 07:38 PM Aaron Marcuse-Kubitza

bin/map: Use new util.list_flip()

1145 02/25/2012 07:37 PM Aaron Marcuse-Kubitza

util.py: Added list_flip()
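The log does not say what list_flip() does. One plausible reading, assumed here, is that it inverts a list into a value-to-index lookup, which would suit finding a column's position by name:

```python
def list_flip(list_):
    """Return a dict mapping each value in list_ to its index (assumed behavior)."""
    return {value: idx for idx, value in enumerate(list_)}


# Hypothetical column names, for illustration only
col_idxs = list_flip(['plot', 'species', 'count'])
```

If a value occurs more than once, this sketch keeps the index of its last occurrence.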

1144 02/25/2012 07:02 PM Aaron Marcuse-Kubitza

env_password: Fixed to set the environment variable in the calling shell. Do this by cc-ing the tty only on messages before the "Enter password" prompt, because the redirect creates a subshell which causes the env var to only be set within that subshell.

1143 02/25/2012 06:18 PM Aaron Marcuse-Kubitza

inputs/NYBG-CSV/maps/DwC.specimens.csv: Removed mappings that are already present in mappings/DwC1-DwC2.specimens.csv. This map now contains only the mappings where NYBG-CSV differs from standard DwC1.

1142 02/25/2012 06:14 PM Aaron Marcuse-Kubitza

inputs/NYBG/maps/DwC.specimens.csv: Removed mappings that are already present in mappings/DwC1-DwC2.specimens.csv. This map now contains only the mappings where NYBG differs from standard DwC1.

1141 02/25/2012 05:58 PM Aaron Marcuse-Kubitza

Remove accidentally-committed temp file inputs/NYBG/DwC.specimens2.csv

1140 02/25/2012 05:56 PM Aaron Marcuse-Kubitza

mappings/Makefile: Generate DwC.self.specimens.csv from DwC-VegBIEN.specimens.csv for use in creating full via maps for inputs

1139 02/25/2012 05:40 PM Aaron Marcuse-Kubitza

input.Makefile: Generate full via maps from input via maps by appending mappings from the via format to itself when available

1138 02/25/2012 04:30 PM Aaron Marcuse-Kubitza

inputs/NYBG/maps/DwC.specimens.csv: Changed label to "NYBG-DwC" to take advantage of automatic filling in of DwC mappings not specified in the NYBG map

1137 02/25/2012 04:28 PM Aaron Marcuse-Kubitza

subtract: Support custom column numbers to compare on (instead of just input col). Added ignore option to continue even if input columns don't match.

1136 02/25/2012 04:26 PM Aaron Marcuse-Kubitza

bin/map: DB inputs: Get all rows in one query (hopefully a significant optimization). Allow maps to contain entries for columns that are not in the DB table.

1135 02/25/2012 04:22 PM Aaron Marcuse-Kubitza

sql.py: select(): Select all fields if fields == None. Replaced col(cur, idx) with col_names(cur) because an iterator is easier to use than getting by index.

1134 02/25/2012 03:57 PM Aaron Marcuse-Kubitza

bin/map: Fixed bug in previous implementation of allowing maps for CSV inputs to contain entries for columns that are not in the CSV file

1133 02/25/2012 03:45 PM Aaron Marcuse-Kubitza

bin/map: Allow maps for CSV inputs to contain entries for columns that are not in the CSV file

1132 02/25/2012 02:54 PM Aaron Marcuse-Kubitza

Use new sort_map instead of manually specifying the sort order

1131 02/25/2012 02:54 PM Aaron Marcuse-Kubitza

Added sort_map to sort a map spreadsheet in the standard order

1130 02/25/2012 02:43 PM Aaron Marcuse-Kubitza

Removed no longer needed join_passthru, because join_union_sort now serves its purpose

1129 02/25/2012 02:42 PM Aaron Marcuse-Kubitza

Don't generate mappings/for_review/DwC-VegBIEN.specimens.csv because it's a derived map with lots of duplicated mappings for the various DwC versions

1128 02/25/2012 02:41 PM Aaron Marcuse-Kubitza

mappings/Makefile: Generate DwC-VegBIEN.specimens.csv directly from DwC1-DwC2 and DwC2-VegBIEN mappings by using join_union_sort with header_num=1, rather than via intermediate DwC1-VegBIEN.specimens.csv

1127 02/25/2012 02:37 PM Aaron Marcuse-Kubitza

union: Added header_num option to select which map's header to use as the output header

1126 02/25/2012 02:28 PM Aaron Marcuse-Kubitza

Rename join_sort to join_union_sort and have it run union in ignore mode. This will automatically append the joined map when the input map is a derivative of the joined map, such as for NYBG-DwC.

1125 02/25/2012 02:25 PM Aaron Marcuse-Kubitza

union: Pass through map 0, so that if ignore is set, the input map will still be output. Allow either map's input label to contain the other's input label to enable e.g. appending mappings for an older input version to those for a newer input version.

1124 02/25/2012 01:43 PM Aaron Marcuse-Kubitza

DwC1-DwC2 mapping: Changed input label to DwC1, which is allowed by the now relaxed label constraints imposed by union

1123 02/25/2012 01:42 PM Aaron Marcuse-Kubitza

union: Check if two maps can be combined based on whether map 0 column 0 label contains map 1 column 0 label instead of being equal. This allows map 0's input 0 root to contain the datasource name as well as a format that allows it to be combined with a more general map. Added ignore flag to not print an error if column labels don't match.

1122 02/25/2012 01:39 PM Aaron Marcuse-Kubitza

bin/map: Support optional data format tag in map spreadsheet labels, used by union to check if two maps can be combined

1121 02/25/2012 01:01 PM Aaron Marcuse-Kubitza

mappings: Added DwC1-DwC2.specimens.csv to core maps so it gets cleaned up

1120 02/25/2012 12:57 PM Aaron Marcuse-Kubitza

Only generate for_review mappings of core maps and end products

1119 02/25/2012 12:56 PM Aaron Marcuse-Kubitza

Generate DwC-VegBIEN mapping as union of DwC1 and DwC2 mappings

1118 02/24/2012 08:00 PM Aaron Marcuse-Kubitza

Generate DwC-VegBIEN mapping as union of DwC1 and DwC2 mappings

1117 02/24/2012 07:40 PM Aaron Marcuse-Kubitza

NYBG DB mapping: Removed IdentifiedDate and CollectedDate mappings because they are generated from the year/month/day

1116 02/24/2012 07:39 PM Aaron Marcuse-Kubitza

Added mappings/for_review/DwC1-VegBIEN.specimens.csv

1115 02/24/2012 07:35 PM Aaron Marcuse-Kubitza

Added DwC1-DwC mapping. Generate DwC1-VegBIEN mapping automatically.

1114 02/24/2012 07:11 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1113 02/24/2012 07:08 PM Aaron Marcuse-Kubitza

vegbien.sql: Renamed _keys unique constraints/unique indexes to _unique to better reflect their purpose

1112 02/24/2012 06:54 PM Aaron Marcuse-Kubitza

vegbien.sql: Added method.diameterheight to store DBH height

1111 02/24/2012 06:44 PM Aaron Marcuse-Kubitza

VegBIEN: Moved plantstatus.plantlevel to plantname.rank because the taxonomic rank is a property of the name itself

1110 02/24/2012 06:43 PM Aaron Marcuse-Kubitza

PostgreSQL-MySQL.csv: Fixed custom types translation to match shorter type names

1109 02/24/2012 06:09 PM Aaron Marcuse-Kubitza

vegbien.sql: Added plantstatus unique constraint

1108 02/24/2012 06:07 PM Aaron Marcuse-Kubitza

DwC-VegBIEN mapping: Map datasource name via DwC institutionCode