Changes - BIEN 3 - NCEAS Projects

root @ 1265

#	Date	Author	Comment
1265	03/05/2012 01:14 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Remove date -> date/_alt/2 mappings because they prevent the original DwC2 date field from being mapped to without an extra /_alt/2 appended
1264	03/05/2012 01:10 PM	Aaron Marcuse-Kubitza	xml_func.py: Use new dates.strtotime(). When component date parts specified, year defaults to dates.epoch.year.
1263	03/05/2012 01:09 PM	Aaron Marcuse-Kubitza	dates.py: Added strtotime() to wrap dateutil.parser.parse() with default defaulting to epoch, so that e.g. months with day missing default to day 1 instead of the current day of the month
1262	03/05/2012 12:38 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Map eventDate,dateIdentified using /_alt/2 and year/month/day using /_alt/1 so that inputs with both a date and date parts will select between the two
1261	03/05/2012 11:43 AM	Aaron Marcuse-Kubitza	input.Makefile: Added comment that self map must be made first if it's needed for maps/$(via).%.full.csv
1260	03/05/2012 11:40 AM	Aaron Marcuse-Kubitza	Makefiles: Use .SECONDARY with no prerequisites instead of setting a .PRECIOUS for each intermediate, to simplify turning off automatic deletion of intermediate files
1259	03/05/2012 11:23 AM	Aaron Marcuse-Kubitza	inputs/UArizona: Added initial maps/DwC.specimens.csv
1258	03/05/2012 11:10 AM	Aaron Marcuse-Kubitza	DwC mappings: Map datasource name via institutionID to avoid conflicting with existing institutionCode fields that many DwC data sources have
1257	03/05/2012 10:57 AM	Aaron Marcuse-Kubitza	input.Makefile: Don't profile by default because it appears to slow things down significantly on long imports
1256	03/05/2012 10:56 AM	Aaron Marcuse-Kubitza	Added inputs/UArizona/maps
1255	03/03/2012 05:56 PM	Aaron Marcuse-Kubitza	Makefile: python-Linux: Added python-profiler
1254	03/03/2012 05:44 PM	Aaron Marcuse-Kubitza	specimens verification: Added # binomials test
1253	03/03/2012 05:35 PM	Aaron Marcuse-Kubitza	vegbien.sql: specimenreplicate: Removed specimenreplicate_unique_collectionnumber index because the collectionnumber (NYBG FieldNumber) is not always unique within a collector, even though it should be. Changed specimenreplicate_unique_catalognumber to only operate on rows with no sourceaccessioncode (of which there are 8 in NYBG).
1252	03/03/2012 05:09 PM	Aaron Marcuse-Kubitza	mappings/verify.specimens.sql: # species test: Fixed to join separately on taxondeterminations for genus and species. # genera test: Removed no longer needed join on party.
1251	03/03/2012 05:04 PM	Aaron Marcuse-Kubitza	vegbien.sql: specimenreplicate: Added fki index on taxonoccurrence_id
1250	03/03/2012 04:25 PM	Aaron Marcuse-Kubitza	vegbien.sql: plantname: Added index on rank to speed up specimens verifications, where the query planner insists on joining from plantname to specimenreplicate instead of the other way around (which takes much longer without the index)
1249	03/03/2012 03:33 PM	Aaron Marcuse-Kubitza	mappings/verify.*: Use nested SELECT instead of JOIN on party to get datasource_id, so that party will not be joined on after other joins have already occurred (which slows things down)
1248	03/03/2012 03:26 PM	Aaron Marcuse-Kubitza	vegbien.sql: party: Changed party_unique_name to ignore NULL values and the organizationname (a first(+middle)+last name is considered unique)
1247	03/03/2012 03:15 PM	Aaron Marcuse-Kubitza	vegbien.sql: party: Added party_unique_organizationname constraint
1246	03/03/2012 02:11 PM	Aaron Marcuse-Kubitza	Specimens verification: Added # genera and # species
1245	03/03/2012 01:50 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Create target dir if it doesn't exist
1244	03/03/2012 01:42 PM	Aaron Marcuse-Kubitza	inputs/NYBG: Added verify/specimens.ref.sql
1243	03/03/2012 01:41 PM	Aaron Marcuse-Kubitza	Added mappings/verify.specimens.sql
1242	03/03/2012 01:41 PM	Aaron Marcuse-Kubitza	Added inputs/NYBG-CSV/verify/
1241	03/03/2012 01:40 PM	Aaron Marcuse-Kubitza	Makefile: Print done message after verify
1240	03/03/2012 01:29 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN mapping: Use new lookup-only element syntax to ensure that stemtag 1 is not created if it doesn't exist when stemtag 2 tries to set its iscurrent status to false. This should fix the 136 "NullValueException: columns: tag" errors in the SALVIAS organisms import.
1239	03/03/2012 01:27 PM	Aaron Marcuse-Kubitza	xpath.py: get(): Added support for lookup-only elements which are not created if they don't exist
1238	03/03/2012 01:25 PM	Aaron Marcuse-Kubitza	xpath.py: parse(): Added support for lookup-only elements which are not created if they don't exist
1237	03/03/2012 01:15 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN mapping: Map stemtags using [] instead of :[] for attrs that are really keys
1236	03/02/2012 07:54 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1235	03/02/2012 07:52 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN mapping: Handle user-defined field voucherType (SALVIAS DetType) by mapping specimenreplicates for voucherTypes other than direct via voucher
1234	03/02/2012 06:58 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _if and _eq. Added cast() to throw SyntaxException if can't cast and use it in conv_items(). _merge: Check types of input using conv_items(strings.ustr, items).
1233	03/02/2012 06:53 PM	Aaron Marcuse-Kubitza	util.py: Added all_not_none() and bool2str()
1232	03/02/2012 06:52 PM	Aaron Marcuse-Kubitza	strings.py: Added ustr() (like built-in str() but converts to unicode object)
1231	03/02/2012 05:32 PM	Aaron Marcuse-Kubitza	PostgreSQL-MySQL.csv: Fixed bug in removal of casts of default values, which treated NOT NULL as part of the datatype
1230	03/02/2012 05:30 PM	Aaron Marcuse-Kubitza	VegBIEN: soilobs: Added default value for horizon. Adjusted mappings to remove now-unnecessary horizon value.
1229	03/02/2012 05:26 PM	Aaron Marcuse-Kubitza	repl: Removed automatic case-insensitivity because Python apparently only supports turning on case-insensitivity via (?i) but not off via (?-i) (as Java does)
1228	03/02/2012 05:09 PM	Aaron Marcuse-Kubitza	VegBIEN: soilobs: Removed soil* prefix from fields
1227	03/02/2012 05:05 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN mapping: Map to new soilobs fields
1226	03/02/2012 04:57 PM	Aaron Marcuse-Kubitza	SALVIAS inputs: Use new _units:[units="%"] on soil fields that are percents. Replace "<..." values with 0.
1225	03/02/2012 04:55 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _units
1224	03/02/2012 04:30 PM	Aaron Marcuse-Kubitza	vegbien.sql: soilobs: Converted user-defined fields to first-class. Labeled appropriate fields as "fraction".
1223	03/02/2012 04:08 PM	Aaron Marcuse-Kubitza	VegBIEN mappings: Changed tableRecord_ID to tablerecord_id to match PostgreSQL field name
1222	03/02/2012 04:05 PM	Aaron Marcuse-Kubitza	DwC2-VegBIEN mapping: Adjusted user-defined mappings
1221	03/02/2012 04:00 PM	Aaron Marcuse-Kubitza	vegbien.sql: userdefined: Made userdefinedname NOT NULL. userdefined, definedvalue: Added unique constraints.
1220	03/02/2012 03:54 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN mapping: Mapped userdefined fields to new first-class fields
1219	03/02/2012 03:46 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _map and _replace
1218	03/02/2012 02:33 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1217	03/02/2012 02:30 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Fixed lines. Expanded truncated tables where there was room.
1216	03/02/2012 12:51 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1215	03/02/2012 12:51 PM	Aaron Marcuse-Kubitza	vegbien.sql: locationevent: Added temperature and precipitation
1214	03/02/2012 12:45 PM	Aaron Marcuse-Kubitza	vegbien.sql: aggregateoccurrence: Added growthform
1213	03/02/2012 12:39 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Reversed the locations of soiltaxon and soilobs to give soilobs room to add new fields
1212	03/02/2012 12:36 PM	Aaron Marcuse-Kubitza	vegbien.sql: Removed embargo table and emb_* fields because we're using a central field, location.confidentialitystatus, for embargo information and coordinate fuzzing
1211	03/02/2012 12:22 PM	Aaron Marcuse-Kubitza	vegbien.sql: stemobservation: Added heightfirstbranch
1210	03/02/2012 12:17 PM	Aaron Marcuse-Kubitza	vegbien.sql: stemobservation: Added diameteraccuracy. Reordered fields.
1209	03/01/2012 05:55 PM	Aaron Marcuse-Kubitza	VegBIEN: stemobservation: Renamed diameter to diameterbreastheight to be more accurate
1208	03/01/2012 05:45 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Expanded tables where there was room
1207	03/01/2012 05:34 PM	Aaron Marcuse-Kubitza	DwC mappings: Fixed user-defined field mappings according to Brad Boyle's changes
1206	03/01/2012 05:33 PM	Aaron Marcuse-Kubitza	vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector
1205	02/28/2012 07:41 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1204	02/28/2012 07:39 PM	Aaron Marcuse-Kubitza	vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector
1203	02/28/2012 07:36 PM	Aaron Marcuse-Kubitza	VegBIEN: Moved taxonoccurrence.verbatimcollectorname to specimenreplicate and aggregateoccurrence so that it can be used in specimenreplicate duplicate elimination
1202	02/28/2012 07:21 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Notes mapping: Removed extraneous /_merge/1
1201	02/28/2012 05:51 PM	Aaron Marcuse-Kubitza	input.Makefile: svn_props: Removed no longer needed items from input dir svn:ignore
1200	02/28/2012 05:49 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Fixed bug for inputs without a .ref where $(wildcard) wouldn't recheck the file after verify/%.out is run, so the verify output wasn't printed
1199	02/28/2012 05:45 PM	Aaron Marcuse-Kubitza	input.Makefile: Moved verify files into separate subdir
1198	02/28/2012 04:30 PM	Aaron Marcuse-Kubitza	bin/map: Changed root label data format convention to datasrc[data_format] so datasource names containing hyphens would not have the part after the - treated as the data format
1197	02/28/2012 04:25 PM	Aaron Marcuse-Kubitza	inputs maps: Changed input root labels to match dir names since verify expects these to be the same
1196	02/28/2012 04:22 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Fixed bug where datasource name was not set for non-DB inputs
1195	02/28/2012 04:18 PM	Aaron Marcuse-Kubitza	input.Makefile: Removed no longer needed default verify action for dirs with no verify.ref's
1194	02/28/2012 04:15 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Made verifications table-specific
1193	02/28/2012 03:27 PM	Aaron Marcuse-Kubitza	input.Makefile: import: Merged import and import-all because they do the same thing
1192	02/28/2012 03:26 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Started rearranging to allow different verifies for each table
1191	02/28/2012 03:19 PM	Aaron Marcuse-Kubitza	Moved verify.sql to mappings since it's mapping-related
1190	02/28/2012 02:31 PM	Aaron Marcuse-Kubitza	input.Makefile: Changed option nolog to log so that options aren't specified in the negative
1189	02/28/2012 01:43 PM	Aaron Marcuse-Kubitza	input.Makefile: svn ignore .trace files
1188	02/28/2012 01:41 PM	Aaron Marcuse-Kubitza	input.Makefile: Profile imports into a .trace file unless env var profile=""
1187	02/28/2012 01:28 PM	Aaron Marcuse-Kubitza	xml_func.py: _alt: On empty input, return None instead of raising SyntaxException because empty input should be OK
1186	02/27/2012 05:37 PM	Aaron Marcuse-Kubitza	xml_func.py: _alt: Fixed bug where not specifying any item would crash the program instead of raising a SyntaxException
1185	02/27/2012 05:33 PM	Aaron Marcuse-Kubitza	Factored verify.sql out into schemas dir
1184	02/27/2012 05:26 PM	Aaron Marcuse-Kubitza	input.Makefile: verify: Print diff in two columns if verbose=1
1183	02/27/2012 05:03 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/verify.sql: When filtering by datasource name, use an AND clause in the JOIN party's ON condition instead of a separate WHERE statement, so that the datasource filtering code is all on the same line
1182	02/27/2012 04:58 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/verify.sql: Use new :datasource variable instead of literal 'SALVIAS'
1181	02/27/2012 04:58 PM	Aaron Marcuse-Kubitza	input.Makefile: Provide the verify.sql script a :datasource variable set to the datasource name (in quotes)
1180	02/27/2012 04:39 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD
1179	02/27/2012 03:55 PM	Aaron Marcuse-Kubitza	bin/map: DB, CSV inputs: Use column indexes instead of column names to look up each field (optimization to avoid repeated dict lookups of the same key)
1178	02/27/2012 03:47 PM	Aaron Marcuse-Kubitza	util.py: ListDict: str(): Print each entry on its own line, in the order the keys were provided
1177	02/27/2012 03:37 PM	Aaron Marcuse-Kubitza	NYBG-DwC maps: Filter out MinimumElevation = "."
1176	02/27/2012 03:37 PM	Aaron Marcuse-Kubitza	xml_dom.py: NodeTextEntryIter: Filter out empty entries (instead of producing an entry with an explicit None value, which causes problems with XML funcs that can't handle Nones)
1175	02/27/2012 03:34 PM	Aaron Marcuse-Kubitza	NYBG-DwC maps: Map to input fields with XML func appended whenever possible (DwC1->DwC2 translation is done by DwC-VegBIEN.specimens.csv)
1174	02/27/2012 02:57 PM	Aaron Marcuse-Kubitza	vegbien.sql: Renamed methodtaxonclass.description to methodtaxonclass.taxonclass and changed it to a closed list (enum taxonclass). method.description can still be used for freeform taxonclass inclusions/exclusions.
1173	02/27/2012 02:47 PM	Aaron Marcuse-Kubitza	DwC1-DwC2.specimens.csv: Removed no longer needed /_alt/2 XML func from date mappings (you will only ever map either the full date or the year/month/day)
1172	02/27/2012 02:43 PM	Aaron Marcuse-Kubitza	DwC mappings: Moved DwC1's CoordinatePrecision /_noCV/value XML func suffix to DwC2-VegBIEN.specimens.csv
1171	02/27/2012 02:38 PM	Aaron Marcuse-Kubitza	mappings: Removed mappings for XML func suffixes of a path because they are now automatically created heuristically by join
1170	02/27/2012 02:37 PM	Aaron Marcuse-Kubitza	join: Added heuristic search for a match on a parent path, so that every XML func suffix of a path doesn't need its own mapping
1169	02/27/2012 02:03 PM	Aaron Marcuse-Kubitza	Regenerated vegbien.ERD exports
1168	02/27/2012 02:01 PM	Aaron Marcuse-Kubitza	vegbien.sql: Added method.pointsperline. Rearranged ERD after removing role fkeys.
1167	02/27/2012 02:00 PM	Aaron Marcuse-Kubitza	filter_ERD.csv: Remove role fkeys
1166	02/27/2012 01:45 PM	Aaron Marcuse-Kubitza	vegbien.sql: aggregateoccurrence: Added linecover
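
Revisions 1263 and 1264 describe defaulting missing date parts to the epoch rather than to the current date, so that a month with no day resolves to day 1. According to the log, the actual code wraps dateutil.parser.parse(); the sketch below only illustrates the defaulting rule using the standard library. The names EPOCH and date_from_parts are hypothetical, the epoch value of 1970-01-01 is an assumption, and returning None when no part is given is an assumption as well.

```python
import datetime

# Assumption: the epoch used for defaults is 1970-01-01 (not confirmed by the log).
EPOCH = datetime.date(1970, 1, 1)


def date_from_parts(year=None, month=None, day=None):
    """Build a date from optional parts, filling gaps from EPOCH instead of today.

    Hypothetical helper for illustration; it is not the project's dates.strtotime().
    """
    if year is None and month is None and day is None:
        return None  # assumption: nothing to build a date from
    return datetime.date(
        EPOCH.year if year is None else year,
        EPOCH.month if month is None else month,
        EPOCH.day if day is None else day,
    )


if __name__ == "__main__":
    # Month given without a day: the day comes from the epoch, not from today.
    print(date_from_parts(2012, 3))
    # Month and day given without a year: the year comes from the epoch.
    print(date_from_parts(month=3, day=5))
```

Defaulting to a fixed epoch keeps the result reproducible: the same input yields the same date no matter when the import runs, which a default of "today" would not.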
