# Date Author Comment
1276 03/05/2012 03:03 PM Aaron Marcuse-Kubitza

inputs/NYBG/maps/DwC.specimens.csv: Removed mappings already present in case-insensitive DwC2 mapping

1275 03/05/2012 02:48 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Removed fields already present in DwC2.ci-VegBIEN.specimens.csv

1274 03/05/2012 02:38 PM Aaron Marcuse-Kubitza

Makefiles: Moved remake into main Makefile. Fixed remake to run `make all` in a new make so that cache of existing files is reset. Have main remake run clean and then all instead of forwarding remake to subdirs, so that everything is cleaned before everything is remade.

1273 03/05/2012 02:21 PM Aaron Marcuse-Kubitza

input.Makefile: maps: maps/$(via).%.full.csv: Fixed bug where $(selfMap) would be ignored if it had not yet been made

1272 03/05/2012 02:02 PM Aaron Marcuse-Kubitza

mappings/Makefile: Reorganized into DwC and VegX sections

1271 03/05/2012 02:02 PM Aaron Marcuse-Kubitza

Added autogenerated mappings/DwC2.ci-VegBIEN.specimens.csv. Use it to include DwC2 fields with first letter uppercased in the full DwC mapping, so that datasources that use DwC2 terms with a different case can still use the DwC2 mapping.

1270 03/05/2012 01:57 PM Aaron Marcuse-Kubitza

Added autogenerated mappings/DwC2.ci-VegBIEN.specimens.csv. Use it to include DwC2 fields with first letter uppercased in the full DwC mapping, so that datasources that use DwC2 terms with a different case can still use the DwC2 mapping.

1269 03/05/2012 01:54 PM Aaron Marcuse-Kubitza

inputs/UArizona/maps/DwC.specimens.csv: Mapped CollectedDate to eventDate/_alt/2 even though it's not used because other datasources might copy these mappings and want it already filled in

1268 03/05/2012 01:52 PM Aaron Marcuse-Kubitza

Added ucase_first to uppercase the first character of columns in a spreadsheet
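
The idea behind ucase_first can be sketched as follows. This is only an illustration, not the project's actual script: the function names and the choice to transform every cell (rather than selected columns) are assumptions.

```python
import csv
import io


def ucase_first(value):
    """Uppercase only the first character; the rest is left untouched."""
    return value[:1].upper() + value[1:]


def ucase_first_cells(text):
    """Return CSV text with ucase_first() applied to every cell.

    Hypothetical helper: the real tool may restrict itself to some columns.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow([ucase_first(cell) for cell in row])
    return out.getvalue()
```

Slicing with `value[:1]` rather than indexing with `value[0]` keeps empty cells from raising an IndexError.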

1267 03/05/2012 01:21 PM Aaron Marcuse-Kubitza

Added inputs/UArizona/maps/DwC.specimens.csv autogen maps

1266 03/05/2012 01:20 PM Aaron Marcuse-Kubitza

inputs/UArizona/maps/DwC.specimens.csv: Mapped more fields

1265 03/05/2012 01:14 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Remove date -> date/_alt/2 mappings because they prevent the original DwC2 date field from being mapped to without an extra /_alt/2 appended

1264 03/05/2012 01:10 PM Aaron Marcuse-Kubitza

xml_func.py: Use new dates.strtotime(). When component date parts specified, year defaults to dates.epoch.year.

1263 03/05/2012 01:09 PM Aaron Marcuse-Kubitza

dates.py: Added strtotime() to wrap dateutil.parser.parse() with default defaulting to epoch, so that e.g. months with day missing default to day 1 instead of the current day of the month

1262 03/05/2012 12:38 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Map eventDate,dateIdentified using /_alt/2 and year/month/day using /_alt/1 so that inputs with both a date and date parts will select between the two

1261 03/05/2012 11:43 AM Aaron Marcuse-Kubitza

input.Makefile: Added comment that self map must be made first if it's needed for maps/$(via).%.full.csv

1260 03/05/2012 11:40 AM Aaron Marcuse-Kubitza

Makefiles: Use .SECONDARY with no prerequisites instead of setting a .PRECIOUS for each intermediate, to simplify turning off automatic deletion of intermediate files

1259 03/05/2012 11:23 AM Aaron Marcuse-Kubitza

inputs/UArizona: Added initial maps/DwC.specimens.csv

1258 03/05/2012 11:10 AM Aaron Marcuse-Kubitza

DwC mappings: Map datasource name via institutionID to avoid conflicting with existing institutionCode fields that many DwC data sources have

1257 03/05/2012 10:57 AM Aaron Marcuse-Kubitza

input.Makefile: Don't profile by default because it appears to slow things down significantly on long imports

1256 03/05/2012 10:56 AM Aaron Marcuse-Kubitza

Added inputs/UArizona/maps

1255 03/03/2012 05:56 PM Aaron Marcuse-Kubitza

Makefile: python-Linux: Added python-profiler

1254 03/03/2012 05:44 PM Aaron Marcuse-Kubitza

specimens verification: Added # binomials test

1253 03/03/2012 05:35 PM Aaron Marcuse-Kubitza

vegbien.sql: specimenreplicate: Removed specimenreplicate_unique_collectionnumber index because the collectionnumber (NYBG FieldNumber) is not always unique within a collector, even though it should be. Changed specimenreplicate_unique_catalognumber to only operate on rows with no sourceaccessioncode (of which there are 8 in NYBG).

1252 03/03/2012 05:09 PM Aaron Marcuse-Kubitza

mappings/verify.specimens.sql: # species test: Fixed to join separately on taxondeterminations for genus and species. # genera test: Removed no longer needed join on party.

1251 03/03/2012 05:04 PM Aaron Marcuse-Kubitza

vegbien.sql: specimenreplicate: Added fki index on taxonoccurrence_id

1250 03/03/2012 04:25 PM Aaron Marcuse-Kubitza

vegbien.sql: plantname: Added index on rank to speed up specimens verifications, where the query planner insists on joining from plantname to specimenreplicate instead of the other way around (which takes much longer without the index)

1249 03/03/2012 03:33 PM Aaron Marcuse-Kubitza

mappings/verify.*: Use nested SELECT instead of JOIN on party to get datasource_id, so that party will not be joined on after other joins have already occurred (which slows things down)

1248 03/03/2012 03:26 PM Aaron Marcuse-Kubitza

vegbien.sql: party: Changed party_unique_name to ignore NULL values and the organizationname (a first(+middle)+last name is considered unique)

1247 03/03/2012 03:15 PM Aaron Marcuse-Kubitza

vegbien.sql: party: Added party_unique_organizationname constraint

1246 03/03/2012 02:11 PM Aaron Marcuse-Kubitza

Specimens verification: Added # genera and # species

1245 03/03/2012 01:50 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Create target dir if it doesn't exist

1244 03/03/2012 01:42 PM Aaron Marcuse-Kubitza

inputs/NYBG: Added verify/specimens.ref.sql

1243 03/03/2012 01:41 PM Aaron Marcuse-Kubitza

Added mappings/verify.specimens.sql

1242 03/03/2012 01:41 PM Aaron Marcuse-Kubitza

Added inputs/NYBG-CSV/verify/

1241 03/03/2012 01:40 PM Aaron Marcuse-Kubitza

Makefile: Print done message after verify

1240 03/03/2012 01:29 PM Aaron Marcuse-Kubitza

VegX-VegBIEN mapping: Use new lookup-only element syntax to ensure that stemtag 1 is not created if it doesn't exist when stemtag 2 tries to set its iscurrent status to false. This should fix the 136 "NullValueException: columns: tag" errors in the SALVIAS organisms import.

1239 03/03/2012 01:27 PM Aaron Marcuse-Kubitza

xpath.py: get(): Added support for lookup-only elements which are not created if they don't exist

1238 03/03/2012 01:25 PM Aaron Marcuse-Kubitza

xpath.py: parse(): Added support for lookup-only elements which are not created if they don't exist

1237 03/03/2012 01:15 PM Aaron Marcuse-Kubitza

VegX-VegBIEN mapping: Map stemtags using [] instead of :[] for attrs that are really keys

1236 03/02/2012 07:54 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1235 03/02/2012 07:52 PM Aaron Marcuse-Kubitza

VegX-VegBIEN mapping: Handle user-defined field voucherType (SALVIAS DetType) by mapping specimenreplicates for voucherTypes other than direct via voucher

1234 03/02/2012 06:58 PM Aaron Marcuse-Kubitza

xml_func.py: Added _if and _eq. Added cast() to throw SyntaxException if can't cast and use it in conv_items(). _merge: Check types of input using conv_items(strings.ustr, items).

1233 03/02/2012 06:53 PM Aaron Marcuse-Kubitza

util.py: Added all_not_none() and bool2str()

1232 03/02/2012 06:52 PM Aaron Marcuse-Kubitza

strings.py: Added ustr() (like built-in str() but converts to unicode object)

1231 03/02/2012 05:32 PM Aaron Marcuse-Kubitza

PostgreSQL-MySQL.csv: Fixed bug in removal of casts of default values, which treated NOT NULL as part of the datatype

1230 03/02/2012 05:30 PM Aaron Marcuse-Kubitza

VegBIEN: soilobs: Added default value for horizon. Adjusted mappings to remove now-unnecessary horizon value.

1229 03/02/2012 05:26 PM Aaron Marcuse-Kubitza

repl: Removed automatic case-insensitivity because Python apparently only supports turning on case-insensitivity via (?i) but not off via (?-i) (as Java does)
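
The limitation can be checked with Python's standard re module. The sketch below is not taken from repl; the helper name `compiles` and the sample patterns are made up for illustration.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A global (?i) at the start of the pattern turns case-insensitivity on.
insensitive = re.compile(r"(?i)quercus")
matched = insensitive.match("QUERCUS") is not None

# A bare (?-i), which Java accepts for turning the flag back off,
# is rejected by Python's pattern compiler.
turn_on_ok = compiles(r"(?i)quercus")
turn_off_ok = compiles(r"(?-i)quercus")
```

Because a pattern cannot switch the flag back off for itself, a tool that prepends (?i) to every pattern leaves no way to request a case-sensitive match, which is consistent with the decision to drop the automatic behavior.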

1228 03/02/2012 05:09 PM Aaron Marcuse-Kubitza

VegBIEN: soilobs: Removed soil* prefix from fields

1227 03/02/2012 05:05 PM Aaron Marcuse-Kubitza

VegX-VegBIEN mapping: Map to new soilobs fields

1226 03/02/2012 04:57 PM Aaron Marcuse-Kubitza

SALVIAS inputs: Use new _units:[units="%"] on soil fields that are percents. Replace "<..." values with 0.

1225 03/02/2012 04:55 PM Aaron Marcuse-Kubitza

xml_func.py: Added _units

1224 03/02/2012 04:30 PM Aaron Marcuse-Kubitza

vegbien.sql: soilobs: Converted user-defined fields to first-class. Labeled appropriate fields as "fraction".

1223 03/02/2012 04:08 PM Aaron Marcuse-Kubitza

VegBIEN mappings: Changed tableRecord_ID to tablerecord_id to match PostgreSQL field name

1222 03/02/2012 04:05 PM Aaron Marcuse-Kubitza

DwC2-VegBIEN mapping: Adjusted user-defined mappings

1221 03/02/2012 04:00 PM Aaron Marcuse-Kubitza

vegbien.sql: userdefined: Made userdefinedname NOT NULL. userdefined, definedvalue: Added unique constraints.

1220 03/02/2012 03:54 PM Aaron Marcuse-Kubitza

VegX-VegBIEN mapping: Mapped userdefined fields to new first-class fields

1219 03/02/2012 03:46 PM Aaron Marcuse-Kubitza

xml_func.py: Added _map and _replace

1218 03/02/2012 02:33 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1217 03/02/2012 02:30 PM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Fixed lines. Expanded truncated tables where there was room.

1216 03/02/2012 12:51 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1215 03/02/2012 12:51 PM Aaron Marcuse-Kubitza

vegbien.sql: locationevent: Added temperature and precipitation

1214 03/02/2012 12:45 PM Aaron Marcuse-Kubitza

vegbien.sql: aggregateoccurrence: Added growthform

1213 03/02/2012 12:39 PM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Reversed the locations of soiltaxon and soilobs to give soilobs room to add new fields

1212 03/02/2012 12:36 PM Aaron Marcuse-Kubitza

vegbien.sql: Removed embargo table and emb_* fields because we're using a central field, location.confidentialitystatus, for embargo information and coordinate fuzzing

1211 03/02/2012 12:22 PM Aaron Marcuse-Kubitza

vegbien.sql: stemobservation: Added heightfirstbranch

1210 03/02/2012 12:17 PM Aaron Marcuse-Kubitza

vegbien.sql: stemobservation: Added diameteraccuracy. Reordered fields.

1209 03/01/2012 05:55 PM Aaron Marcuse-Kubitza

VegBIEN: stemobservation: Renamed diameter to diameterbreastheight to be more accurate

1208 03/01/2012 05:45 PM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Expanded tables where there was room

1207 03/01/2012 05:34 PM Aaron Marcuse-Kubitza

DwC mappings: Fixed user-defined field mappings according to Brad Boyle's changes

1206 03/01/2012 05:33 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector

1205 02/28/2012 07:41 PM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

1204 02/28/2012 07:39 PM Aaron Marcuse-Kubitza

vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector

1203 02/28/2012 07:36 PM Aaron Marcuse-Kubitza

VegBIEN: Moved taxonoccurrence.verbatimcollectorname to specimenreplicate and aggregateoccurrence so that it can be used in specimenreplicate duplicate elimination

1202 02/28/2012 07:21 PM Aaron Marcuse-Kubitza

mappings/DwC1-DwC2.specimens.csv: Notes mapping: Removed extraneous /_merge/1

1201 02/28/2012 05:51 PM Aaron Marcuse-Kubitza

input.Makefile: svn_props: Removed no longer needed items from input dir svn:ignore

1200 02/28/2012 05:49 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Fixed bug for inputs without a .ref where $(wildcard) wouldn't recheck the file after verify/%.out is run, so the verify output wasn't printed

1199 02/28/2012 05:45 PM Aaron Marcuse-Kubitza

input.Makefile: Moved verify files into separate subdir

1198 02/28/2012 04:30 PM Aaron Marcuse-Kubitza

bin/map: Changed root label data format convention to datasrc[data_format] so datasource names containing hyphens would not have the part after the - treated as the data format
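
A minimal sketch of parsing a label in the datasrc[data_format] form. The function name, the exact grammar, and treating the bracketed part as optional are assumptions, not bin/map's actual code.

```python
import re

# Hypothetical grammar: a name without brackets, optionally followed by
# a data format in square brackets.
_ROOT_LABEL = re.compile(r"(?P<datasrc>[^\[\]]+)(?:\[(?P<data_format>[^\[\]]*)\])?")


def parse_root_label(label):
    """Split 'datasrc[data_format]' into (datasrc, data_format or None)."""
    match = _ROOT_LABEL.fullmatch(label)
    if match is None:
        raise ValueError("malformed root label: %r" % label)
    return match.group("datasrc"), match.group("data_format")
```

With brackets as the delimiter, a hyphenated name such as NYBG-CSV stays in one piece: parse_root_label("NYBG-CSV[DwC]") yields the datasource NYBG-CSV and the format DwC.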

1197 02/28/2012 04:25 PM Aaron Marcuse-Kubitza

inputs maps: Changed input root labels to match dir names since verify expects these to be the same

1196 02/28/2012 04:22 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Fixed bug where datasource name was not set for non-DB inputs

1195 02/28/2012 04:18 PM Aaron Marcuse-Kubitza

input.Makefile: Removed no longer needed default verify action for dirs with no verify.ref's

1194 02/28/2012 04:15 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Made verifications table-specific

1193 02/28/2012 03:27 PM Aaron Marcuse-Kubitza

input.Makefile: import: Merged import and import-all because they do the same thing

1192 02/28/2012 03:26 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Started rearranging to allow different verifies for each table

1191 02/28/2012 03:19 PM Aaron Marcuse-Kubitza

Moved verify.sql to mappings since it's mapping-related

1190 02/28/2012 02:31 PM Aaron Marcuse-Kubitza

input.Makefile: Changed option nolog to log so that options aren't specified in the negative

1189 02/28/2012 01:43 PM Aaron Marcuse-Kubitza

input.Makefile: svn ignore .trace files

1188 02/28/2012 01:41 PM Aaron Marcuse-Kubitza

input.Makefile: Profile imports into a .trace file unless env var profile=""

1187 02/28/2012 01:28 PM Aaron Marcuse-Kubitza

xml_func.py: _alt: On empty input, return None instead of raising SyntaxException because empty input should be OK

1186 02/27/2012 05:37 PM Aaron Marcuse-Kubitza

xml_func.py: _alt: Fixed bug where not specifying any item would crash the program instead of raising a SyntaxException

1185 02/27/2012 05:33 PM Aaron Marcuse-Kubitza

Factored verify.sql out into schemas dir

1184 02/27/2012 05:26 PM Aaron Marcuse-Kubitza

input.Makefile: verify: Print diff in two columns if verbose=1

1183 02/27/2012 05:03 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify.sql: When filtering by datasource name, use an AND clause in the JOIN party's ON condition instead of a separate WHERE statement, so that the datasource filtering code is all on the same line

1182 02/27/2012 04:58 PM Aaron Marcuse-Kubitza

inputs/SALVIAS/verify.sql: Use new :datasource variable instead of literal 'SALVIAS'

1181 02/27/2012 04:58 PM Aaron Marcuse-Kubitza

input.Makefile: Provide the verify.sql script a :datasource variable set to the datasource name (in quotes)

1180 02/27/2012 04:39 PM Aaron Marcuse-Kubitza

vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD

1179 02/27/2012 03:55 PM Aaron Marcuse-Kubitza

bin/map: DB, CSV inputs: Use column indexes instead of column names to look up each field (optimization to avoid repeated dict lookups of the same key)
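
The optimization can be illustrated roughly as below. This is a sketch under assumptions: `select_fields` and the sample column names are invented for the example and are not bin/map's real interface.

```python
import csv
import io


def select_fields(rows, wanted):
    """Yield the wanted fields of each row, in the order requested.

    Column names are resolved to positions once, from the header row,
    so the per-row work is plain list indexing instead of a name lookup.
    """
    rows = iter(rows)
    header = next(rows)
    indexes = [header.index(name) for name in wanted]  # done only once
    for row in rows:
        yield [row[i] for i in indexes]


# Illustrative input; the column names are examples only.
sample = "catalogNumber,genus,species\n123,Quercus,alba\n456,Acer,rubrum\n"
records = list(
    select_fields(csv.reader(io.StringIO(sample)), ["genus", "catalogNumber"])
)
```

The saving per row is small, but it is repeated for every field of every row, so it adds up on long imports.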

1178 02/27/2012 03:47 PM Aaron Marcuse-Kubitza

util.py: ListDict: str(): Print each entry on its own line, in the order the keys were provided

1177 02/27/2012 03:37 PM Aaron Marcuse-Kubitza

NYBG-DwC maps: Filter out MinimumElevation = "."