/ - Changes - BIEN 3 - NCEAS Projects

root @ 4369

#	Date	Author	Comment
4369	08/30/2012 10:14 AM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added TaxonOccurrence/ and its joined tables
4368	08/30/2012 10:13 AM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added TaxonOccurrence/ and its joined tables
4367	08/30/2012 10:06 AM	Aaron Marcuse-Kubitza	inputs/CTFS/_archive/Organism.VegX/README.TXT: Added calculation of StemObservation rows distribution for each plot, which indicates that the bci plot actually contains 90% of the StemObservation rows. This brings the size inflation of VegX down to ~6x.
4366	08/30/2012 09:42 AM	Aaron Marcuse-Kubitza	inputs/CTFS/_archive/Organism.VegX/: Added README.TXT describing that this VegX export includes only one of 157 CTFS plots. This is important, because it indicates that VegX creates a ~1000x (!) increase in storage size (613.6 MB for bci.sql with 157 plots vs. 3.78 GB for VegX_CTFS_row_*.xml with 1 plot, assuming roughly equal #s of stems per plot).
4365	08/30/2012 09:08 AM	Aaron Marcuse-Kubitza	inputs/CTFS/StemObservation/map.csv: Remapped StemID to authorStemCode since it's only unique within the parent organism (Tree), not a globally unique ID as is required for stemID
4364	08/30/2012 09:05 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped authorStemCode
4363	08/30/2012 08:58 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added authorStemCode
4362	08/30/2012 08:58 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped stemID
4361	08/30/2012 08:52 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/2.stems/map.csv: Mapped stem_id
4360	08/30/2012 08:46 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Added steps to install any MySQL export
4359	08/30/2012 08:13 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped stemID
4358	08/30/2012 08:10 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped stem_id
4357	08/30/2012 08:05 AM	Aaron Marcuse-Kubitza	repl: Support treating all patterns as plain text (non-regexp)
4356	08/30/2012 07:52 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stem_id
4355	08/30/2012 07:51 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stemID
4354	08/30/2012 07:44 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped speciesName, subSpeciesName
4353	08/30/2012 07:43 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS taxonomic name columns
4352	08/30/2012 07:28 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Removed comments not applicable to the term itself
4351	08/30/2012 07:25 AM	Aaron Marcuse-Kubitza	Inputs with multiple tables: Added explicit import_order.txt files, so that sort orders can later be removed from the subdir names
4350	08/29/2012 11:17 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added StemObservation/ and tables it is joined from
4349	08/29/2012 11:09 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped stemTag
4348	08/29/2012 11:08 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stemTag
4347	08/29/2012 11:04 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped DBH
4346	08/29/2012 11:02 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added DBH
4345	08/29/2012 10:58 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Added comment that you cannot make a subdir separately from the entire datasource dir
4344	08/29/2012 10:17 PM	Aaron Marcuse-Kubitza	inputs/CTFS/Plot/create.sql: Added newline at end of file
4343	08/29/2012 10:04 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Renamed Site.src to Plot.src to use a VegCSV name for the table
4342	08/29/2012 10:01 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data for each table: `make inputs/<datasrc>/<table>/add`: Added note explaining why you need to use this command instead of just creating an empty directory of the desired name
4341	08/29/2012 08:44 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added SubplotObservation/
4340	08/29/2012 08:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Redirect eventID, fieldNumber (authoreventcode) to parent locationevent when subplot columns exist
4339	08/29/2012 08:23 PM	Aaron Marcuse-Kubitza	inputs/CTFS/import_order.txt: Added PlotObservation
4338	08/29/2012 08:23 PM	Aaron Marcuse-Kubitza	inputs/CTFS/PlotObservation/: Remade (hadn't been automatically remade because it wasn't part of import_order.txt)
4337	08/29/2012 08:13 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Also redirect locationID/plotName to parent location if subplotID column was provided
4336	08/29/2012 08:08 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Use _first to remove specimens-related alternatives for this field from consideration when plots-related alternatives exist. This avoids unintentionally using specimens-related columns for this field in plots data.
4335	08/29/2012 08:06 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _first() simplifying function
4334	08/29/2012 08:05 PM	Aaron Marcuse-Kubitza	xml_func.py: Added helper functions variadic_args() and map_names()
4333	08/29/2012 07:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Placed inside "if subplot" _if statement along with sourceaccessioncode to reduce the number of separate _if statements needing a condition mapping
4332	08/29/2012 07:32 PM	Aaron Marcuse-Kubitza	xml_dom.py: NodeEntryIter: Support entries with multiple children
4331	08/29/2012 07:20 PM	Aaron Marcuse-Kubitza	xml_dom.py: replace(): Support a list of new nodes to replace the old node with
4330	08/29/2012 07:01 PM	Aaron Marcuse-Kubitza	xml_dom.py: Moved only_child() near related method has_one_child()
4329	08/29/2012 07:00 PM	Aaron Marcuse-Kubitza	xml_dom.py: only_child(): Raise exception instead of failing assertion. Include invalid node in exception message for easier debugging.
4328	08/29/2012 06:57 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added only_child() and use it where its definition was used
4327	08/29/2012 06:33 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Changed _merge to _join wherever the duplicate-eliminating functionality of _merge is not needed and a simple concatenation of non-NULL values is sufficient
4326	08/29/2012 06:24 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _join() simplifying function
4325	08/29/2012 06:22 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _join()
4324	08/29/2012 06:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Moved "if subplot" _if statement around /location/parent_id and /location/sourceaccessioncode themselves, so that only one _if cond mapping for subplot is needed. Note that this is only possible because this _if statement uses _exists, allowing it to be fully evaluated by the XML template simplifying mechanism, which supports subtrees as arguments to _if.
4323	08/29/2012 06:06 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed no longer used parentLocationID, parentPlotName (locationID and plotName now automatically map to the correct location). mappings/Veg+-VegCore.csv: Removed no longer used parentPlotID.
4322	08/29/2012 05:57 PM	Aaron Marcuse-Kubitza	xml_func.py: passthru(): Use xml_dom.prune() so that after empty children are removed, the node itself is also removed if it's empty. This enables further pruning of any node that contains the pruned node.
4321	08/29/2012 05:55 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added prune()
4320	08/29/2012 05:52 PM	Aaron Marcuse-Kubitza	xml_func.py: Removed no longer used prune() (use xml_dom.prune_children() instead)
4319	08/29/2012 05:51 PM	Aaron Marcuse-Kubitza	xml_func.py: Use new xml_dom.prune_children()
4318	08/29/2012 05:51 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added prune_empty() and prune_children()
4317	08/29/2012 05:29 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved VegX export subdir to _archive and renamed it to remove ".disabled" suffix and have a VegCSV-like name
4316	08/29/2012 05:24 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Renamed README.TXT to DFtemp.analysis_query.txt because it relates only to a particular query from Shash, and moved it to the _archive/ subdir
4315	08/29/2012 05:21 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved source files into new _src/ subdir to avoid cluttering up the main dir
4314	08/29/2012 05:16 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/_src/
4313	08/29/2012 05:02 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added non-data files that weren't under version control
4312	08/29/2012 04:59 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved _scripts_to_drop_extra_tables to _archive because they are for a different version of the CTFS database than the extract we received (bci.sql)
4311	08/29/2012 04:57 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved DBv5.txt to _archive because it's for a different version of the CTFS database than the extract we received (bci.sql)
4310	08/29/2012 04:49 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved CTFS_conversion_bci.php to _archive since it's just for the DFtemp (aggregated) mapping
4309	08/29/2012 04:48 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/_archive
4308	08/29/2012 04:39 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4307	08/28/2012 07:56 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/PlotObservation/
4306	08/28/2012 07:54 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: fieldNumber (authoreventcode): Don't copy to location.authorlocationcode if an actual locationID was specified
4305	08/28/2012 07:51 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Removed no longer needed pass-through optimizations for XML functions, which are now handled by each function's own simplifying function
4304	08/28/2012 07:50 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _name simplifying function
4303	08/28/2012 07:48 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _alt, _merge simplifying functions
4302	08/28/2012 07:45 PM	Aaron Marcuse-Kubitza	xml_func.py: passthru(): First prune the node
4301	08/28/2012 07:43 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Use new passthru()
4300	08/28/2012 07:43 PM	Aaron Marcuse-Kubitza	xml_func.py: Added passthru()
4299	08/28/2012 07:36 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Use new prune()
4298	08/28/2012 07:36 PM	Aaron Marcuse-Kubitza	xml_func.py: Added prune()
4297	08/28/2012 07:26 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped eventID
4296	08/28/2012 07:24 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS Census terms
4295	08/28/2012 07:20 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS Census terms
4294	08/28/2012 07:17 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Changed plotEventStartDate, plotEventEndDate to startDate, endDate because a date range always applies to the event
4293	08/28/2012 07:13 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added startDate, endDate
4292	08/28/2012 06:59 PM	Aaron Marcuse-Kubitza	README.TXT: Testing: Mapping process: Added command to include column-based import tests
4291	08/28/2012 06:49 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Update vegbiendev: Added step to run the tests, to make sure the staging tables were installed properly
4290	08/28/2012 06:45 PM	Aaron Marcuse-Kubitza	inputs/CTFS/Plot/: Added create.sql
4289	08/28/2012 06:44 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added import_order.txt
4288	08/28/2012 06:40 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/Subplot/
4287	08/28/2012 06:36 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS QuadratID
4286	08/28/2012 06:26 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped subplotID
4285	08/28/2012 06:24 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added subplotID
4284	08/28/2012 06:22 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS Quadrat columns
4283	08/28/2012 06:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped subplotX, subplotY
4282	08/28/2012 06:14 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed empty mappings for unmapped DwC terms because these terms are now listed and maintained in mappings/Veg+.terms.csv
4281	08/28/2012 06:12 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added Brad's descriptive comments for several VegCore terms
4280	08/28/2012 06:07 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added subplotX, subplotY
4279	08/28/2012 06:03 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Made organismX, organismY the official VegCore terms and map relativePlotX, relativePlotY to them in mappings/Veg+-VegCore.csv
4278	08/28/2012 06:00 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added organismX, organismY as clearer alternatives to relativePlotX, relativePlotY
4277	08/28/2012 05:48 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS Quadrat columns
4276	08/28/2012 05:38 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/
4275	08/28/2012 05:36 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Only run column-based tests if column-based mode enabled, because these tests are much slower than the row-based tests for small numbers of rows. Note that this involves explicitly turning off column-based mode in the row-based test, to prevent propagation of the by_col env var which both enables these extra tests and sets bin/map to run in column-based mode.
4274	08/28/2012 05:28 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Added by-column test, which is compared to the row-based test's accepted output
4273	08/28/2012 05:20 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Merged $(runTest) and $(test2Db) because all tests go to the database
4272	08/28/2012 05:19 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Moved `$(foreach use_staged,1,...)` from $(test2Db) to $(runTest) because all tests now use the staging tables
4271	08/28/2012 05:15 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Merged $(test2Db) and $(testStaged2Db) because all tests now use the staging tables
4270	08/28/2012 05:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: $(runTest): Always use $(map2db) because there are no tests that use other programs (and there haven't been any in a while)
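
The storage-size estimates in revisions 4366 and 4367 can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in those two log entries. The variable and function names are illustrative and not taken from the repository, and the decimal MB-per-GB conversion is an assumption, since the log does not say which unit convention was used.

```python
# Figures quoted in the log entries for revisions 4366 and 4367.
SQL_DUMP_MB = 613.6     # bci.sql, covering all 157 plots
VEGX_EXPORT_GB = 3.78   # VegX_CTFS_row_*.xml, covering one plot (bci)
N_PLOTS = 157
BCI_ROW_SHARE = 0.90    # share of StemObservation rows in the bci plot

MB_PER_GB = 1000        # assumption: decimal units (not stated in the log)


def inflation(vegx_mb, sql_mb_for_same_data):
    """Ratio of VegX size to the SQL size holding the same data."""
    return vegx_mb / sql_mb_for_same_data


vegx_mb = VEGX_EXPORT_GB * MB_PER_GB

# r4366 assumption: stems are spread evenly, so one plot is 1/157 of the dump.
equal_plots = inflation(vegx_mb, SQL_DUMP_MB / N_PLOTS)

# r4367 correction: the bci plot alone holds about 90% of the rows.
bci_weighted = inflation(vegx_mb, SQL_DUMP_MB * BCI_ROW_SHARE)

print(f"equal-plot estimate: ~{equal_plots:.0f}x")
print(f"bci-weighted estimate: ~{bci_weighted:.1f}x")
```

Under these assumptions the first estimate lands near the ~1000x quoted in revision 4366, and the second falls in the single digits, the same order as the ~6x quoted in revision 4367. The exact second figure depends on the unit convention and on how the 90% share is applied.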
