/ - Changes - BIEN 3 - NCEAS Projects

root @ 4392

#	Date	Author	Comment
4392	08/31/2012 08:15 PM	Aaron Marcuse-Kubitza	schemas/Makefile: Added analytical_db target
4391	08/31/2012 08:09 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added make_analytical_db() and helper view analytical_db_view. Note that adding a view which depends on other tables will cause those tables to be reordered in dependency order to appear before the view, causing the svn diff to change completely even though the DB structure has only been added to.
4390	08/31/2012 08:05 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Removed OIDs from tables because we don't use them (tables have primary keys instead)
4389	08/31/2012 02:23 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import. This now includes CTFS.TaxonOccurrence (presence-only observations), FIA (11 million rows!), and Madidi.Organism. The addition of FIA almost doubles the # of rows to 26 million and increases the import time from 9.5 to 11.5 hours.
4388	08/30/2012 04:54 PM	Aaron Marcuse-Kubitza	sql_io.py: null_strs: Added 'UNKNOWN'
4387	08/30/2012 04:02 PM	Aaron Marcuse-Kubitza	Added inputs/FIA/
4386	08/30/2012 12:45 PM	Aaron Marcuse-Kubitza	inputs/: Renamed subfolders to VegCSV names, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-to-VegCSV-names>
4385	08/30/2012 12:37 PM	Aaron Marcuse-Kubitza	inputs/Madidi/1.organisms/map.csv: Mapped columns
4384	08/30/2012 11:46 AM	Aaron Marcuse-Kubitza	inputs/Madidi/0.plots/map.csv: Remapped DMS Latitude/Longitude to verbatimLatitude/verbatimLongitude, since this is not the decimalLatitude/decimalLongitude
4383	08/30/2012 11:40 AM	Aaron Marcuse-Kubitza	input.Makefile: Testing: %-ok: Rename the test output to the accepted test output instead of copying it, because outputs of successful (including newly accepted) tests should be removed to reduce clutter (as $(runTest) does)
4382	08/30/2012 11:35 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Remapped CTFS QuadratID to subplot rather than subplotID, because it's only unique within the parent plot, not globally unique, in CTFS
4381	08/30/2012 11:23 AM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import. This now includes the core CTFS tables.
4380	08/30/2012 11:10 AM	Aaron Marcuse-Kubitza	Added inputs/VegBank/ with DB export
4379	08/30/2012 11:04 AM	Aaron Marcuse-Kubitza	input.Makefile: General targets: `%: %.make`: Don't always remake the target whenever it's visited, as other targets may depend on this file and it should not be remade whenever they are visited
4378	08/30/2012 11:00 AM	Aaron Marcuse-Kubitza	input.Makefile: General targets: `%: %.make`: Changed log file suffix to .log, because this log does not necessarily contain SQL statements
4377	08/30/2012 10:57 AM	Aaron Marcuse-Kubitza	input.Makefile: General targets: `%: %.make`: Time the creating command
4376	08/30/2012 10:55 AM	Aaron Marcuse-Kubitza	input.Makefile: General targets: Removed duplicate `%: %.make` rule
4375	08/30/2012 10:43 AM	Aaron Marcuse-Kubitza	inputs/CTFS/TaxonOccurrence/map.csv: Documented that InfraSpecificLevel is unused
4374	08/30/2012 10:42 AM	Aaron Marcuse-Kubitza	inputs/CTFS/TaxonOccurrence/map.csv: Documented that InfraSpecificLevel is unused
4373	08/30/2012 10:32 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped speciesInvID
4372	08/30/2012 10:27 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added speciesInvID
4371	08/30/2012 10:25 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped taxonOccurrenceID
4370	08/30/2012 10:22 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added taxonOccurrenceID
4369	08/30/2012 10:14 AM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added TaxonOccurrence/ and its joined tables
4368	08/30/2012 10:13 AM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added TaxonOccurrence/ and its joined tables
4367	08/30/2012 10:06 AM	Aaron Marcuse-Kubitza	inputs/CTFS/_archive/Organism.VegX/README.TXT: Added calculation of StemObservation rows distribution for each plot, which indicates that the bci plot actually contains 90% of the StemObservation rows. This brings the size inflation of VegX down to ~6x.
4366	08/30/2012 09:42 AM	Aaron Marcuse-Kubitza	inputs/CTFS/_archive/Organism.VegX/: Added README.TXT describing that this VegX export includes only one of 157 CTFS plots. This is important, because it indicates that VegX creates a ~1000x (!) increase in storage size (613.6 MB for bci.sql with 157 plots vs. 3.78 GB for VegX_CTFS_row_*.xml with 1 plot, assuming roughly equal #s of stems per plot).
4365	08/30/2012 09:08 AM	Aaron Marcuse-Kubitza	inputs/CTFS/StemObservation/map.csv: Remapped StemID to authorStemCode since it's only unique within the parent organism (Tree), not a globally unique ID as is required for stemID
4364	08/30/2012 09:05 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped authorStemCode
4363	08/30/2012 08:58 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added authorStemCode
4362	08/30/2012 08:58 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped stemID
4361	08/30/2012 08:52 AM	Aaron Marcuse-Kubitza	inputs/SALVIAS/2.stems/map.csv: Mapped stem_id
4360	08/30/2012 08:46 AM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Added steps to install any MySQL export
4359	08/30/2012 08:13 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped stemID
4358	08/30/2012 08:10 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped stem_id
4357	08/30/2012 08:05 AM	Aaron Marcuse-Kubitza	repl: Support treating all patterns as plain text (non-regexp)
4356	08/30/2012 07:52 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stem_id
4355	08/30/2012 07:51 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stemID
4354	08/30/2012 07:44 AM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped speciesName, subSpeciesName
4353	08/30/2012 07:43 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS taxonomic name columns
4352	08/30/2012 07:28 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Removed comments not applicable to the term itself
4351	08/30/2012 07:25 AM	Aaron Marcuse-Kubitza	Inputs with multiple tables: Added explicit import_order.txt files, so that sort orders can later be removed from the subdir names
4350	08/29/2012 11:17 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added StemObservation/ and tables it is joined from
4349	08/29/2012 11:09 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped stemTag
4348	08/29/2012 11:08 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added stemTag
4347	08/29/2012 11:04 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped DBH
4346	08/29/2012 11:02 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added DBH
4345	08/29/2012 10:58 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Added comment that you cannot make a subdir separately from the entire datasource dir
4344	08/29/2012 10:17 PM	Aaron Marcuse-Kubitza	inputs/CTFS/Plot/create.sql: Added newline at end of file
4343	08/29/2012 10:04 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Renamed Site.src to Plot.src to use a VegCSV name for the table
4342	08/29/2012 10:01 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data for each table: `make inputs/<datasrc>/<table>/add`: Added note explaining why you need to use this command instead of just creating an empty directory of the desired name
4341	08/29/2012 08:44 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added SubplotObservation/
4340	08/29/2012 08:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Redirect eventID, fieldNumber (authoreventcode) to parent locationevent when subplot columns exist
4339	08/29/2012 08:23 PM	Aaron Marcuse-Kubitza	inputs/CTFS/import_order.txt: Added PlotObservation
4338	08/29/2012 08:23 PM	Aaron Marcuse-Kubitza	inputs/CTFS/PlotObservation/: Remade (hadn't been automatically remade because it wasn't part of import_order.txt)
4337	08/29/2012 08:13 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Also redirect locationID/plotName to parent location if subplotID column was provided
4336	08/29/2012 08:08 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Use _first to remove specimens-related alternatives for this field from consideration when plots-related alternatives exist. This avoids unintentionally using specimens-related columns for this field in plots data.
4335	08/29/2012 08:06 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _first() simplifying function
4334	08/29/2012 08:05 PM	Aaron Marcuse-Kubitza	xml_func.py: Added helper functions variadic_args() and map_names()
4333	08/29/2012 07:38 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: location.authorlocationcode mappings: Placed inside "if subplot" _if statement along with sourceaccessioncode to reduce the number of separate _if statements needing a condition mapping
4332	08/29/2012 07:32 PM	Aaron Marcuse-Kubitza	xml_dom.py: NodeEntryIter: Support entries with multiple children
4331	08/29/2012 07:20 PM	Aaron Marcuse-Kubitza	xml_dom.py: replace(): Support a list of new nodes to replace the old node with
4330	08/29/2012 07:01 PM	Aaron Marcuse-Kubitza	xml_dom.py: Moved only_child() near related method has_one_child()
4329	08/29/2012 07:00 PM	Aaron Marcuse-Kubitza	xml_dom.py: only_child(): Raise exception instead of failing assertion. Include invalid node in exception message for easier debugging.
4328	08/29/2012 06:57 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added only_child() and use it where its definition was used
4327	08/29/2012 06:33 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Changed _merge to _join wherever the duplicate-eliminating functionality of _merge is not needed and a simple concatenation of non-NULL values is sufficient
4326	08/29/2012 06:24 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _join() simplifying function
4325	08/29/2012 06:22 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _join()
4324	08/29/2012 06:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Moved "if subplot" _if statement around /location/parent_id and /location/sourceaccessioncode themselves, so that only one _if cond mapping for subplot is needed. Note that this is only possible because this _if statement uses _exists, allowing it to be fully evaluated by the XML template simplifying mechanism, which supports subtrees as arguments to _if.
4323	08/29/2012 06:06 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed no longer used parentLocationID, parentPlotName (locationID and plotName now automatically map to the correct location). mappings/Veg+-VegCore.csv: Removed no longer used parentPlotID.
4322	08/29/2012 05:57 PM	Aaron Marcuse-Kubitza	xml_func.py: passthru(): Use xml_dom.prune() so that after empty children are removed, the node itself is also removed if it's empty. This enables further pruning of any node that contains the pruned node.
4321	08/29/2012 05:55 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added prune()
4320	08/29/2012 05:52 PM	Aaron Marcuse-Kubitza	xml_func.py: Removed no longer used prune() (use xml_dom.prune_children() instead)
4319	08/29/2012 05:51 PM	Aaron Marcuse-Kubitza	xml_func.py: Use new xml_dom.prune_children()
4318	08/29/2012 05:51 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added prune_empty() and prune_children()
4317	08/29/2012 05:29 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved VegX export subdir to _archive and renamed it to remove ".disabled" suffix and have a VegCSV-like name
4316	08/29/2012 05:24 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Renamed README.TXT to DFtemp.analysis_query.txt because it relates only to a particular query from Shash, and moved it to the _archive/ subdir
4315	08/29/2012 05:21 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved source files into new _src/ subdir to avoid cluttering up the main dir
4314	08/29/2012 05:16 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/_src/
4313	08/29/2012 05:02 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added non-data files that weren't under version control
4312	08/29/2012 04:59 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved _scripts_to_drop_extra_tables to _archive because they are for a different version of the CTFS database than the extract we received (bci.sql)
4311	08/29/2012 04:57 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved DBv5.txt to _archive because it's for a different version of the CTFS database than the extract we received (bci.sql)
4310	08/29/2012 04:49 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Moved CTFS_conversion_bci.php to _archive since it's just for the DFtemp (aggregated) mapping
4309	08/29/2012 04:48 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/_archive
4308	08/29/2012 04:39 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4307	08/28/2012 07:56 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/PlotObservation/
4306	08/28/2012 07:54 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: fieldNumber (authoreventcode): Don't copy to location.authorlocationcode if an actual locationID was specified
4305	08/28/2012 07:51 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Removed no longer needed pass-through optimizations for XML functions, which are now handled by each function's own simplifying function
4304	08/28/2012 07:50 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _name simplifying function
4303	08/28/2012 07:48 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _alt, _merge simplifying functions
4302	08/28/2012 07:45 PM	Aaron Marcuse-Kubitza	xml_func.py: passthru(): First prune the node
4301	08/28/2012 07:43 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Use new passthru()
4300	08/28/2012 07:43 PM	Aaron Marcuse-Kubitza	xml_func.py: Added passthru()
4299	08/28/2012 07:36 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Use new prune()
4298	08/28/2012 07:36 PM	Aaron Marcuse-Kubitza	xml_func.py: Added prune()
4297	08/28/2012 07:26 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped eventID
4296	08/28/2012 07:24 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS Census terms
4295	08/28/2012 07:20 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS Census terms
4294	08/28/2012 07:17 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Changed plotEventStartDate, plotEventEndDate to startDate, endDate because a date range always applies to the event
4293	08/28/2012 07:13 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added startDate, endDate
