/ - Changes - BIEN 3 - NCEAS Projects

root @ 4289

#	Date	Author	Comment
4289	08/28/2012 06:44 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: Added import_order.txt
4288	08/28/2012 06:40 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/Subplot/
4287	08/28/2012 06:36 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS QuadratID
4286	08/28/2012 06:26 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped subplotID
4285	08/28/2012 06:24 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added subplotID
4284	08/28/2012 06:22 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS Quadrat columns
4283	08/28/2012 06:18 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped subplotX, subplotY
4282	08/28/2012 06:14 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed empty mappings for unmapped DwC terms because these terms are now listed and maintained in mappings/Veg+.terms.csv
4281	08/28/2012 06:12 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added Brad's descriptive comments for several VegCore terms
4280	08/28/2012 06:07 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added subplotX, subplotY
4279	08/28/2012 06:03 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Made organismX, organismY the official VegCore terms and mapped relativePlotX, relativePlotY to them in mappings/Veg+-VegCore.csv
4278	08/28/2012 06:00 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added organismX, organismY as clearer alternatives to relativePlotX, relativePlotY
4277	08/28/2012 05:48 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS Quadrat columns
4276	08/28/2012 05:38 PM	Aaron Marcuse-Kubitza	Added inputs/CTFS/
4275	08/28/2012 05:36 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Only run column-based tests if column-based mode enabled, because these tests are much slower than the row-based tests for small numbers of rows. Note that this involves explicitly turning off column-based mode in the row-based test, to prevent propagation of the by_col env var which both enables these extra tests and sets bin/map to run in column-based mode.
4274	08/28/2012 05:28 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Added by-column test, which is compared to the row-based test's accepted output
4273	08/28/2012 05:20 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Merged $(runTest) and $(test2Db) because all tests go to the database
4272	08/28/2012 05:19 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Moved `$(foreach use_staged,1,...)` from $(test2Db) to $(runTest) because all tests now use the staging tables
4271	08/28/2012 05:15 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Merged $(test2Db) and $(testStaged2Db) because all tests now use the staging tables
4270	08/28/2012 05:14 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: $(runTest): Always use $(map2db) because there are no tests that use other programs (and there haven't been in a while)
4269	08/28/2012 05:09 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: Run the core test from the staging table, because derived tables only have a staging table and the flat-file test would produce inconsistent results
4268	08/28/2012 05:00 PM	Aaron Marcuse-Kubitza	mappings/Makefile: Fixed bug where rules needed to generate Veg+.self.csv ($(viaSelfMap)) were still using a pattern match that required a table (`.%.`, `.*.`), even though we are no longer using separate maps for separate tables
4267	08/28/2012 04:44 PM	Aaron Marcuse-Kubitza	mappings/Veg+-VegCore.csv: Mapped CTFS Country and Site columns
4266	08/28/2012 04:25 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added CTFS Country and Site columns
4265	08/28/2012 04:14 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data: svn adding the generated map spreadsheets and related files: Added header.csv to the list of files added (for derived tables)
4264	08/28/2012 04:07 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data: Documented how to create tables that will be joined together with another table, and how to create tables that are joins of other tables
4263	08/28/2012 04:01 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Also create header.csv so that there is a CSV header that the map spreadsheets can be autogenerated from
4262	08/28/2012 02:22 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Add row_num column to derived staging tables so they will have a pkey
4261	08/28/2012 02:21 PM	Aaron Marcuse-Kubitza	sql.py: pkey(): Use pkey_col constant if this column exists, to allow using a row_num column as the pkey even when it is placed at the end of the table (due to being added after the table was created)
4260	08/28/2012 01:59 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Support alternative generation of a staging table by joining together other staging tables in a create.sql file
4259	08/28/2012 01:57 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: %/install: Don't create a row_num column when the table is a joined table because it collides during joins
4258	08/28/2012 01:49 PM	Aaron Marcuse-Kubitza	csv2db: Made input_cmd optional when errors_table_only is on, because the CSV header is not needed to create the errors table
4257	08/28/2012 01:47 PM	Aaron Marcuse-Kubitza	csv2db: Added has_row_num param to disable creating a row_num column
4256	08/28/2012 12:44 PM	Aaron Marcuse-Kubitza	input.Makefile: Existing maps discovery: $(allTables): When prepending unsorted (joined) tables, save them in $(joinedTables) for later use in determining which tables should have a row_num column
4255	08/28/2012 12:27 PM	Aaron Marcuse-Kubitza	README.TXT: Fixed indent
4254	08/28/2012 12:04 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables installation: Install all tables, not just those present in import_order.txt. This will later allow staging tables to be derived by joining together other staging tables, which themselves are not imported but still need to be installed.
4253	08/28/2012 11:53 AM	Aaron Marcuse-Kubitza	input.Makefile: Existing maps discovery: $(tables): Prepend unsorted tables (those that are not present in import_order.txt)
4252	08/28/2012 11:04 AM	Aaron Marcuse-Kubitza	input.Makefile: Renamed "...-%" targets to "%/..." so they are more logically associated with a specific subdir
4251	08/28/2012 10:54 AM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added Madidi terms that don't exist in other datasources
4250	08/28/2012 10:47 AM	Aaron Marcuse-Kubitza	inputs/Madidi/0.plots/map.csv: Added [Veg+] to root to enable auto-mapping
4249	08/28/2012 10:35 AM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4248	08/27/2012 10:47 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/1.organisms/map.csv: Map directly to locationID, plotName instead of parentLocationID, parentPlotName because these terms now map correctly to the parent location when a subplot column exists
4247	08/27/2012 10:43 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: plotName -> /location/authorlocationcode mapping: When subplot is provided, remove this mapping using _if ... _exists instead of _alt so that a NULL subplot value will not cause the parent plot's name to be used for the subplot name
4246	08/27/2012 10:34 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: $(runTest): Remove outputs of successful tests to reduce clutter
4245	08/27/2012 10:32 PM	Aaron Marcuse-Kubitza	input.Makefile: Testing: %/test.staging.xml: Don't create test.staging.xml at all for non-flat-file inputs, because it is not needed (diff does not run in this case)
4244	08/27/2012 10:23 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Fixed bug where "if subplot" conditions would evaluate to true only if the subplot was NOT NULL, when they should actually evaluate to true if the datasource specified any subplot column, nullable or not
4243	08/27/2012 10:14 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Removed no longer needed hardcoded _if simplifying code now that there is an _if() simplifying function
4242	08/27/2012 10:10 PM	Aaron Marcuse-Kubitza	db_xml.py: input_col_prefix: Use value of xml_func.var_name_prefix, which is now the place where this value is configured
4241	08/27/2012 10:09 PM	Aaron Marcuse-Kubitza	db_xml.py: Moved input_col_prefix above the put() function that uses it
4240	08/27/2012 10:09 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _if() simplifying function
4239	08/27/2012 10:07 PM	Aaron Marcuse-Kubitza	xml_func.py: Added is_var_name() and is_var()
4238	08/27/2012 10:06 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added NodeEntryIter
4237	08/27/2012 09:33 PM	Aaron Marcuse-Kubitza	xml_func.py: Added _exists()
4236	08/27/2012 09:30 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Added support for custom simplifying functions, which are not hard-coded in simplify()
4235	08/27/2012 09:19 PM	Aaron Marcuse-Kubitza	xml_dom.py: replace_with_text(): Use new bool2str() so that False causes the node to be removed instead of replaced with the empty string
4234	08/27/2012 09:18 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added bool2str()
4233	08/27/2012 08:56 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/1.organisms/map.csv: Mapped subplot, Line to new subplot VegCore term
4232	08/27/2012 08:54 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped subplot, which involved replacing an _if with _alt to both remove plotName as the authorlocationcode and use subplot instead when subplot is specified
4231	08/27/2012 08:47 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: locationID, plotName: Redirect to /location/parent_id/location/* if subplot field is specified
4230	08/27/2012 08:42 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Also remove _if statements with only a condition. This is a required transformation, because such _if statements can't be handled by functions._if() due to there being no argument to provide the anyelement type.
4229	08/27/2012 08:06 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Added pruning optimization that removes empty children. Empty children are created when some mappings don't apply to the current datasource.
4228	08/27/2012 07:58 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Only generate children list if node is a function
4227	08/27/2012 07:33 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Refactored to support processing nodes that are not functions. Changed var names for clarity.
4226	08/27/2012 06:55 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: _simplifyPath() calls: Removed no longer needed `require` arg, and removed no longer needed table suffix from `next` arg
4225	08/27/2012 06:51 PM	Aaron Marcuse-Kubitza	db_xml.py: put(): _simplifyPath() built-in function: Removed `require` param, which is not used by this _simplifyPath() implementation because the database constraints handle this
4224	08/27/2012 05:56 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added subplot
4223	08/27/2012 05:30 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Also add empty import_order.txt
4222	08/27/2012 05:30 PM	Aaron Marcuse-Kubitza	lib/common.Makefile: SVN: Added $(addFile)
4221	08/27/2012 05:26 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: add: Don't automatically add a Specimen subdir, because some plots datasources don't have that table
4220	08/27/2012 05:23 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data: Added step to add <table> to inputs/<datasrc>/import_order.txt
4219	08/27/2012 04:48 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Changed "<name>" to "<datasrc>" to distinguish it more clearly from "<table>", which is also a name
4218	08/27/2012 04:45 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Adding input data: Changed steps to use new %/add command to add table's subdir
4217	08/27/2012 04:36 PM	Aaron Marcuse-Kubitza	input.Makefile: SVN: Added %/add to add a new table subdir. add: Changed default subdir name to Specimen to match suggested table names at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV#Suggested-table-names>. Use new %/add to add it.
4216	08/27/2012 04:18 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4215	08/24/2012 07:56 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Replaced fixed table names with link to VegCSV suggested table names
4214	08/24/2012 07:43 PM	Aaron Marcuse-Kubitza	input.Makefile: $(srcsOnly): Include only files ending in one of the data extensions: csv tsv txt xml. This allows the data provider to include other documentation files, such as SQL export queries, in the table subdirs.
4213	08/24/2012 07:24 PM	Aaron Marcuse-Kubitza	bin/map: Documented that it is duplicate-column safe (supports multiple columns of the same name)
4212	08/24/2012 07:10 PM	Aaron Marcuse-Kubitza	README.TXT: Datasource setup: Obtaining CSVs: Documented that when exporting relational databases to CSVs, you MUST ensure that embedded quotes are escaped by doubling them, not by preceding them with a "\" as is the default in phpMyAdmin
4211	08/24/2012 07:00 PM	Aaron Marcuse-Kubitza	csvs.py: delims: Added ";", which is phpMyAdmin's default CSV delimiter
4210	08/24/2012 06:50 PM	Aaron Marcuse-Kubitza	sql_io.py: null_strs: Added 'NULL', which is used by phpMyAdmin as the default "Replace NULL with" value for CSV exports
4209	08/24/2012 06:48 PM	Aaron Marcuse-Kubitza	sql_io.py: cleanup_table(): Refactored to use for loop with array constant, so that additional NULL-equivalent strings can easily be added
4208	08/24/2012 06:30 PM	Aaron Marcuse-Kubitza	mappings/roots/: Merged roots for different tables into one mappings/root.sh for Veg+, which handles all tables' mappings to VegBIEN
4207	08/24/2012 04:31 PM	Aaron Marcuse-Kubitza	sql_io.py: put_table(): When ignoring all rows for an iteration, return literal NULL value instead of column of NULLs as an optimization for callers using that iteration's pkeys
4206	08/24/2012 12:20 PM	Aaron Marcuse-Kubitza	inputs/import.stats.xls: Updated with stats from latest import
4205	08/23/2012 05:32 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Primary taxondetermination: Removed [role=identifier] because the role of the entity making the determination is unknown. Added [!isoriginal] filter to those mappings to ensure that primary taxondetermination XPaths map to a different taxondetermination than the [isoriginal=true] determination when both are present.
4204	08/23/2012 05:24 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/1.organisms/map.csv: Remapped cfaff to identificationQualifier, because it was previously mapped to the same taxondetermination as the Orig terms but does not have a corresponding Orig prefix to indicate that it should apply to the original determination instead of the primary TNRS one
4203	08/23/2012 05:19 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Removed no longer used computer.* taxonomic terms
4202	08/23/2012 05:19 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Removed no longer used computer.* taxonomic terms
4201	08/23/2012 05:18 PM	Aaron Marcuse-Kubitza	inputs: Regenerated VegBIEN.csv for several datasources, which had apparently not gotten regenerated when make was run after the taxonRank mapping addition
4200	08/23/2012 05:00 PM	Aaron Marcuse-Kubitza	backups/: svn:ignore: Also ignore .*, which includes temp files generated by rsync
4199	08/23/2012 04:58 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Also consider _name() to be an aggregate function
4198	08/23/2012 04:57 PM	Aaron Marcuse-Kubitza	xml_func.py: simplify(): Also consider _name() to be an aggregate function
4197	08/23/2012 04:49 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/1.organisms/map.csv: Removed computer. prefix from primary (TNRS) taxondetermination, so it would map to the main taxondetermination in VegBIEN
4196	08/23/2012 04:46 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped taxonRank analogously to computer.taxonRank
4195	08/23/2012 04:34 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS/1.organisms/map.csv: Remapped OrigFamily/OrigGenus/OrigSpecies to new verbatim taxonomic names. Also remapped cfaff to verbatimIdentificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms, but this will later need to be remapped to identificationQualifier (not in this commit because that is a separate change). Note that the switch to the verbatim* taxonomic names removes a concatenated binomial that was part of the previous mappings, which put OrigGenus and OrigSpecies together into one scientificName.
4194	08/23/2012 03:34 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped verbatimScientificName to taxonoccurrence.authortaxoncode as an alternative to scientificName
4193	08/23/2012 03:12 PM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: Mapped verbatim* taxonomic terms
4192	08/23/2012 03:10 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added verbatimIdentificationQualifier
4191	08/23/2012 03:07 PM	Aaron Marcuse-Kubitza	mappings/Veg+.terms.csv: Added verbatimScientificName
4190	08/23/2012 03:06 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination: taxondetermination_unique unique index: Added isoriginal so an "original" determination in the same row (as found in SALVIAS) will be seen as distinct from the scrubbed determination, even if they are to the same plant name
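
Revisions 4210 through 4212 concern phpMyAdmin CSV exports: ";" as a delimiter, 'NULL' as a NULL-equivalent string, and the requirement that embedded quotes be escaped by doubling rather than with a "\". The sketch below illustrates those three points with Python's standard csv module. It is not the project's code: the contents of csvs.py and sql_io.py are not shown in this log, so NULL_STRS, read_rows() and the sample rows are hypothetical names and data chosen for illustration.

```python
import csv
import io

# Hypothetical sample exports; not taken from any BIEN datasource.
# Embedded quotes escaped by doubling (the form the README requires).
doubled = 'id;note\n1;"a ""quoted"" word"\n2;NULL\n'
# Embedded quotes escaped with a backslash (described as phpMyAdmin's default).
backslashed = 'id;note\n1;"a \\"quoted\\" word"\n2;NULL\n'

# Assumed stand-in for the NULL-equivalent strings; the real list lives in
# sql_io.py and is not reproduced in this log.
NULL_STRS = {'', 'NULL'}


def read_rows(text, delimiter=';'):
    """Parse CSV text and map NULL-equivalent strings to None."""
    # csv.reader treats a doubled quote inside a quoted field as one literal
    # quote by default (doublequote=True) and has no escape character set.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [[None if value in NULL_STRS else value for value in row]
            for row in reader]


rows = read_rows(doubled)
print(rows[1][1])  # the note field with its embedded quotes restored
print(rows[2][1])  # the 'NULL' string mapped to None

# With the backslash form, the default parser keeps the backslashes as
# literal characters, so the original value is not recovered.
garbled = read_rows(backslashed)[1][1]
print(garbled == 'a "quoted" word')
```

Under these assumptions the doubled-quote export parses back to the original value, while the backslash-escaped export does not, which is consistent with the README note added in revision 4212.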
