/ - Changes - BIEN 3 - NCEAS Projects

root @ 1809

#	Date	Author	Comment
1809	04/09/2012 12:40 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added clean_comment() and mk_comment() to properly sanitize comment contents (comments can't contain '--')
1808	04/09/2012 12:14 PM	Aaron Marcuse-Kubitza	Added inputs/TRTE
1807	04/03/2012 08:26 PM	Aaron Marcuse-Kubitza	inputs/QMOR/test: Added initial accepted test outputs
1806	04/03/2012 08:26 PM	Aaron Marcuse-Kubitza	inputs/QMOR/maps: Added maps
1805	04/03/2012 08:20 PM	Aaron Marcuse-Kubitza	Added inputs/QMOR
1804	04/03/2012 08:14 PM	Aaron Marcuse-Kubitza	inputs/MT/test: Added initial accepted test outputs
1803	04/03/2012 08:14 PM	Aaron Marcuse-Kubitza	inputs/MT/maps: Added maps
1802	04/03/2012 08:13 PM	Aaron Marcuse-Kubitza	mappings/Makefile: DwC-VegBIEN.specimens.csv: Don't call remove_empty to produce it, because join now deals with empty mappings correctly by still raising a warning. Removed no longer needed intermediate DwC.ci-VegBIEN.specimens.csv.
1801	04/03/2012 08:09 PM	Aaron Marcuse-Kubitza	join: Also print "No join mapping" warning if a join mapping was found but it was empty. The warning in that case is actually "No non-empty join mapping" to distinguish it from a mapping that's missing entirely. input.Makefile: missing_mappings: Support new "No join mapping" error message.
1800	04/03/2012 08:08 PM	Aaron Marcuse-Kubitza	join: Also print "No join mapping" warning if a join mapping was found but it was empty. The warning in that case is actually "No non-empty join mapping" to distinguish it from a mapping that's missing entirely. input.Makefile: missing_mappings: Support new "No join mapping" error message.
1799	04/03/2012 07:33 PM	Aaron Marcuse-Kubitza	Added inputs/MT
1798	04/03/2012 07:26 PM	Aaron Marcuse-Kubitza	Added disown_all to disown all running jobs
1797	04/03/2012 07:26 PM	Aaron Marcuse-Kubitza	stop_imports: Call jobspecs relative to $selfDir, rather than assuming it will be run from the svn root dir
1796	04/03/2012 07:18 PM	Aaron Marcuse-Kubitza	union: Call maps.merge_headers() using **dict(prefer=header_num) instead of just prefer=header_num in order to work on Python 2.5.2 (which nimoy is running)
1795	04/03/2012 07:00 PM	Aaron Marcuse-Kubitza	inputs/ACAD/test: Accepted initial test outputs
1794	04/03/2012 07:00 PM	Aaron Marcuse-Kubitza	Added inputs/ACAD/maps/ maps
1793	04/03/2012 06:59 PM	Aaron Marcuse-Kubitza	Accepted new test outputs resulting from the addition of the id -> occurrenceID mapping in mappings/DwC1-DwC2.specimens.csv
1792	04/03/2012 06:57 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS*/maps: Cleaned up maps for the first time since all via maps became subject to cleanup
1791	04/03/2012 06:55 PM	Aaron Marcuse-Kubitza	input.Makefile: Removed no longer needed default "maps/.$(via).%.csv.last_cleanup" rule
1790	04/03/2012 06:54 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Via maps cleanup: Added `env ignore=1` since with the switch to subtracting $(coreMap), all inputs will attempt to subtract some map, even if it's not subtractable
1789	04/03/2012 06:47 PM	Aaron Marcuse-Kubitza	input.Makefile: Don't clean src maps, only build them
1788	04/03/2012 06:45 PM	Aaron Marcuse-Kubitza	inputs/ARIZ/maps/DwC.specimens.csv: Re-cleaned up to take advantage of additional entries now removed by subtract
1787	04/03/2012 06:36 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Via maps cleanup: Subtract $(coreMap) instead of $(coreSelfMap) so that entries whose input and output map to the same place are subtracted as well
1786	04/03/2012 06:35 PM	Aaron Marcuse-Kubitza	subtract: Also remove mappings whose input and output map to the same non-empty value in map_1
1785	04/03/2012 06:32 PM	Aaron Marcuse-Kubitza	util.py: Added all_equal(), all_equal_ignore_none(), have_same_value()
1784	04/03/2012 05:45 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Added id -> occurrenceID mapping
1783	04/03/2012 05:43 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS-CSV/maps/VegX.%.full.csv: Regenerated using new src maps
1782	04/03/2012 05:41 PM	Aaron Marcuse-Kubitza	mappings/DwC1-DwC2.specimens.csv: Added mappings from dcterms elements without namespace to with namespace
1781	04/03/2012 05:40 PM	Aaron Marcuse-Kubitza	inputs/SALVIAS-CSV: Built maps/src.%.csv
1780	04/03/2012 05:24 PM	Aaron Marcuse-Kubitza	Added inputs/ACAD/maps/src.specimens.csv
1779	04/03/2012 05:23 PM	Aaron Marcuse-Kubitza	input.Makefile: Maps building: Autogen src maps with known table names. Sources: $(withCatSrcs): Fixed bug where substitution pattern did not contain %.
1778	04/03/2012 05:22 PM	Aaron Marcuse-Kubitza	Added src_map to make a source map spreadsheet from a CSV header
1777	04/03/2012 04:32 PM	Aaron Marcuse-Kubitza	input.Makefile: Split Maps section into "Existing maps discovery" and "Maps building" sections. Sources: Added cat, cat-% to cat out sources.
1776	04/03/2012 04:17 PM	Aaron Marcuse-Kubitza	input.Makefile: Factored out sources-related code to new Sources section
1775	04/03/2012 04:08 PM	Aaron Marcuse-Kubitza	input.Makefile: $(srcMaps): Removed `$(filter-out maps/src.join.%.csv,...)` because maps/src.join.%.csv are no longer created
1774	04/03/2012 03:47 PM	Aaron Marcuse-Kubitza	README.TXT: Schema changes: Split updating graphical ERD exports into separate section. Update graphical ERD exports: Added schemas/vegbien.ERD.core.pdf .
1773	04/03/2012 03:42 PM	Aaron Marcuse-Kubitza	README.TXT: Added Datasource setup section with instructions to add a new datasource
1772	04/03/2012 03:38 PM	Aaron Marcuse-Kubitza	Added inputs/ACAD
1771	04/03/2012 03:37 PM	Aaron Marcuse-Kubitza	input.Makefile: Only setSvnIgnore the input dir, since it already exists and doesn't need to be added (inputs/Makefile adds it)
1770	04/03/2012 03:23 PM	Aaron Marcuse-Kubitza	inputs/*/maps/DwC.specimens.csv: Removed extraneous XML meta info from DwC column root, since it now just needs to be present in the core via map mappings/DwC-VegBIEN.specimens.csv
1769	04/03/2012 03:22 PM	Aaron Marcuse-Kubitza	union: Use new maps.merge_headers() to write properly combined header
1768	04/03/2012 03:21 PM	Aaron Marcuse-Kubitza	maps.py: join_combinable(): Fixed roots_combinable() to run on col names instead of roots, which were passed in. merge_mappings(): Factored out mapping column combining into merge_mapping_cols(), which handles an optional prefer param as well to take the header_num env var. Added merge_headers().
1767	04/03/2012 03:17 PM	Aaron Marcuse-Kubitza	util.py: Added sort_by_len(), shortest(), longest()
1766	04/03/2012 02:12 PM	Aaron Marcuse-Kubitza	join: Use new maps.join_combinable() to check if column names match
1765	04/03/2012 02:11 PM	Aaron Marcuse-Kubitza	maps.py: Added cols_combinable() and use it in combinable(). Added join_combinable() and associated helper functions. Added documentation labels to each section.
1764	04/03/2012 01:13 PM	Aaron Marcuse-Kubitza	xml_parse.py: ConsecXmlInputStream: Removed read() because that's now defined in streams.FilterStream
1763	04/03/2012 01:11 PM	Aaron Marcuse-Kubitza	xml_parse.py: parse_next(): Strip control characters from input stream because they mess up the parser
1762	04/03/2012 01:10 PM	Aaron Marcuse-Kubitza	streams.py: FilterStream: Forward all reads to readline()
1761	04/03/2012 01:08 PM	Aaron Marcuse-Kubitza	strings.py: Added is_ctrl() and strip_ctrl()
1760	04/03/2012 08:34 AM	Aaron Marcuse-Kubitza	xml_parse.py: parse_next(): On parser error, advance to next XML document since the rest of the current document is corrupted
1759	04/03/2012 08:33 AM	Aaron Marcuse-Kubitza	streams.py: Added consume(). Added documentation labels to each section.
1758	04/03/2012 08:23 AM	Aaron Marcuse-Kubitza	bin/map: For XML inputs, wrap sys.stdin in a LineCountStream and use new xml_parse.docs_iter() on_error() to add input line # to XML parsing exceptions
1757	04/03/2012 08:21 AM	Aaron Marcuse-Kubitza	xml_parse.py: Added on_error() handler to parse_next() (passed through by docs_iter()), so that the caller can add useful info like the input line # to the exception message, and decide whether to suppress the exception rather than re-raise it
1756	04/03/2012 07:19 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field identificationLabel2 to identificationLabel. Distinguish what are now two identificationLabel fields of the same name by tagging each one with [@id=2] or [@id=1]. inputs/SALVIAS-CSV/maps/VegX.organisms.csv: Merge tag1/stem_tag1 and tag2/stem_tag2 using _alt, since they are never set to different values when both are not NULL (although sometimes just one or just the other is not NULL).
1755	04/02/2012 05:37 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field tag2 to identificationLabel2 to reflect that it will become a second instance of identificationLabel
1754	04/02/2012 05:31 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field lineCover to already existing volumeCanopy
1753	04/02/2012 05:29 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field cover to already existing attribute.coverPercent
1752	04/02/2012 05:13 PM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field count to already existing aggregateOrganismObservation.aggregateValue
1751	04/02/2012 04:44 PM	Aaron Marcuse-Kubitza	vegbien.ERD.mwb: Fixed lines
1750	04/02/2012 01:50 PM	Aaron Marcuse-Kubitza	README.TXT: Documented that `make reinstall_db` will delete your VegBIEN DB
1749	04/02/2012 01:48 PM	Aaron Marcuse-Kubitza	README.TXT: Documented that `make empty_db` will delete your VegBIEN DB
1748	04/02/2012 01:44 PM	Aaron Marcuse-Kubitza	root Makefile: empty_db: Confirm deletion just like for rm_db. rm_db: put $(confirmRmDb) on a separate line and move the $(error) call to the main $(confirm) macro since you always want to abort make if the user cancels (not just not run that command).
1747	04/02/2012 01:34 PM	Aaron Marcuse-Kubitza	root Makefile: rm_db: If user cancels, abort in case target was reinstall_db to prevent installing
1746	04/02/2012 01:28 PM	Aaron Marcuse-Kubitza	root Makefile: core, rm_core: Fixed bug where no longer existing prerequisites postgres_user, rm_postgres_user were not removed
1745	04/02/2012 01:25 PM	Aaron Marcuse-Kubitza	root Makefile: rm_db: Confirm deletion with user. Merged postgres_user, rm_postgres_user into db, rm_db so that deletion confirmation applies to user deletion as well (which would indirectly cause the DB to be deleted).
1744	04/02/2012 01:04 PM	Aaron Marcuse-Kubitza	README.TXT: Testing: Updated to add missing mappings
1743	04/02/2012 01:03 PM	Aaron Marcuse-Kubitza	root Makefile: test-all: Added missing_mappings
1742	04/02/2012 01:00 PM	Aaron Marcuse-Kubitza	Moved maps validation targets from main Makefile to input.Makefile. main Makefile: maps validation: Summarize the output of the inputs' maps validations.
1741	04/02/2012 12:22 PM	Aaron Marcuse-Kubitza	Makefile: Also find missing input mappings, in addition to missing join mappings
1740	04/02/2012 12:21 PM	Aaron Marcuse-Kubitza	join: Also produce warnings for no input mapping (if no comment explaining why no input mapping), in addition to no join mapping
1739	04/02/2012 12:21 PM	Aaron Marcuse-Kubitza	join: Also produce warnings for no input mapping (if no comment explaining why no input mapping), in addition to no join mapping
1738	04/02/2012 12:20 PM	Aaron Marcuse-Kubitza	inputs/NY/maps/DwC.specimens.csv: Documented why there is no input mapping for key
1737	04/02/2012 11:29 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined fields stem* to remove the stem* prefix to be consistent with VegBIEN
1736	04/02/2012 11:23 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation/plotObservation user-defined fields sourceaccessioncode to sourceAccessionCode to be consistent with VegX case sensitivity
1735	04/02/2012 11:19 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field interceptCm to lineCover to be consistent with VegBIEN
1734	04/02/2012 11:18 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field individualCode to authorPlantCode to be consistent with VegBIEN
1733	04/02/2012 11:17 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field htFirstBranchM to heightFirstBranch to be consistent with VegBIEN
1732	04/02/2012 11:15 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field coverPercent to cover to be consistent with VegBIEN
1731	04/02/2012 11:12 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field siltPercent to silt to be consistent with VegBIEN
1730	04/02/2012 11:11 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field sandPercent to sand to be consistent with VegBIEN
1729	04/02/2012 11:10 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field pottasium to potassium to be consistent with VegBIEN
1728	04/02/2012 11:08 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field organicPercent to organic to be consistent with VegBIEN
1727	04/02/2012 11:07 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field clayPercent to clay to be consistent with VegBIEN
1726	04/02/2012 11:06 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed abioticObservation user-defined field cationCap to cationExchangeCapacity to be consistent with VegBIEN
1725	04/02/2012 11:02 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Renamed plotObservation user-defined field precipMm to precipitation to be consistent with VegBIEN
1724	04/02/2012 10:56 AM	Aaron Marcuse-Kubitza	VegX-VegBIEN.organisms.csv: Changed plotObservation user-defined field plotMethodology to /simpleUserdefined[name=method]/*ID/method/name
1723	04/02/2012 09:47 AM	Aaron Marcuse-Kubitza	schemas/postgresql.nimoy.conf: Increased default_statistics_target to 8.4 default value to improve query execution plans
1722	04/02/2012 09:43 AM	Aaron Marcuse-Kubitza	Added schemas/postgresql.Mac.conf (for tuning developers' local testing DBs)
1721	04/02/2012 09:42 AM	Aaron Marcuse-Kubitza	schemas/postgresql*.conf: Increased checkpoint_segments and checkpoint_completion_target so that checkpoints (performance intensive) are written less often and load-balanced better
1720	04/02/2012 08:35 AM	Aaron Marcuse-Kubitza	xml_dom.py: Don't print whitespace from parsed XML document when pretty-printing XML. minidom modifications section: Added subsection labels for the class each modification applies to.
1719	04/02/2012 08:20 AM	Aaron Marcuse-Kubitza	Parser.py: Renamed SyntaxException to SyntaxError because it's an unexpected condition that should exit the program, a.k.a. an error
1718	04/02/2012 08:05 AM	Aaron Marcuse-Kubitza	bin/map: process_rows(): When iterating over each row, only retrieve the next row if the end (limit of # of rows) has not been reached. This prevents the next row from being fetched, possibly causing an entire additional consecutive XML document to be parsed, if the limit has already been reached. This is primarily useful for XML inputs with a ".0.top" segment prepended before the other documents, which contains just the first two nodes for fast parsing of this smaller XML document when only the first two nodes are needed for testing. Without this fix, the ".0.top" segment would have needed to contain the first three nodes instead.
1717	04/02/2012 07:55 AM	Aaron Marcuse-Kubitza	inputs/XAL: Accepted initial test outputs
1716	04/02/2012 07:54 AM	Aaron Marcuse-Kubitza	inputs/XAL: Added maps
1715	04/02/2012 07:52 AM	Aaron Marcuse-Kubitza	bin/map: Extended consecutive XML document support to direct-XML inputs (without a map spreadsheet). Factored out consecutive XML document row-iteration code into helper method get_rows() which does the iters.flatten() and itertools.imap() calls.
1714	04/02/2012 07:37 AM	Aaron Marcuse-Kubitza	bin/map: Fixed bug in iteration over consecutive XML documents where only the first element of the first document was processed. Use of iters.flatten() and itertools.imap() fixes this problem so that the consecutive XML documents are regarded as a continuous stream of rows.
1713	04/02/2012 07:16 AM	Aaron Marcuse-Kubitza	bin/map: Use new xml_parse.docs_iter() to iterate over each consecutive XML document in stdin
1712	04/02/2012 07:16 AM	Aaron Marcuse-Kubitza	xml_parse.py: Added support for parsing consecutive XML documents in a stream
1711	04/02/2012 07:01 AM	Aaron Marcuse-Kubitza	Added iters.py
1710	03/29/2012 10:33 PM	Aaron Marcuse-Kubitza	streams.py: Added FilterStream. Changed TracedStream to use FilterStream.
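
Revision 1809 notes that XML comments can't contain '--'. The code of xml_dom.py is not shown in this log, so the following is only a minimal sketch of how such sanitizing might look. The function names come from the log entry, but the signatures, the choice to collapse hyphen runs to a single hyphen, and the trailing-space handling are all assumptions, not the project's actual implementation.

```python
from xml.dom import minidom

def clean_comment(text):
    """Return text made safe for use as an XML comment body (assumed behavior)."""
    # Collapse every run of two or more hyphens down to a single hyphen,
    # since '--' is not allowed inside an XML comment.
    while '--' in text:
        text = text.replace('--', '-')
    # A comment body also may not end in '-', because the closing
    # delimiter would then read '--->'. Pad with a space in that case.
    if text.endswith('-'):
        text += ' '
    return text

def mk_comment(doc, text):
    """Create a comment node on doc with sanitized contents (assumed signature)."""
    return doc.createComment(clean_comment(text))
```

With this sketch, `mk_comment(minidom.Document(), 'see --flag')` yields a comment node whose data is `see -flag`, which minidom can serialize without error.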
