# Date Author Comment
4200 08/23/2012 05:00 PM Aaron Marcuse-Kubitza

backups/: svn:ignore: Also ignore .*, which includes temp files generated by rsync

4199 08/23/2012 04:58 PM Aaron Marcuse-Kubitza

xml_func.py: simplify(): Also consider _name() to be an aggregate function

4198 08/23/2012 04:57 PM Aaron Marcuse-Kubitza

xml_func.py: simplify(): Also consider _name() to be an aggregate function

4197 08/23/2012 04:49 PM Aaron Marcuse-Kubitza

inputs/SALVIAS*/1.organisms/map.csv: Removed computer.* prefix from primary (TNRS) taxondetermination, so it would map to the main taxondetermination in VegBIEN

4196 08/23/2012 04:46 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped taxonRank analogously to computer.taxonRank

4195 08/23/2012 04:34 PM Aaron Marcuse-Kubitza

inputs/SALVIAS*/1.organisms/map.csv: Remapped OrigFamily/OrigGenus/OrigSpecies to new verbatim* taxonomic names. Also remapped cfaff to verbatimIdentificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms, but this will later need to be remapped to identificationQualifier (not in this commit because that is a separate change). Note that the switch to the verbatim* taxonomic names removes a concatenated binomial that was part of the previous mappings, which put OrigGenus and OrigSpecies together into one scientificName.

4194 08/23/2012 03:34 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped verbatimScientificName to taxonoccurrence.authortaxoncode as an alternative to scientificName

4193 08/23/2012 03:12 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Mapped verbatim* taxonomic terms

4192 08/23/2012 03:10 PM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: Added verbatimIdentificationQualifier

4191 08/23/2012 03:07 PM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: Added verbatimScientificName

4190 08/23/2012 03:06 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination: taxondetermination_unique unique index: Added isoriginal so an "original" determination in the same row (as found in SALVIAS) will be seen as distinct from the scrubbed determination, even if they are to the same plant name

4189 08/23/2012 02:57 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: taxonomic terms: Removed ":[isoriginal=true]" because there may be multiple determinations for an organism (either in separate rows or, for SALVIAS, in separate columns), and not all will be the original determination

4188 08/23/2012 02:43 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination.role: Default to 'unknown' so that the field is optional

4187 08/23/2012 02:41 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: role enum: Added 'unknown' value

4186 08/23/2012 02:20 PM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: Added verbatim* taxonomic terms

4185 08/23/2012 02:12 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4184 08/22/2012 04:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4183 08/22/2012 04:31 PM Aaron Marcuse-Kubitza

inputs: Regenerated maps for changes to bin/union, which removes empty mappings. Added /_alt suffix where needed.

4182 08/22/2012 03:23 PM Aaron Marcuse-Kubitza

inputs: Move src subdir into main dir, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-src-subdir-into-main-dir>

4181 08/22/2012 02:02 PM Aaron Marcuse-Kubitza

input.Makefile: $(tables): Allow datasource to specify custom import order in src/import_order.txt

4180 08/22/2012 01:29 PM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: growthForm: Documented source of standard terms

4179 08/22/2012 10:21 AM Aaron Marcuse-Kubitza

inputs/SALVIAS*/src/1.organisms/map.csv: Removed no longer applicable comments, which related to mappings that were in effect long ago

4178 08/22/2012 10:09 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/src/2.stems/map.csv: Added comments from corresponding SALVIAS-CSV organisms columns

4177 08/22/2012 09:54 AM Aaron Marcuse-Kubitza

inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Mapped to new Veg+ habit term

4176 08/22/2012 09:53 AM Aaron Marcuse-Kubitza

inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Don't filter out values not part of the provided terms list, because such values should be flagged as invalid in the error maps rather than silently discarded. This also ensures that any valid values which are not part of the provided terms list are kept.

4175 08/22/2012 09:45 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: habit: Map to new verbatimGrowthForm since this field is not necessarily standardized

4174 08/22/2012 09:42 AM Aaron Marcuse-Kubitza

mappings/Makefile: Veg+.cs-VegBIEN.csv: Join new Veg+-VegCore.to_self.csv (self-join), instead of Veg+-VegCore.csv, to VegCore-VegBIEN.csv, to support two-level chains of mappings in Veg+-VegCore.csv

4173 08/22/2012 09:40 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: /_alt pass through mappings: Removed comment because the two-level mapping propagates it to all fields ending in /_alt, even though it doesn't apply to them, causing the main VegBIEN map and several datasources' maps to change unnecessarily. Also, the comment is not completely accurate because /_alt pass throughs are now used primarily to support idempotent self-joins of Veg+-VegCore.csv.

4172 08/22/2012 09:21 AM Aaron Marcuse-Kubitza

union: Don't eliminate duplicate rows based on matches between map_0's output column and map_1's input column, because union is now being used for self-joins and it is legitimate for a term to appear as both an input and an output

4171 08/22/2012 09:10 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): MissingCastException: Use strings.repr_no_u() instead of strings.urepr() in order to remove the u in u'...' for Unicode strings

4170 08/21/2012 09:48 AM Aaron Marcuse-Kubitza

README.TXT: After a new import: Updated commands for new subdirs layout

4169 08/21/2012 09:42 AM Aaron Marcuse-Kubitza

Regenerated vegbien.ERD exports

4168 08/21/2012 09:34 AM Aaron Marcuse-Kubitza

mappings: Added autogen Veg+-VegCore.to_self.csv, which is Veg+-VegCore.csv joined to itself, and use it as an intermediate map to join to VegCore-VegBIEN.csv. This provides support for two-level chains of mappings in Veg+-VegCore.csv.

4167 08/21/2012 09:31 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Changed output root to Veg+, to allow mappings/Veg+-VegCore.csv to be joined with itself idempotently, for supporting multi-level chains of mappings

4166 08/21/2012 09:27 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Add pass through /_alt mapping for all terms in this map that are merged with _alt, to allow datasource to define custom mappings that don't pass through the default mapping. This also allows mappings/Veg+-VegCore.csv to be joined with itself idempotently, to support multi-level chains of mappings.

4165 08/21/2012 09:19 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: authorPlantCode: Added _alt suffix to create the correct priority

4164 08/21/2012 09:13 AM Aaron Marcuse-Kubitza

union: Exclude empty rows from the output, so that empty mappings from map_0 aren't included when map_1 contains a non-empty mapping for the same term. Note that this causes "No non-empty join mapping" warnings to turn into "No join mapping".

4163 08/21/2012 09:08 AM Aaron Marcuse-Kubitza

ci_map: Run join_union_sort in quiet mode so that it doesn't add lots of "No non-empty join mapping" warnings to the Comments column

4162 08/21/2012 09:06 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: scientificNameAuthor: Added scientificNameAuthorship mapping with /_alt/1, to ensure that it has priority over scientificNameAuthor and to ensure that it has an _alt suffix when a datasource contains both scientificNameAuthor and scientificNameAuthorship (such as SpeciesLink)

4161 08/21/2012 09:00 AM Aaron Marcuse-Kubitza

inputs/SpeciesLink/src/specimens/map.csv: Added explicit _alt suffix when multiple terms map to the same place

4160 08/21/2012 08:58 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: scientificNameAuthor: Added scientificNameAuthorship mapping with /_alt/1, to ensure that it has priority over scientificNameAuthor and to ensure that it has an _alt suffix when a datasource contains both scientificNameAuthor and scientificNameAuthorship (such as SpeciesLink)

4159 08/21/2012 08:31 AM Aaron Marcuse-Kubitza

inputs/ARIZ/src/specimens/map.csv: RelatedCatalogItem mappings: Added _alt suffixes

4158 08/21/2012 08:09 AM Aaron Marcuse-Kubitza

union: Multi-support: When an input appears in both maps, treat an empty mapping as if it didn't exist so that it doesn't overwrite a non-empty mapping in the other map
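
As an illustration only, the rule described in this comment might be sketched as follows. This is not the project's `union` script: the function name `union_maps`, the dict representation of a map (input term to output term), and the assumption that the first map wins when both contain a non-empty mapping are all hypothetical.

```python
def union_maps(map_0, map_1):
    """Merge two term maps, ignoring empty mappings.

    Hypothetical sketch: each map is a dict of input term -> output term.
    An empty output is treated as if the row did not exist, so it can never
    hide a non-empty mapping for the same input in the other map.
    """
    merged = {}
    for mapping in (map_0, map_1):
        for term, output in mapping.items():
            if not output:
                continue  # empty mapping: behave as if the row were absent
            # Assumption: the first non-empty mapping seen for a term wins.
            merged.setdefault(term, output)
    return merged


if __name__ == "__main__":
    print(union_maps({"habit": "", "family": "Family"},
                     {"habit": "growthForm", "family": "OtherFamily"}))
```

In this sketch an input whose only mappings are empty disappears from the result, which is consistent with the later change in revision 4164 that excludes empty rows from the output.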

4157 08/21/2012 07:51 AM Aaron Marcuse-Kubitza

mappings/Makefile: Veg+.cs-VegBIEN.csv: Join Veg+-VegCore.csv to VegCore-VegBIEN.csv in quiet mode, to avoid adding "No non-empty join mapping" to the Comments column

4156 08/21/2012 07:50 AM Aaron Marcuse-Kubitza

join: quiet mode: Turn off all warnings, not just "No input mapping" warnings. This is useful when join-unioning a synonymy to a primary map, which may have "No non-empty join mapping" for some terms but this should not be stored in the resulting map's Comments column.

4155 08/21/2012 07:30 AM Aaron Marcuse-Kubitza

mappings/Makefile: Rewrapped lines

4154 08/21/2012 07:28 AM Aaron Marcuse-Kubitza

mappings/Veg+-VegCore.csv: Added verbatimGrowthForm mapping

4153 08/21/2012 07:09 AM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: verbatimGrowthForm: Added comment that additional values come from SALVIAS. As other datasources' custom growth form values are added, they can be added to this comment.

4152 08/21/2012 07:00 AM Aaron Marcuse-Kubitza

mappings/Veg+.terms.csv: Added verbatimGrowthForm

4151 08/21/2012 06:44 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: locationdetermination: Added verbatimlatitude, verbatimlongitude, verbatimcoordinates

4150 08/21/2012 06:22 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Made aggregating functions polymorphic

4149 08/21/2012 06:16 AM Aaron Marcuse-Kubitza

xml_func.py: Removed no longer used _collapse()

4148 08/21/2012 06:13 AM Aaron Marcuse-Kubitza

xml_func.py: Removed no longer needed _if(), which has been translated to a SQL function

4147 08/21/2012 06:13 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _if()

4146 08/21/2012 06:12 AM Aaron Marcuse-Kubitza

sql.py: function_exists(): Support overloaded functions

4145 08/21/2012 06:09 AM Aaron Marcuse-Kubitza

sql.py: run_query(): Parse "more than one" errors as DuplicateExceptions

4144 08/21/2012 05:42 AM Aaron Marcuse-Kubitza

xml_func.py: XML function specification documentation: Updated parameters

4143 08/21/2012 05:39 AM Aaron Marcuse-Kubitza

xml_func.py: Removed no longer needed _eq(), which has been translated to a SQL function

4142 08/21/2012 05:38 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _eq()

4141 08/21/2012 05:37 AM Aaron Marcuse-Kubitza

sql.py: run_query(): Parse "could not determine polymorphic type because input has type "unknown"" errors as MissingCastExceptions to type text. This adds support for polymorphic SQL functions whose parameters are anyelement, etc.

4140 08/21/2012 05:35 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): sql.MissingCastException: Support unknown (None) columns, by casting all columns

4139 08/21/2012 05:30 AM Aaron Marcuse-Kubitza

sql.py: MissingCastException: Support unknown (None) columns

4138 08/21/2012 05:29 AM Aaron Marcuse-Kubitza

xml_dom.py: replace_with_text(): Support bool `new` values

4137 08/21/2012 04:22 AM Aaron Marcuse-Kubitza

input.Makefile: Determine import order from sorted order of all non-hidden subdirs, instead of from fixed constant. This allows datasources to specify arbitrary tables, rather than being limited to 0.plots, 1.organisms, 2.stems, specimens.
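
The actual change is in the Makefile, but the idea can be sketched in Python under some assumptions: the function name `import_order` is hypothetical, "non-hidden" is taken to mean "does not start with a dot", and plain lexical sorting is assumed.

```python
import os


def import_order(src_dir):
    """Return table subdirs of src_dir in the order they would be imported.

    Hypothetical sketch: hidden entries (names starting with ".") and plain
    files are skipped; the remaining subdir names are sorted lexically, so
    numeric prefixes such as "0.", "1.", "2." determine the import order.
    """
    return sorted(
        name
        for name in os.listdir(src_dir)
        if not name.startswith(".")
        and os.path.isdir(os.path.join(src_dir, name))
    )
```

With subdirs named `0.plots`, `1.organisms`, and `2.stems`, this sketch yields them in that order, which is why the renames in revisions 4107 to 4110 make the import order inherent in the directory names.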

4136 08/21/2012 04:14 AM Aaron Marcuse-Kubitza

lib/common.Makefile: Added $(wildcard/) (needed because builtin $(wildcard) doesn't do / suffix correctly)

4135 08/21/2012 04:11 AM Aaron Marcuse-Kubitza

input.Makefile: src/%/map.full.csv: Fixed bug where couldn't have $(srcMap) in prerequisites because this would for some reason cause src/%/map.full.csv to always be remade

4134 08/21/2012 03:47 AM Aaron Marcuse-Kubitza

input.Makefile: Src maps cleanup: Fixed bug where src.csv was using .map.csv.last_cleanup instead of .src.csv.last_cleanup as its .last_cleanup file

4133 08/21/2012 03:30 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Moved src/%/map.full.csv after src/%/map.csv now that the filenames are fixed, so pattern matching order isn't an issue

4132 08/21/2012 03:27 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: $(makeFullCsv): Removed no longer needed test for whether the $(coreSelfMap) exists, because Veg+'s self map always exists

4131 08/21/2012 03:12 AM Aaron Marcuse-Kubitza

input.Makefile: Src maps cleanup: Fixed bug where src.csv was using .map.csv.last_cleanup instead of .src.csv.last_cleanup as its .last_cleanup file

4130 08/21/2012 02:34 AM Aaron Marcuse-Kubitza

inputs/CTFS/src/1.organisms/: Added "_" prefix to prevent it from being treated as a data table subdir, before the DB export is mapped

4129 08/21/2012 02:20 AM Aaron Marcuse-Kubitza

inputs/CTFS/src/ERD.jpg: Made it a symlink to "STRI2011_DB v5.jpg" instead of a copy of it

4128 08/21/2012 02:11 AM Aaron Marcuse-Kubitza

Added inputs/CTFS/src/bci_01April2011.zip.url, which contains the original download URL for our copy of the CTFS database

4127 08/21/2012 01:31 AM Aaron Marcuse-Kubitza

inputs/CTFS/src/: Added "_" prefix to scripts_to_drop_extra_tables subdir to prevent it from being treated as a data table subdir

4126 08/21/2012 01:10 AM Aaron Marcuse-Kubitza

inputs/Makefile: Input data sync: Updated rsync filter for new subdirs layout

4125 08/21/2012 12:55 AM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: Updated for new subdirs layout

4124 08/21/2012 12:17 AM Aaron Marcuse-Kubitza

input.Makefile: SVN: add: Updated svn:ignores for new subdirs layout

4123 08/21/2012 12:08 AM Aaron Marcuse-Kubitza

inputs/Makefile: Import logs: Fixed bug where excluded install logs needed to be renamed according to the new name format (from <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-log-files-into-subfolders>)

4122 08/20/2012 11:59 PM Aaron Marcuse-Kubitza

inputs: Moved log files into subfolders, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-log-files-into-subfolders>

4121 08/20/2012 11:01 PM Aaron Marcuse-Kubitza

input.Makefile: Merged Installation and Staging tables sections into Staging tables installation, since no other installation is performed. Removed "import/" prefix from non-file import-related targets.

4120 08/20/2012 10:20 PM Aaron Marcuse-Kubitza

inputs: Moved test outputs into subfolders, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-test-outputs-into-subfolders>

4119 08/20/2012 09:58 PM Aaron Marcuse-Kubitza

input.Makefile: Import to VegBIEN: Removed extra test for $(inputFiles), because when there are no inputs, $(tables) will be empty and import will automatically do nothing. Removed no longer needed $(inputFiles).

4118 08/20/2012 08:46 PM Aaron Marcuse-Kubitza

inputs: Moved maps into subfolders, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-maps-into-subfolders>

4117 08/20/2012 07:16 PM Aaron Marcuse-Kubitza

inputs: Replaced Veg+ prefix with map on via maps, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Replace-Veg-prefix-with-map-on-via-maps>

4116 08/20/2012 06:39 PM Aaron Marcuse-Kubitza

strings.py: concat(): Apply length limits by shrinking max_len by new raw_extra_len() of the strings. This also fixes a bug where multi-byte characters in str0 were not properly taken into account, leading to overly long strings. Added doc comment.
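
A minimal sketch of the approach this comment describes, written for Python 3 and UTF-8. The function names `raw_extra_len` and `concat` come from the log, but the bodies below are assumptions and not the project's actual code (which used `to_raw_str()`).

```python
def raw_extra_len(s):
    """Number of bytes the UTF-8 encoding of s needs beyond len(s)."""
    return len(s.encode("utf-8")) - len(s)


def concat(str0, str1, max_len):
    """Concatenate str0 and str1 so the result is at most max_len bytes.

    Sketch only: the byte limit is converted to a character limit by
    subtracting the extra bytes of both strings, and the slicing is done on
    characters, so a multi-byte character is never split. The result can be
    shorter than max_len bytes if multi-byte characters are cut off.
    """
    max_len -= raw_extra_len(str0) + raw_extra_len(str1)
    str0_new_len = max_len - len(str1)
    if str0_new_len < 0:
        # str1 alone is too long: keep as much of it as fits.
        return str1[:max(max_len, 0)]
    return str0[:str0_new_len] + str1
```

Because the whole extra byte count is subtracted up front, the truncation is conservative: the result never exceeds the byte limit, but its length cannot be used to tell whether truncation happened.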

4115 08/20/2012 06:29 PM Aaron Marcuse-Kubitza

strings.py: Added raw_extra_len()

4114 08/20/2012 06:17 PM Aaron Marcuse-Kubitza

sql_gen.py: NoUnderlyingTableException: Take a (required) parameter for the item that had no underlying table, and provide this wherever a NoUnderlyingTableException is created

4113 08/20/2012 06:16 PM Aaron Marcuse-Kubitza

strings.py: concat(): Perform substring operation on Unicode strings so that substring does not split Unicode characters. Still use to_raw_str() to calculate the str1 length because Unicode characters can be multi-byte, and length limits often apply to the byte length, not the character length.

4112 08/20/2012 06:13 PM Aaron Marcuse-Kubitza

exc.py: add_msg(): Fixed bug where needed to convert the Unicode string back into a raw string because Python's top-level exception handler doesn't support Unicode strings as exception messages

4111 08/20/2012 05:22 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4110 08/17/2012 07:53 PM Aaron Marcuse-Kubitza

inputs: Renamed stems table to 2.stems so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>

4109 08/17/2012 07:49 PM Aaron Marcuse-Kubitza

inputs: Renamed organisms table to 1.organisms so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>

4108 08/17/2012 07:30 PM Aaron Marcuse-Kubitza

inputs: Renamed plots table to 0.plots so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>

4107 08/17/2012 07:30 PM Aaron Marcuse-Kubitza

inputs: Renamed plots table to 0.plots so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>

4106 08/17/2012 07:00 PM Aaron Marcuse-Kubitza

input.Makefile: Mapping: If table subdir contains no input files, print warning instead of aborting. This situation occurs when renaming a version-controlled directory, whose previous version persists as an empty dir until committing.

4105 08/17/2012 06:41 PM Aaron Marcuse-Kubitza

input.Makefile: Mapping: Removed no longer used $(<in) and test for it in $(map)

4104 08/17/2012 06:37 PM Aaron Marcuse-Kubitza

input.Makefile: Mapping: $(map): Removed no longer used test for $(mapEnv)

4103 08/17/2012 05:50 PM Aaron Marcuse-Kubitza

sql.py: run_query(): Exception handling: Fixed bug where PostgreSQL 9.1 PL/Python errors have a different format than PostgreSQL 9.0 which needs to be supported separately. This format was already supported in sql_gen.plpythonu_error_handler, but also needed to be supported for exceptions that propagate back to the client.

4102 08/17/2012 05:34 PM Aaron Marcuse-Kubitza

inputs/SALVIAS-CSV/src/: Removed source files because they shouldn't be under version control. (They are synchronized via `make inputs/download`.)

4101 08/17/2012 05:15 PM Aaron Marcuse-Kubitza

inputs: Moved src files into VegCSV subfolders (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV#CSV-representation), with table suffixes removed, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders>