/ - Changes - BIEN 3 - NCEAS Projects

root @ 10473

#	Date	Author	Comment
10473	07/27/2013 10:05 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: taxon_determination: changed IS-A relationship with taxon_observation to HAS-A so that a separate taxon_observation doesn't need to be created for each taxon_determination (even though each taxon_determination event is theoretically a reobservation of the specimen, etc.). instead, inherit from sampling_event to include the necessary event-related fields.
10472	07/27/2013 09:45 AM	Aaron Marcuse-Kubitza	bugfix: schemas/VegCore/VegCore.ERD.mwb: geopath: made country NOT NULL so that every geoplace (for input to geovalidation) has something on the geopath side. geocoords: made latitude_deg/longitude_deg NOT NULL so that every geoplace (for input to geovalidation) has something on the geocoords side. added geocoords_unique constraint since this is a global table with one entry for each lat/long.
10471	07/27/2013 09:30 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: place: added coords hstore extender, for verbatim coordinates, etc.
10470	07/27/2013 09:14 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: coordinates: abbreviated to coords (unambiguous abbreviation)
10469	07/27/2013 08:59 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: replaced parsed_taxon_assertion with taxon_scrub, which HAS-A parsed taxon_assertion rather than BEING-A parsed_taxon_assertion. (multiple TNRS results may parse to the same thing.)
10468	07/27/2013 08:08 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: geovalidatable_place: renamed to geoplace, since this uniquification is useful independently of geovalidation. note that the MySQL upgrade on vegbiendev has now reordered the fkeys again, this time in forwards order.
10467	07/27/2013 08:02 AM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: updated for July progress
10466	07/27/2013 07:43 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: place tables that are absolute within Earth rather than relative to a parent place: prefixed geo- to table name for clarity
10465	07/27/2013 07:23 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: plot, subplot: added hstore extenders (dimensions, coordinates)
10464	07/27/2013 07:17 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: fixed inheritance connectors to be 1:1, optional on subclass
10463	07/27/2013 07:11 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: plot: added shape. bounding_box: changed units to rect, since this just needs a width/height (the x/y coord is the lat/long).
10462	07/27/2013 07:05 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: plot: added footprint_geom_WKT. bounding_box: added units (WKT).
10461	07/27/2013 06:51 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: back-synced from staging copy on vegbiendev to flush out sync changes that it kept trying to re-make
10460	07/27/2013 06:47 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: event: moved method to separate sampling_event subclass
10459	07/27/2013 06:28 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: fixed lines
10458	07/27/2013 06:25 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: aggregate_observation: inherit from taxon_presence, since this is a type of taxon_presence and it avoids duplicating the taxon_concept field
10457	07/27/2013 06:11 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: added taxon_absence, to avoid including absence observations in the same table as presence observations (which needlessly complicates queries). note that the fkey order now gets set back to forwards whenever a table is changed.
10456	07/27/2013 06:07 AM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: re-saved. the fkey order is now apparently reversed for recently-changed tables.
10455	07/26/2013 11:07 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: collector, identified_by: allow multiple parties for these fields, using the new party_list array table
10454	07/26/2013 10:44 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: party arrays: use new party_list array table instead of adding a separate many:many table for each table that uses a party array. this also allows using the party_list ID in a unique constraint, because it is now a first-class field.
10453	07/26/2013 10:06 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: party: added party_list array table
10452	07/26/2013 09:45 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: party: added optional fkey to organization
10451	07/26/2013 09:32 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: geovalidation: renamed lat_long_in_ranks to lat_long_in_place_ranks for clarity
10450	07/26/2013 09:12 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: individual: added tag_history hstore to store custom identity attributes
10449	07/26/2013 08:39 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: taxon_string: documented that to get the parsed_taxon_assertion (TNRS result) for a taxon_string, you would join using the SQL dotpath taxon_string.string<-taxon_assertion(string)::parsed_taxon_assertion[source='TNRS.version'] (see wiki.vegpath.org/SQL_dotpaths). important how-to comments such as this one are now included in the version-controlled MySQL schema file itself, not just the .mwb file and the staging copy on vegbiendev.
10448	07/26/2013 08:16 PM	Aaron Marcuse-Kubitza	bin/my2pg: use s!...!...! when either the regexp or the replacement contains / , to avoid unnecessary \-s
10447	07/26/2013 08:09 PM	Aaron Marcuse-Kubitza	bin/my2pg: commenting out table options: added explanatory comment, because it is not obvious from the regexp what this does
10446	07/26/2013 08:06 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: mysqldump(): don't use --compatible=postgresql when the table structure is being exported, because this removes the table options (which include the COMMENT attribute). --compatible=postgresql remains on in data-only mode because embedded ` in data cannot easily be distinguished from ` around column names, so ANSI_QUOTES is needed to do the translation to " (and data sections do not contain table options). note that all --compatible modes that offer ANSI_QUOTES unfortunately exclude the table options, and there is no way to run a SQL query to set the SQL mode before beginning the dump, so ANSI_QUOTES translation must be handled by my2pg instead.
10445	07/26/2013 06:35 PM	Aaron Marcuse-Kubitza	bin/my2pg: comment out table options (http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_no_table_options) instead of removing them, because they include table COMMENTs, which contain important metadata such as table definitions. (note that table COMMENTs use a slightly different syntax than column COMMENTs, so the table COMMENTs will not be commented out twice.)
10444	07/26/2013 06:19 PM	Aaron Marcuse-Kubitza	bin/my2pg: comment out COMMENTs instead of removing them so that they will be included in the PostgreSQL translation. COMMENTs contain important metadata about columns, such as definitions and the meanings of integer flag values.
10443	07/26/2013 05:58 PM	Aaron Marcuse-Kubitza	inputs/{.,*}/*.schema.sql: regenerated using the instructions in bin/my2pg. this primarily replaces timestamp with text/*timestamp*/ (to preserve indefinite dates).
10442	07/26/2013 05:56 PM	Aaron Marcuse-Kubitza	bin/my2pg: added instructions for regenerating *.schema.sql whenever this script is changed
10441	07/26/2013 05:22 PM	Aaron Marcuse-Kubitza	bin/my2pg: COMMENT: also match COMMENTs with embedded ', because there will only be one COMMENT per line, so the contents of the COMMENT can just extend to the last ' on the line
10440	07/26/2013 05:16 PM	Aaron Marcuse-Kubitza	bugfix: lib/sh/util.sh: $sed_cmd: make output unbuffered, so that running e.g. bin/my2pg at the command line produces output as each line is read
10439	07/26/2013 04:29 PM	Aaron Marcuse-Kubitza	bin/my2pg: replace MySQL ` quotes with " quotes to support exports that were generated without ANSI_QUOTES mode. (this replacement only applies to schema exports, not data.) ANSI_QUOTES is only available with mysqldump --compatible modes that also include NO_TABLE_OPTIONS, which omits important table options such as comments. in particular, these comments are part of schemas/VegCore/VegCore.ERD.mwb but were not being included in VegCore.my.sql.
10438	07/26/2013 01:41 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: taxon_string: removed parsed_taxon_assertion field, since there may be more than one parsing (TNRS result) for a given taxon_string. the parsing relationship can better be represented by adding a parsed_taxon_assertion whose taxon_assertion.string points to the parsed taxon_string. getting the parsed_taxon_assertion for a taxon_string now requires joining on parsed_taxon_assertion using a backwards instead of forwards fkey, and filtering the corresponding assertions to include only the ones for TNRS (of the desired TNRS version). documented that taxon_assertion.string was previously the concatenated matched name, but is now the TNRS input name. the concatenated matched name is still in parsed_taxon_assertion.matched_taxon_concept->:taxon_name.unique_name.
10437	07/26/2013 01:22 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.my.sql: regenerated from .mwb schema, which apparently reverses the order of the fkeys (possibly a Linux MySQL bug?)
10436	07/26/2013 12:26 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped Darwin Core synonyms to DUPLICATE. this avoids the need to translate these to postprocessing derived columns for new-style import, and also speeds up column-based import because there are less automatic alts to perform to resolve filter-less collisions. the svn diff was verified by replacing DUPLICATE#of:dwc_terms<term>#... with <term>, removing the comment, and checking that this removes the diff (except where VegCore has renamed a DwC term).
10435	07/26/2013 12:17 PM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: *scientificName: remapped to scientificName instead of taxonName to match the DwC term's name (this is the same dwc_terms_scientificName mismapping that was fixed in r10434)
10434	07/26/2013 11:56 AM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: dwc_terms_scientificName: remapped to scientificName instead of taxonName to match that DwC term name, as well as the mappings of other *scientificName terms
10433	07/26/2013 11:06 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: marked dwc_geospatial_VerbatimLatitude,Longitude as exact duplicates of dwc_terms_*
10432	07/26/2013 10:52 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped identical _alt-ed fields to DUPLICATE. this avoids the need to translate these to postprocessing derived columns for new-style import, and also speeds up column-based import because there are less automatic _alts to perform to resolve filter-less collisions.
10431	07/26/2013 10:06 AM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: *CollectorNumber: moved these to the same _alt group as recordNumber, because they are actually duplicates
10430	07/26/2013 09:43 AM	Aaron Marcuse-Kubitza	correction: inputs/SpeciesLink/Specimen/map.csv: *FieldNumber: fixed incorrect comment that these fields are identical to recordNumber, when instead they have the same meaning but not the same values. instead, values are stored under either of the two terms. the previous conclusion had been based on an incorrect query, which used != instead of the NULL-sensitive IS NOT DISTINCT FROM.
10429	07/25/2013 08:14 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Adding derived columns: extended to overlap with all subtasks
10428	07/25/2013 08:12 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Geoscrubbing: split into separate re-run and automated pipeline tasks
10427	07/25/2013 08:09 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: moved Data provider validations before Adding derived columns because ensuring that the source data is in the database is more important than the derived data, which can always be added later
10426	07/25/2013 08:00 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Data provider validations: added dot in July because some amount of datasource-level validation happens when mappings issues are discovered during the refactoring
10425	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	bugfix: inputs/*/*/map.csv for specimen tables: remapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself rather than to any parent event (specimens don't have a parent event)
10424	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv for IndividualObservation tables: also mapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself in addition to any parent event which it may be a part of
10423	07/25/2013 06:27 PM	Aaron Marcuse-Kubitza	bugfix: inputs/XAL/Specimen/, NY/Ecatalog_all/: *JulianDay: remapped to dayOfYear instead of day (the day of the month)
10422	07/25/2013 05:08 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped *dayOfYear-related terms to UNUSED
10421	07/25/2013 04:53 PM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: remapped conceptual_darwin_2003_1_0_JulianDay, dwc_dwcore_DayOfYear to dayOfYear instead of day (the day of the month)
10420	07/25/2013 04:43 PM	Aaron Marcuse-Kubitza	mappings/VegCore.htm: regenerated from wiki. added dayOfYear (=julianDay), which is different from startDayOfYear/endDayOfYear.
10419	07/25/2013 01:59 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10418	07/25/2013 01:50 PM	Aaron Marcuse-Kubitza	inputs/CTFS/StemObservation/: translated collisions (missing filters) to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10417	07/25/2013 10:57 AM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: rebalanced tasks across the remaining months, taking into account priority changes made in the conference call (e.g. that we should not be handling people's individual data requests (Brad, wiki.vegpath.org/2013-07-25_conference_call#Decisions-made))
10416	07/25/2013 10:50 AM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: updated with additional tasks added in conference call: translate source-specific derived columns to plain SQL, flatten the datasources, automated geoscrubbing pipeline
10415	07/25/2013 08:43 AM	Aaron Marcuse-Kubitza	planning/goals/BIEN_3_derived_data_products_NormalizedDB_only.docx: removed BIEN species-level phylogeny, which Brad says is out of scope for the BIEN DB
10414	07/25/2013 08:24 AM	Aaron Marcuse-Kubitza	removed planning/workflow/bien3_architecture.odp because the current version is now in bien3_architecture.pptx
10413	07/25/2013 08:13 AM	Aaron Marcuse-Kubitza	added planning/workflow/validation/TNRS_results.ppt symlink to inputs/test_taxonomic_names/_scrub/TNRS_results.ppt
10412	07/25/2013 08:10 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: highlighted the sample row and related rows
10411	07/25/2013 08:04 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.xls: moved arrows to TNRS_results.ppt so they can be changed more easily
10410	07/25/2013 07:51 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: TNRS.tnrs: added diagram labels for the various names and steps
10409	07/25/2013 07:32 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.xls: use "Poa annua var. eriolepis"->"Poaceae Poa annua L." as the synonym example instead of "Poa annua fo. lanuginosa"->"Poaceae Poa annua var. annua" because the input name is simpler and it's closer to the beginning of the list
10408	07/25/2013 07:20 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): tnrs.csv: include Name_matched instead of Genus_matched+Specific_epithet_matched because this also contains lower ranks, which are used in the TNRS synonymizing
10407	07/25/2013 07:06 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: added annotations explaining the import steps
10406	07/25/2013 06:36 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/TNRS_results.ppt, containing the *.png screenshots with tables labeled
10405	07/25/2013 06:35 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/*.png, screenshots of the TNRS_results.xls tabs (LibreOffice does not preserve the formatting when pasting a spreadsheet to a PowerPoint as a table, and the table editing options are limited)
10404	07/25/2013 06:31 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/TNRS_results.xls with formatted versions of the *.csv tables
10403	07/24/2013 05:15 PM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): subset the columns to include only the most important to demo how the data is represented
10402	07/24/2013 05:13 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: mk_select(): support passing $cols as array instead of SQL string, which is easier to enter in a shell script (less quotes, \ , etc.)
10401	07/24/2013 05:12 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: added cols2list()
10400	07/24/2013 05:10 PM	Aaron Marcuse-Kubitza	lib/sh/util.sh: added is_array()
10399	07/24/2013 04:38 PM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): allow specifying an explicit columns list for each table using cols=... (initially set to all columns)
10398	07/24/2013 04:09 PM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/*.csv exports
10397	07/24/2013 04:09 PM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/run, which exports the test_scrub-populated tables to CSV
10396	07/24/2013 04:08 PM	Aaron Marcuse-Kubitza	lib/sh/db_make.sh: added pg_export_table_to_dir(), pg_export_tables_to_dir(). unlike db.sh pg_export_table_to_dir_no_header(), these functions are make-aware and will not clobber an existing file.
10395	07/24/2013 03:15 PM	Aaron Marcuse-Kubitza	reran inputs/test_taxonomic_names/test_scrub, which generates the public.test_taxonomic_names sample schema
10394	07/24/2013 01:50 PM	Aaron Marcuse-Kubitza	inputs/CTFS/Plot/map.csv: DescriptionOfSite: remapped to locationRemarks, not locality
10393	07/24/2013 01:38 PM	Aaron Marcuse-Kubitza	inputs/CTFS/AggregateObservation/: translated multi-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10392	07/24/2013 01:24 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: geoscrub_input_new: updated for VegCore-renamed geoscrub_output column names
10391	07/24/2013 01:09 PM	Aaron Marcuse-Kubitza	schemas/util.sql: added ?>= operator with is_more_complete_than() function
10390	07/24/2013 12:44 PM	Aaron Marcuse-Kubitza	inputs/.geoscrub/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10389	07/24/2013 12:15 PM	Aaron Marcuse-Kubitza	inputs/.geoscrub/geoscrub_output/: translated single-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10388	07/24/2013 11:18 AM	Aaron Marcuse-Kubitza	schemas/util.sql: SQL-language IMMUTABLE functions marked STRICT: removed STRICT to enable dynamic inlining, which speeds up the function up to 7x. STRICT was not removed where the function was particularly complex and the STRICT optimization would likely be more significant than inlining.
10387	07/24/2013 11:07 AM	Aaron Marcuse-Kubitza	bugfix: inputs/BRIT/specimen_flat/postprocess.sql: diameterBreastHeight_cm, height_m: use newly NULL-mapped versions of columns instead of the *_verbatim columns
10386	07/24/2013 11:04 AM	Aaron Marcuse-Kubitza	inputs/BRIT/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10385	07/24/2013 10:49 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: translated multi-column filters with _join() to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10384	07/24/2013 10:43 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/map.csv: Habitat_Summary: remapped to UNUSED
10383	07/24/2013 10:16 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/postprocess.sql: diameterBreastHeight_cm, height_m: updated runtimes
10382	07/24/2013 10:15 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: DBH_, Height_: mapped NULL-equivalent values, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10381	07/24/2013 09:27 AM	Aaron Marcuse-Kubitza	inputs/.../: translated multi-column filters with _avg() to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10380	07/24/2013 08:18 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: translated single-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Switching_to_new-style_import#stage-I-source-specific > "translate single-column filters to postprocessing derived columns"
10379	07/20/2013 05:25 AM	Aaron Marcuse-Kubitza	/README.TXT: Maintenance: added instructions for what to do if http://vegbiendev.nceas.ucsb.edu/phppgadmin/ goes down (sometimes displaying a Not found error)
10378	07/20/2013 05:21 AM	Aaron Marcuse-Kubitza	schemas/util.sql: schema comment: added note that IMMUTABLE SQL-language functions should never be declared STRICT, because this prevents them from being inlined. inlining can create a significant speed improvement (7x+), by avoiding function calls and enabling additional constant folding.
10377	07/20/2013 05:09 AM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: documented total runtime (7.5 min on vegbiendev)
10376	07/20/2013 05:07 AM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: updated runtimes for map_nulls() inlining, which created a speed improvement of 7x for the numeric columns and 2.5x for the text columns (292563.362->41929.772 ms and 83640.424->35690.797 ms, respectively). note that the map_nulls__coord__*() calls could be optimized further by combining the successive map_nulls() calls into one, with the hstores merged.
10375	07/20/2013 04:37 AM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): documented that inputs/REMIB/Specimen/postprocess.sql > country also shows that inlining is now happening properly. note that the speed improvement due to inlining is not as much, %wise, when the values util._map() is run on are long strings instead of the short strings used in the initial profiling. this is because a greater % of the time is spent in system functions such as hstore>text, which are not affected by the inlining because they are run either way.
10374	07/20/2013 04:18 AM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): use new nulls_map(). proper inlining (i.e. same runtime before and after change) has been verified with the following profiling query: SELECT util.map_nulls(array[1, 2, 3]::text[], v) FROM unnest(array_fill(1, array[100000])) f (v)
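
Note on r10430: the correction there turns on how SQL comparison operators treat NULL. The following is a minimal sketch in Python using an in-memory SQLite table, not the query that was actually run. The table and column names are hypothetical, and SQLite's IS NOT operator stands in for PostgreSQL's IS DISTINCT FROM (the negation of the IS NOT DISTINCT FROM named in the commit message).

```python
import sqlite3

# Hypothetical stand-in for two columns that carry the same kind of value
# under different terms; this is not the actual SpeciesLink table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimen (record_number TEXT, field_number TEXT)")
conn.executemany(
    "INSERT INTO specimen VALUES (?, ?)",
    [
        ("101", "101"),  # same value under both terms
        ("102", None),   # value stored only under the first term
        (None, "103"),   # value stored only under the second term
    ],
)


def count_rows(where):
    """Count rows matching a WHERE clause (hard-coded, trusted input only)."""
    query = f"SELECT count(*) FROM specimen WHERE {where}"
    return conn.execute(query).fetchone()[0]


# != evaluates to NULL (not true) whenever either side is NULL,
# so the second and third rows are silently skipped.
naive_mismatches = count_rows("record_number != field_number")

# SQLite's IS NOT compares NULL like an ordinary value,
# as PostgreSQL's IS DISTINCT FROM does.
null_safe_mismatches = count_rows("record_number IS NOT field_number")

print(naive_mismatches, null_safe_mismatches)  # prints: 0 2
```

With != the two columns appear to have no mismatches, which is the kind of wrong "identical fields" conclusion the commit describes; the NULL-sensitive comparison shows that values are stored under one term or the other.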