/ - Changes - BIEN 3 - NCEAS Projects

root @ 10450

#	Date	Author	Comment
10450	07/26/2013 09:12 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: individual: added tag_history hstore to store custom identity attributes
10449	07/26/2013 08:39 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: taxon_string: documented that to get the parsed_taxon_assertion (TNRS result) for a taxon_string, you would join using the SQL dotpath taxon_string.string<-taxon_assertion(string)::parsed_taxon_assertion[source='TNRS.version'] (see wiki.vegpath.org/SQL_dotpaths). important how-to comments such as this one are now included in the version-controlled MySQL schema file itself, not just the .mwb file and the staging copy on vegbiendev.
10448	07/26/2013 08:16 PM	Aaron Marcuse-Kubitza	bin/my2pg: use s!...!...! when either the regexp or the replacement contains / , to avoid unnecessary \-s
10447	07/26/2013 08:09 PM	Aaron Marcuse-Kubitza	bin/my2pg: commenting out table options: added explanatory comment, because it is not obvious from the regexp what this does
10446	07/26/2013 08:06 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: mysqldump(): don't use --compatible=postgresql when the table structure is being exported, because this removes the table options (which include the COMMENT attribute). --compatible=postgresql remains on in data-only mode because embedded ` in data cannot easily be distinguished from ` around column names, so ANSI_QUOTES is needed to do the translation to " (and data sections do not contain table options). note that all --compatible modes that offer ANSI_QUOTES unfortunately exclude the table options, and there is no way to run a SQL query to set the SQL mode before beginning the dump, so ANSI_QUOTES translation must be handled by my2pg instead.
10445	07/26/2013 06:35 PM	Aaron Marcuse-Kubitza	bin/my2pg: comment out table options (http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_no_table_options) instead of removing them, because they include table COMMENTs, which contain important metadata such as table definitions. (note that table COMMENTs use a slightly different syntax than column COMMENTs, so the table COMMENTs will not be commented out twice.)
10444	07/26/2013 06:19 PM	Aaron Marcuse-Kubitza	bin/my2pg: comment out COMMENTs instead of removing them so that they will be included in the PostgreSQL translation. COMMENTs contain important metadata about columns, such as definitions and the meanings of integer flag values.
10443	07/26/2013 05:58 PM	Aaron Marcuse-Kubitza	inputs/{.,}/.schema.sql: regenerated using the instructions in bin/my2pg. this primarily replaces timestamp with text/timestamp/ (to preserve indefinite dates).
10442	07/26/2013 05:56 PM	Aaron Marcuse-Kubitza	bin/my2pg: added instructions for regenerating *.schema.sql whenever this script is changed
10441	07/26/2013 05:22 PM	Aaron Marcuse-Kubitza	bin/my2pg: COMMENT: also match COMMENTs with embedded ', because there will only be one COMMENT per line, so the contents of the COMMENT can just extend to the last ' on the line
10440	07/26/2013 05:16 PM	Aaron Marcuse-Kubitza	bugfix: lib/sh/util.sh: $sed_cmd: make output unbuffered, so that running e.g. bin/my2pg at the command line produces output as each line is read
10439	07/26/2013 04:29 PM	Aaron Marcuse-Kubitza	bin/my2pg: replace MySQL ` quotes with " quotes to support exports that were generated without ANSI_QUOTES mode. (this replacement only applies to schema exports, not data.) ANSI_QUOTES is only available with mysqldump --compatible modes that also include NO_TABLE_OPTIONS, which omits important table options such as comments. in particular, these comments are part of schemas/VegCore/VegCore.ERD.mwb but were not being included in VegCore.my.sql.
10438	07/26/2013 01:41 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.ERD.mwb: taxon_string: removed parsed_taxon_assertion field, since there may be more than one parsing (TNRS result) for a given taxon_string. the parsing relationship can better be represented by adding a parsed_taxon_assertion whose taxon_assertion.string points to the parsed taxon_string. getting the parsed_taxon_assertion for a taxon_string now requires joining on parsed_taxon_assertion using a backwards instead of forwards fkey, and filtering the corresponding assertions to include only the ones for TNRS (of the desired TNRS version). documented that taxon_assertion.string was previously the concatenated matched name, but is now the TNRS input name. the concatenated matched name is still in parsed_taxon_assertion.matched_taxon_concept->:taxon_name.unique_name.
10437	07/26/2013 01:22 PM	Aaron Marcuse-Kubitza	schemas/VegCore/VegCore.my.sql: regenerated from .mwb schema, which apparently reverses the order of the fkeys (possibly a Linux MySQL bug?)
10436	07/26/2013 12:26 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped Darwin Core synonyms to DUPLICATE. this avoids the need to translate these to postprocessing derived columns for new-style import, and also speeds up column-based import because there are fewer automatic _alts to perform to resolve filter-less collisions. the svn diff was verified by replacing DUPLICATE#of:dwc_terms_<term>#... with <term>, removing the comment, and checking that this removes the diff (except where VegCore has renamed a DwC term).
10435	07/26/2013 12:17 PM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: *scientificName: remapped to scientificName instead of taxonName to match the DwC term's name (this is the same dwc_terms_scientificName mismapping that was fixed in r10434)
10434	07/26/2013 11:56 AM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: dwc_terms_scientificName: remapped to scientificName instead of taxonName to match that DwC term name, as well as the mappings of other *scientificName terms
10433	07/26/2013 11:06 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: marked dwc_geospatial_VerbatimLatitude,Longitude as exact duplicates of dwc_terms_*
10432	07/26/2013 10:52 AM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped identical _alt-ed fields to DUPLICATE. this avoids the need to translate these to postprocessing derived columns for new-style import, and also speeds up column-based import because there are fewer automatic _alts to perform to resolve filter-less collisions.
10431	07/26/2013 10:06 AM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: *CollectorNumber: moved these to the same _alt group as recordNumber, because they are actually duplicates
10430	07/26/2013 09:43 AM	Aaron Marcuse-Kubitza	correction: inputs/SpeciesLink/Specimen/map.csv: *FieldNumber: fixed incorrect comment that these fields are identical to recordNumber, when instead they have the same meaning but not the same values. instead, values are stored under either of the two terms. the previous conclusion had been based on an incorrect query, which used != instead of the NULL-sensitive IS NOT DISTINCT FROM.
10429	07/25/2013 08:14 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Adding derived columns: extended to overlap with all subtasks
10428	07/25/2013 08:12 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Geoscrubbing: split into separate re-run and automated pipeline tasks
10427	07/25/2013 08:09 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: moved Data provider validations before Adding derived columns because ensuring that the source data is in the database is more important than the derived data, which can always be added later
10426	07/25/2013 08:00 PM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: Data provider validations: added dot in July because some amount of datasource-level validation happens when mappings issues are discovered during the refactoring
10425	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	bugfix: inputs///map.csv for specimen tables: remapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself rather than to any parent event (specimens don't have a parent event)
10424	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	inputs///map.csv for IndividualObservation tables: also mapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself in addition to any parent event which it may be a part of
10423	07/25/2013 06:27 PM	Aaron Marcuse-Kubitza	bugfix: inputs/XAL/Specimen/, NY/Ecatalog_all/: *JulianDay: remapped to dayOfYear instead of day (the day of the month)
10422	07/25/2013 05:08 PM	Aaron Marcuse-Kubitza	inputs/SpeciesLink/Specimen/map.csv: remapped *dayOfYear-related terms to UNUSED
10421	07/25/2013 04:53 PM	Aaron Marcuse-Kubitza	bugfix: inputs/SpeciesLink/Specimen/map.csv: remapped conceptual_darwin_2003_1_0_JulianDay, dwc_dwcore_DayOfYear to dayOfYear instead of day (the day of the month)
10420	07/25/2013 04:43 PM	Aaron Marcuse-Kubitza	mappings/VegCore.htm: regenerated from wiki. added dayOfYear (=julianDay), which is different from startDayOfYear/endDayOfYear.
10419	07/25/2013 01:59 PM	Aaron Marcuse-Kubitza	inputs/CTFS/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10418	07/25/2013 01:50 PM	Aaron Marcuse-Kubitza	inputs/CTFS/StemObservation/: translated collisions (missing filters) to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10417	07/25/2013 10:57 AM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: rebalanced tasks across the remaining months, taking into account priority changes made in the conference call (e.g. that we should not be handling people's individual data requests (Brad, wiki.vegpath.org/2013-07-25_conference_call#Decisions-made))
10416	07/25/2013 10:50 AM	Aaron Marcuse-Kubitza	planning/timeline/timeline.2013.xls: updated with additional tasks added in conference call: translate source-specific derived columns to plain SQL, flatten the datasources, automated geoscrubbing pipeline
10415	07/25/2013 08:43 AM	Aaron Marcuse-Kubitza	planning/goals/BIEN_3_derived_data_products_NormalizedDB_only.docx: removed BIEN species-level phylogeny, which Brad says is out of scope for the BIEN DB
10414	07/25/2013 08:24 AM	Aaron Marcuse-Kubitza	removed planning/workflow/bien3_architecture.odp because the current version is now in bien3_architecture.pptx
10413	07/25/2013 08:13 AM	Aaron Marcuse-Kubitza	added planning/workflow/validation/TNRS_results.ppt symlink to inputs/test_taxonomic_names/_scrub/TNRS_results.ppt
10412	07/25/2013 08:10 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: highlighted the sample row and related rows
10411	07/25/2013 08:04 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.xls: moved arrows to TNRS_results.ppt so they can be changed more easily
10410	07/25/2013 07:51 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: TNRS.tnrs: added diagram labels for the various names and steps
10409	07/25/2013 07:32 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.xls: use "Poa annua var. eriolepis"->"Poaceae Poa annua L." as the synonym example instead of "Poa annua fo. lanuginosa"->"Poaceae Poa annua var. annua" because the input name is simpler and it's closer to the beginning of the list
10408	07/25/2013 07:20 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): tnrs.csv: include Name_matched instead of Genus_matched+Specific_epithet_matched because this also contains lower ranks, which are used in the TNRS synonymizing
10407	07/25/2013 07:06 AM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/TNRS_results.ppt: added annotations explaining the import steps
10406	07/25/2013 06:36 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/TNRS_results.ppt, containing the *.png screenshots with tables labeled
10405	07/25/2013 06:35 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/*.png, screenshots of the TNRS_results.xls tabs (LibreOffice does not preserve the formatting when pasting a spreadsheet to a PowerPoint as a table, and the table editing options are limited)
10404	07/25/2013 06:31 AM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/TNRS_results.xls with formatted versions of the *.csv tables
10403	07/24/2013 05:15 PM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): subset the columns to include only the most important to demo how the data is represented
10402	07/24/2013 05:13 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: mk_select(): support passing $cols as array instead of SQL string, which is easier to enter in a shell script (fewer quotes, \ , etc.)
10401	07/24/2013 05:12 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: added cols2list()
10400	07/24/2013 05:10 PM	Aaron Marcuse-Kubitza	lib/sh/util.sh: added is_array()
10399	07/24/2013 04:38 PM	Aaron Marcuse-Kubitza	inputs/test_taxonomic_names/_scrub/run: exports/make(): allow specifying an explicit columns list for each table using cols=... (initially set to all columns)
10398	07/24/2013 04:09 PM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/*.csv exports
10397	07/24/2013 04:09 PM	Aaron Marcuse-Kubitza	added inputs/test_taxonomic_names/_scrub/run, which exports the test_scrub-populated tables to CSV
10396	07/24/2013 04:08 PM	Aaron Marcuse-Kubitza	lib/sh/db_make.sh: added pg_export_table_to_dir(), pg_export_tables_to_dir(). unlike db.sh pg_export_table_to_dir_no_header(), these functions are make-aware and will not clobber an existing file.
10395	07/24/2013 03:15 PM	Aaron Marcuse-Kubitza	reran inputs/test_taxonomic_names/test_scrub, which generates the public.test_taxonomic_names sample schema
10394	07/24/2013 01:50 PM	Aaron Marcuse-Kubitza	inputs/CTFS/Plot/map.csv: DescriptionOfSite: remapped to locationRemarks, not locality
10393	07/24/2013 01:38 PM	Aaron Marcuse-Kubitza	inputs/CTFS/AggregateObservation/: translated multi-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10392	07/24/2013 01:24 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: geoscrub_input_new: updated for VegCore-renamed geoscrub_output column names
10391	07/24/2013 01:09 PM	Aaron Marcuse-Kubitza	schemas/util.sql: added ?>= operator with is_more_complete_than() function
10390	07/24/2013 12:44 PM	Aaron Marcuse-Kubitza	inputs/.geoscrub/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10389	07/24/2013 12:15 PM	Aaron Marcuse-Kubitza	inputs/.geoscrub/geoscrub_output/: translated single-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10388	07/24/2013 11:18 AM	Aaron Marcuse-Kubitza	schemas/util.sql: SQL-language IMMUTABLE functions marked STRICT: removed STRICT to enable dynamic inlining, which speeds up the function up to 7x. STRICT was not removed where the function was particularly complex and the STRICT optimization would likely be more significant than inlining.
10387	07/24/2013 11:07 AM	Aaron Marcuse-Kubitza	bugfix: inputs/BRIT/specimen_flat/postprocess.sql: diameterBreastHeight_cm, height_m: use newly NULL-mapped versions of columns instead of the *_verbatim columns
10386	07/24/2013 11:04 AM	Aaron Marcuse-Kubitza	inputs/BRIT/: switched to new-style import, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource
10385	07/24/2013 10:49 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: translated multi-column filters with _join() to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10384	07/24/2013 10:43 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/map.csv: Habitat_Summary: remapped to UNUSED
10383	07/24/2013 10:16 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/postprocess.sql: diameterBreastHeight_cm, height_m: updated runtimes
10382	07/24/2013 10:15 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: DBH_, Height_: mapped NULL-equivalent values, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10381	07/24/2013 09:27 AM	Aaron Marcuse-Kubitza	inputs/.../: translated multi-column filters with _avg() to postprocessing derived columns, using the steps at wiki.vegpath.org/Adding_new-style_import_to_a_datasource#Translating-filters-to-postprocessing-derived-columns
10380	07/24/2013 08:18 AM	Aaron Marcuse-Kubitza	inputs/BRIT/specimen_flat/: translated single-column filters to postprocessing derived columns, using the steps at wiki.vegpath.org/Switching_to_new-style_import#stage-I-source-specific > "translate single-column filters to postprocessing derived columns"
10379	07/20/2013 05:25 AM	Aaron Marcuse-Kubitza	/README.TXT: Maintenance: added instructions for what to do if http://vegbiendev.nceas.ucsb.edu/phppgadmin/ goes down (sometimes displaying a Not found error)
10378	07/20/2013 05:21 AM	Aaron Marcuse-Kubitza	schemas/util.sql: schema comment: added note that IMMUTABLE SQL-language functions should never be declared STRICT, because this prevents them from being inlined. inlining can create a significant speed improvement (7x+), by avoiding function calls and enabling additional constant folding.
10377	07/20/2013 05:09 AM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: documented total runtime (7.5 min on vegbiendev)
10376	07/20/2013 05:07 AM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: updated runtimes for map_nulls() inlining, which created a speed improvement of 7x for the numeric columns and 2.5x for the text columns (292563.362->41929.772 ms and 83640.424->35690.797 ms, respectively). note that the map_nulls__coord__*() calls could be optimized further by combining the successive map_nulls() calls into one, with the hstores merged.
10375	07/20/2013 04:37 AM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): documented that inputs/REMIB/Specimen/postprocess.sql > country also shows that inlining is now happening properly. note that the speed improvement due to inlining is not as much, %wise, when the values util._map() is run on are long strings instead of the short strings used in the initial profiling. this is because a greater % of the time is spent in system functions such as hstore>text, which are not affected by the inlining because they are run either way.
10374	07/20/2013 04:18 AM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): use new nulls_map(). proper inlining (i.e. same runtime before and after change) has been verified with the following profiling query: SELECT util.map_nulls(array[1, 2, 3]::text[], v) FROM unnest(array_fill(1, array[100000])) f (v)
10373	07/20/2013 04:05 AM	Aaron Marcuse-Kubitza	schemas/util.sql: added nulls_map(), for use with _map()
10372	07/20/2013 03:39 AM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: postprocess(): added remake action that calls trim_table()
10371	07/20/2013 03:37 AM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: added trim_table(), which calls util.trim(regclass, regclass)
10370	07/20/2013 03:23 AM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: map_table(): added remake action that calls reset_col_names()
10369	07/20/2013 03:21 AM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: added reset_col_names(), which calls util.reset_col_names()
10368	07/20/2013 03:19 AM	Aaron Marcuse-Kubitza	bugfix: lib/runscripts/table.run: map_table(): moved $map_table to global var so it can be used by other functions
10367	07/20/2013 03:09 AM	Aaron Marcuse-Kubitza	bugfix: lib/runscripts/table.run: postprocess(): don't propagate $remake to remake_VegBIEN_mappings(), since this will cause map.csv to be remade, which is not related to the postprocessing.
10366	07/20/2013 03:08 AM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: map_table(): util.set_col_names_with_metadata(): removed unnecessary cast to regclass, which is performed implicitly. this used to be needed when the polymorphic util.rename_cols() was used instead.
10365	07/20/2013 02:57 AM	Aaron Marcuse-Kubitza	schemas/util.sql: added trim(), which trims a table to include only original columns, as defined by a map table
10364	07/20/2013 02:53 AM	Aaron Marcuse-Kubitza	schemas/util.sql: added derived_cols(), which gets table_'s derived columns (all the columns not in the names table)
10363	07/20/2013 02:29 AM	Aaron Marcuse-Kubitza	schemas/util.sql: added eval2set()
10362	07/20/2013 02:14 AM	Aaron Marcuse-Kubitza	schemas/util.sql: added drop_column()
10361	07/20/2013 01:27 AM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: map_nulls__*(): turned off STRICT to allow dynamic inlining, which speeds up the mk_derived_col() statements by 5x (342799.823 ms -> 71533.252 ms (6 min -> 1 min) for latitude_sec)
10360	07/19/2013 07:23 PM	Aaron Marcuse-Kubitza	inputs/REMIB/Specimen/postprocess.sql: runtimes: updated for vegbiendev, before dynamic inlining. the times are about twice as fast as on starscream, so vegbiendev is faster at whatever is the limiting speed factor (probably not CPU, based on other benchmarks).
10359	07/19/2013 07:05 PM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): documented that due to dynamic inlining, this is just as fast as util._map() which it wraps. dynamic inlining now brings altogether a 40x speed improvement to map_nulls() (4000 ms -> 100 ms), and would likely bring a comparable improvement for other functions that are run repeatedly and call other user-defined functions.
10358	07/19/2013 06:35 PM	Aaron Marcuse-Kubitza	bugfix: schemas/util.sql: map_nulls(): updated to use hstore(text[], anyelement), which has replaced hstore(anyarray, anyelement)
10357	07/19/2013 06:30 PM	Aaron Marcuse-Kubitza	schemas/util.sql: removed hstore(anyarray, anyelement), which did not support dynamic inlining, to avoid confusion over which hstore() function to use. use new hstore(text[], anyelement) instead (with explicit cast on the keys array if needed).
10356	07/19/2013 06:23 PM	Aaron Marcuse-Kubitza	schemas/util.sql: added hstore(text[], anyelement), which dynamically inlines properly, unlike hstore(anyarray, anyelement). this can be selected by explicitly casting the keys array to text[], which now provides a 6x speed improvement (380 ms -> 60 ms) for map_nulls().
10355	07/19/2013 05:31 PM	Aaron Marcuse-Kubitza	schemas/util.sql: fix_array(): turned off STRICT to allow dynamic inlining, which speeds up util.map_nulls() by 3x (1500 ms -> 500 ms)
10354	07/19/2013 05:15 PM	Aaron Marcuse-Kubitza	schemas/util.sql: array_length(anyarray), array_length(anyarray, dimension integer): turned off STRICT to allow dynamic inlining, which speeds up util.map_nulls(). this requires adding a `CASE WHEN $1 IS NULL THEN NULL` statement to array_length(anyarray, dimension integer) to replace the functionality provided by STRICT.
10353	07/19/2013 04:41 PM	Aaron Marcuse-Kubitza	schemas/util.sql: map_nulls(): turned off STRICT to allow dynamic inlining, which causes a 2x speed improvement¹. (see r10352 for an explanation of dynamic inlining.) note that turning off STRICT disables NULL-skipping (avoiding running a function when all its params are NULL), so it should only be used when the NULL-skipping optimization is needed less than dynamic inlining....
10352	07/19/2013 04:23 PM	Aaron Marcuse-Kubitza	schemas/util.sql: inlinable IMMUTABLE functions: avoid using config params (e.g. `SET search_path TO util`) because these prevent dynamic inlining (i.e. inlining of a function call with variable instead of constant arguments, by substituting the arguments into the function's body). dynamic inlining can speed up function evaluation significantly, because a (slow) call to a user-defined SQL function is avoided.
10351	07/19/2013 04:15 PM	Aaron Marcuse-Kubitza	schemas/vegbien.my.sql: updated for new bin/repl text mode matching, which also affects non-regexps. this causes the replacement of a few more occurrences of PostgreSQL-only one-word typenames with their MySQL equivalents.
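
note on r10430: the != pitfall described there can be reproduced with a small sketch. this is not code from the repository. the table and column names are hypothetical, and it uses Python's sqlite3 module rather than PostgreSQL, so the NULL-safe inequality is spelled IS NOT here (PostgreSQL spells it IS DISTINCT FROM, the negation of IS NOT DISTINCT FROM). the point is only that != evaluates to NULL when either side is NULL, so a mismatch check written with != silently skips exactly the rows where a value is stored under just one of the two terms.

```python
import sqlite3


def count_mismatches(rows):
    """Return (naive, null_safe) counts of rows whose two columns differ.

    The table and column names are made up for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE specimen (record_number TEXT, field_number TEXT)")
    conn.executemany("INSERT INTO specimen VALUES (?, ?)", rows)

    # != yields NULL (not true) when either side is NULL,
    # so rows with a value under only one term are not counted
    naive = conn.execute(
        "SELECT count(*) FROM specimen WHERE record_number != field_number"
    ).fetchone()[0]

    # SQLite's IS NOT treats NULL as an ordinary comparable value,
    # like PostgreSQL's IS DISTINCT FROM
    null_safe = conn.execute(
        "SELECT count(*) FROM specimen WHERE record_number IS NOT field_number"
    ).fetchone()[0]

    conn.close()
    return naive, null_safe


# one identical pair, two rows with a value under only one term, one empty row
rows = [("101", "101"), ("102", None), (None, "103"), (None, None)]
naive, null_safe = count_mismatches(rows)
print(naive, null_safe)
```

with this sample data the != query finds no mismatches, which would suggest the columns are identical, while the NULL-safe query reports the two rows where only one of the columns is populated.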
