web/**/.htaccess: internal redirects: Use ?&$0 instead of just ?$0 to prevent the query string from being reinterpreted as a dotpath by the destination dir
web/VegBIEN/Redmine/*/.htaccess: Use relative URL paths. This requires adding a RewriteBase directive because these dirs can be reached by symlinks.
inputs/SALVIAS/.htaccess: Removed test RedirectMatch directive
web/**/.htaccess: RewriteOptions: Redirect dir paths without the trailing / to the datasource's homepage, and let dir paths with the trailing / point to the directory index. This is the opposite of the way it was before, and is more intuitive because the trailing / indicates a directory listing, while the trailing / would generally be omitted when linking to the datasource itself.
web/**/.htaccess: handle DirectoryIndex subrequests, which append "index" to the dir
web/**/.htaccess: RewriteOptions: Removed inherit because if the destination is not found in the current dir, a 404 should be returned instead of trying to invoke the parent dir's fallback handler (which applies only to paths in the parent dir). Ideally, we would use InheritBefore to get the subdomain preprocessing, etc., but the parent dir's fallback handler would still need to be turned off somehow when the rule was being run by a subdir.
web/.htaccess: Only forward specific terms (subpaths) to VegCore, not also the dir index itself
web/**/.htaccess: RewriteOptions: Added AllowNoSlash (available in Apache 2.4) so dir paths without the trailing / can be parsed by mod_rewrite
web/.htaccess: Turned off DirectorySlash so that the dir without the trailing / can have a separate meaning (e.g. as the home page of a datasource, rather than a listing of the short URLs available for it)
web/main.conf: <Directory .>: Added `Require all granted`, which is needed by Apache 2.4 to prevent 403 Forbidden errors. This change breaks compatibility with older versions of Apache, but is unfortunately required by 2.4.
web/**/.htaccess: parse dotpath in the query string: Use just dotpath instead of dotpath.php because the extension is added automatically by MultiViews
web/**/.htaccess: parse dotpath in the query string: Remove any index.* suffix, which is added by MultiViews in Apache 2.4
web/main.conf: Updated the path of the <Directory> directive for the DocumentRoot
web/.htaccess: Set Options +Indexes because this isn't the default in Apache 2.4
web/.htaccess: DirectoryIndex: Use index instead of hardcoding index.php because MultiViews will now add the appropriate extension automatically
web/.htaccess: Turn on MultiViews to allow auto-adding of the file extension
inputs/**/.htaccess: Use a direct filesystem path rather than going through /svn (the Redmine web interface) because this loads much faster, and avoids needing to manually specify svn-web instead of svn for files that should be displayed in the browser instead of downloaded. It also adds support for files not in svn. Note that the use of a relative RewriteRule replacement requires RewriteBase when the directory is not under the document root, in order to construct the correct self-referential URL (http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritebase).
inputs/**/.htaccess: Removed no longer needed trailing / in paths
inputs/CTFS/*/.htaccess: Use new direct shortcuts to /Redmine subtrees
Added wiki shortcut to Redmine/wiki
web/.htaccess: use separate lowercase version when available: Support paths without a trailing /
web/.htaccess: use separate lowercase version when available: Fixed bug where need to use [last] flag, in order to rewrite in the new subdir instead of applying the rest of this dir's rules (which would forward to VegCore)
web/: datasources indexed in VegPath: Link to the corresponding inputs/ dir instead, which now stores the data and the short URLs in one place
web/.htaccess: Use %{REQUEST_URI} ("The path component of the requested URI" <http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritecond>) instead of /$0, in order to preserve whether or not the path had a trailing slash. This also ensures that the rule functions correctly if it is invoked in a subdir context, which may occur for the "remove www subdomain" rule.
web/.htaccess: Removed "don't redirect subdir paths" because this is now handled by "don't rewrite existing paths" as described in r8375
web/: datasources indexed in VegPath: Added .htaccess files to corresponding inputs/ dir so the short URLs can also be accessed via VegBIEN's own dir for the datasource
inputs/input.Makefile: Removed .PRECIOUS from %/header.csv, %/map.csv so that these scripts are deleted on error. This is useful for the runscripts and for non-data dirs whose header.csv cannot be made.
inputs/input.Makefile: SVN: Only run %/add on subdirs with visible (non-hidden) files. Subdirs with only hidden files (e.g. .htaccess) are assumed to be non-data dirs.
inputs/input.Makefile: Staging tables installation: %/install: $(logInstall): Only use log file if log dir exists, to support non-data dirs
inputs/input.Makefile: Staging tables installation: %/install: ignore errors in $(exportHeader) and $(cleanup) if the table does not exist (i.e. a non-data dir)
inputs/input.Makefile: Staging tables installation: %/install: Don't run $(import_install_) for empty dirs because there is no data to import
lib/sql.py: parse_exception(): typed_name_re: Fixed bug where need to require the "" around the table/column name, because otherwise, the regexp will try to match as few characters as possible, causing it to match only the first letter of the name
web/**/.htaccess: Removed "handle DirectoryIndex" rule, which does not appear to be needed with the new dotpath format
web/VegBank/.htaccess: RewriteRule: Fixed bug where need to match tables by themselves as well as followed by columns
web/**/.htaccess: Added "don't rewrite existing paths" and "handle DirectoryIndex" rules to all .htaccess files, not just those for dirs that contain subdirs, to facilitate linking directly to files in the filesystem without first needing to retrofit the applicable .htaccess file.
web/**/.htaccess: Restrict redirection only of fully-existing paths, rather than of any path whose subdir exists, in order to avoid having to hardcode the absolute path to the directory containing the .htaccess file. This more restrictive approach still works because if the subdir exists and contains an .htaccess file, this (outer) .htaccess file's RewriteRules will be ignored completely until the inner .htaccess file's rules have been run, so that any subdir redirects will happen first. This "hole-punch" effect causes an existing filesystem path to always take priority over a virtual path (i.e. a redirect) defined in an outer .htaccess file.
Added web/datasources/ symlink to inputs/
web/servers/vegbiendev/fs: Updated symlink for new web/ dir nesting level
Deleted no longer used www/
Renamed www/ back to web/
Removed www/logs/ because it was actually successfully moved into main/ (but some of its files weren't)
Added www/logs/
web/: Moved auxiliary files into the main/ subdir in preparation for having just the web/ dir. Renamed web/ to www/ so it can be replaced with web/main/.
web/main/servers/vegbiendev/: Split it into db and fs forks, with db being the default. The fs fork links directly to /home/bien/svn on vegbiendev, which makes world-readable files directly web-accessible. (Permissions on /home/bien/svn and subdirs have been checked to ensure that private files are not world-readable.)
Added web/main/IH/db/
web/main/index.php: Updated fragment redirect for new dotpath format (using ? instead of a relative path)
web/main/index.php: Updated path templates for new dotpath format (using ? instead of /)
web/main/**/.htaccess: Support dotpaths in the query string instead of in the path, so that non-dotpath paths don't need to be suffixed with / to prevent their filenames from being interpreted as dotpaths. Putting dotpaths in the query string still requires only one character between the host and the path, but it's ? instead of / . ? is in many ways more natural, because the dotpath is a non-filesystem string to be parsed rather than something that's already a filesystem path. This change also avoids the need to strip trailing /s in many RewriteRules, because the dotpath mechanism is no longer appending them.
Added web/main/dotpath.php, which parses any dotpath in the query string
web/main/util.php: Added coalesce()
web/main/svn*: Fixed symlinks to use .redmine instead of Redmine
web/main/IH/.htaccess: Fixed whitespace
web/main/IH/.htaccess: RewriteRule: Fixed bug where need to \-escape % because it's a special character in the replacement string (it references a regexp group matched by the last RewriteCond)
Removed web/main/Redmine symlink so that it isn't listed in the directory listing as a visible (non-hidden) file. (There is already a .redmine symlink which matches all case variations via web/main/.htaccess logic.)
Added web/main/svn* symlinks to VegBIEN/Redmine/svn*
web/main/svn*/: Removed in preparation for replacing with symlinks, which will work now that absolute paths are used where needed
web/main/**/.htaccess: internal redirects using relative paths with .. : Use absolute paths instead so that when the directory is reached through a symlink, the redirect will still work. Note that relative paths without .. do not need to be absolute paths because the subtree structure is the same (just the parent dirs are different).
schemas/VegCore/VegCore.ERD.mwb: Regenerated exports
schemas/VegCore/VegCore.ERD.mwb legend images: resized to medium-sized squares because apparently MySQL Workbench can't handle rectangular or very large images (it distorts them or resets them to their original size when the diagram is reloaded)
schemas/VegCore/VegCore.ERD.mwb: Legend: Added IS-A, HAS-A, and "inherits from record" entries
Added lib/MySQL_Workbench/connector.png, solid_line.png, dotted_line.png
schemas/VegCore/VegCore.ERD.mwb: Fixed lines
schemas/VegCore/VegCore.ERD.mwb: Renamed placename to named_place to match the table name
schemas/VegCore/VegCore.ERD.mwb: Renamed measurement to trait because this is the more commonly used name for the entity
schemas/VegCore/VegCore.ERD.mwb: Added VegCore logo
Added schemas/VegCore/VegCore.logo.svg
schemas/VegCore/VegCore.ERD.mwb: taxon: Added recursive parent fkey for (optionally) storing taxon hierarchies
schemas/VegCore/VegCore.ERD.mwb: Repositioned taxon_determination
schemas/VegCore/VegCore.ERD.mwb: Moved taxon next to qualified_taxon instead of above it, because inheritance (IS-A) is shown vertically while HAS-A is shown horizontally
schemas/VegCore/VegCore.ERD.mwb: Populated legend
schemas/VegCore/VegCore.ERD.mwb: Added exports
schemas/VegCore/VegCore.ERD.mwb: Fixed lines and settings for the Linux MySQL Workbench
schemas/VegCore/VegCore.ERD.mwb: Added table colors
Removed backup file schemas/VegCore/VegCore.ERD.mwb.bak
Added schemas/VegCore/VegCore.ERD.mwb, VegCore.my.sql with first VegCore ERD and MySQL schema. All tables are in the ERD, but contain only pkey and fkey columns.
lib/sql.py: mk_select(): using subset function: Turn off enable_sort (within the transaction) to avoid unwanted slow sorts. This change (along with the subset functions themselves) should significantly reduce the long FIA.occurrence_all table subset time (~8 hours altogether) and with it the total import time, which had more than doubled as a result of the FIA refresh. Note that this issue would have been even more pronounced for larger datasets, such as the GBIF refresh, which would have taken ~2.5 days longer (400 million rows * ~30% are plants * (FIA: ~8 hours/16.7 million rows) * 1 day/24 hours).
lib/sql.py: mk_select(): Use subset function when it's available for fast querying at large OFFSET values
lib/sql.py: Added has_subset_func()
inputs/FIA/occurrence_all/import: Run mk_subset_by_row_num_func() to make the subset functions available for fast querying at large OFFSET values
schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where need to add 1 to the 0-based offset_ to get the 1-based row_num (which is usually a serial column)
schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where need to subtract 1 from the end row_num because BETWEEN limits are inclusive of the bounds
schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where also need to COALESCE offset_ to 0 when it's added to the limit_
schemas/util.sql: mk_subset_by_row_num_func(): subset function which turns off enable_sort: Fixed bug where need to pass ($2, $3) instead of ($1, $2) to the regular subset function
inputs/FIA/occurrence_all/import: Added occurrence_all-row_num column for use with mk_subset_by_row_num_func()
schemas/util.sql: mk_subset_by_row_num_func(): Also create subset function which turns off enable_sort. This is used for limit values greater than ~100,000 to avoid unwanted slow sorts. The regular subset function is still needed to work with EXPLAIN, so that it produces expanded output instead of just a function scan.
schemas/util.sql: Added mk_subset_by_row_num_func()
schemas/util.sql: Added type_qual_name()
schemas/util.sql: force_update_view(): Fixed bug where also need to drop view for "cannot change name of view column" errors
inputs/FIA/occurrence_all/import: Use new force_update_view(), which only drops the view if its columns have changed and otherwise just uses CREATE OR REPLACE VIEW, rather than always first running DROP VIEW IF EXISTS
schemas/util.sql: Added force_update_view()
bin/make_analytical_db: Commented out export_analytical_db because we are not yet using the analytical DB in MySQL, and it doesn't make sense to generate a large, unused CSV export each time
bin/export_analytical_db: Replaced analytical_aggregate with analytical_stem
inputs/FIA/occurrence_all/: Updated header.csv for new column order
inputs/FIA/occurrence_all/import: Use directional joins (LEFT/RIGHT JOIN) instead of inner joins to ensure that the PostgreSQL query planner always joins starting with the TREE table. Note that the directional joins are now needed for a different reason than when they were initially added, which had been to avoid slow sorts. The sorts (at least for LIMIT-only queries) went away when small tables such as COUNTY and REF_UNIT were added to the joins.
inputs/FIA/*/map.csv: Changed newlines between table and field name to - because the newlines mess up the flow of queries and also break pgAdmin's display of EXPLAIN output. The - was chosen because it's a non-whitespace character that linewraps in browsers, phpPgAdmin, and Google spreadsheets (although unfortunately not in pgAdmin). It is better than space because you can set a text editor to treat it as a word character, allowing the entire column name (<table>-<field>) to be selected by double-clicking it.
Added planning/workflow/normalized_vs_denormalized/denormalized.generic_standardizations.png (a slide from Brad's bien3_architecture_denormalized.pptx PowerPoint), which shows the staging table preprocessing particularly well
README.TXT: Full database import: record the import times in inputs/import.stats.xls: Added instructions for what to do if the rightmost imports start getting truncated due to the 255-column limit in spreadsheets. (This will occur in 8 imports.)
inputs/import.stats.xls: Removed the previous imports from the current tab because they are also in the 2012-6~9 tab, and should not be in two places
inputs/import.stats.xls: Updated import times. MO and FIA have been refreshed.
Removed no longer needed inputs/GBIF/import. Use ./run instead.