# Date Author Comment
8117 03/20/2013 03:10 PM Aaron Marcuse-Kubitza

Added planning/workflow/(de)normalized_import.mappings.png

8116 03/20/2013 03:04 PM Aaron Marcuse-Kubitza

Added planning/workflow/denormalized_import.png, normalized_import.png

8115 03/20/2013 10:37 AM Aaron Marcuse-Kubitza

web/main/IH/: Added lowercase alias

8114 03/20/2013 10:32 AM Aaron Marcuse-Kubitza

Added web/main/IH/

8113 03/20/2013 10:12 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: Staging tables installation: Added postprocess target, which runs all the postprocess.sql files

8112 03/20/2013 09:34 AM Aaron Marcuse-Kubitza

inputs/FIA/REF_SPECIES/postprocess.sql: Cast ID column to integer

8111 03/20/2013 08:52 AM Aaron Marcuse-Kubitza

inputs/FIA/*/postprocess.sql: Cluster tables by their *.unique index for faster joins

8110 03/20/2013 08:51 AM Aaron Marcuse-Kubitza

inputs/FIA/*/postprocess.sql: Cast ID columns to integer using new functions.set_col_types()

8109 03/20/2013 08:49 AM Aaron Marcuse-Kubitza

bin/psql_verbose_vegbien: Run with client_min_messages = NOTICE to display notices for debugging. This is supposed to be the default, but apparently isn't.

8108 03/20/2013 08:47 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: BIEN commands: $(psqlAsBien): Use psql_verbose_vegbien instead of psql_script_vegbien so that timings and notices are displayed, which is useful for profiling and debugging

8107 03/20/2013 08:32 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added col_cast and set_col_types()

8106 03/20/2013 07:45 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added col_ref, col_type()

8105 03/20/2013 06:51 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added cluster_once()

8104 03/20/2013 06:36 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added cluster_index()

8103 03/20/2013 05:55 AM Aaron Marcuse-Kubitza

schemas/functions.sql: create_if_not_exists(): Also handle duplicate_column exceptions

8102 03/20/2013 05:54 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added rename_if_exists()

8101 03/20/2013 05:48 AM Aaron Marcuse-Kubitza

inputs/FIA/COND/postprocess.sql: Renamed oldgrowth to COND.oldgrowth so it wouldn't be renamed by to_global_col_names()

8100 03/20/2013 04:28 AM Aaron Marcuse-Kubitza

inputs/FIA/COND/postprocess.sql: Added oldgrowth column as part of the postprocessing instead of as part of the view that left joins the core tables together. This avoids needing to regenerate the oldgrowth field whenever the view is queried or materialized.

8099 03/20/2013 04:01 AM Aaron Marcuse-Kubitza

inputs/FIA/TREE/postprocess.sql: Added index on columns that join to parent tables

8098 03/20/2013 03:00 AM Aaron Marcuse-Kubitza

inputs/FIA/*/postprocess.sql: Removed table prefix from globally-unique columns that should be joined on

8097 03/20/2013 02:25 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Marked STRICT functions as such

8096 03/20/2013 02:22 AM Aaron Marcuse-Kubitza

schemas/functions.sql: col_global_names(): Treat any column name that contains . as already being globally unique, and don't prepend the table name. This allows renaming the table columns after running col_global_names(), without causing the table name to be re-prepended the next time col_global_names() is run.
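The naming rule in r8096 can be sketched in Python as follows. This is an illustration only, not the code in schemas/functions.sql: the helper name `global_col_name` is hypothetical, and the `table.column` form of a global name is an assumption based on the COND.oldgrowth rename in r8101.

```python
def global_col_name(table, col):
    """Return a globally unique name for col (hypothetical helper).

    Assumption: a global name has the form "<table>.<col>".
    """
    # Any name that already contains "." is treated as globally unique,
    # so it is returned unchanged and the table name is not re-prepended.
    if '.' in col:
        return col
    return table + '.' + col


once = global_col_name('COND', 'oldgrowth')
twice = global_col_name('COND', once)  # second run leaves the name alone
```

Because a dotted name is never modified, a column that was renamed by hand after the first run keeps its new name on later runs.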

8095 03/20/2013 02:09 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added contains()

8094 03/20/2013 02:07 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added create_if_not_exists()

8093 03/20/2013 01:28 AM Aaron Marcuse-Kubitza

inputs/FIA/*/postprocess.sql: Use functions.to_global_col_names() to ensure that all column names are globally unique. This makes it easy to join the tables together without worrying about column name collisions.

8092 03/20/2013 01:15 AM Aaron Marcuse-Kubitza

inputs/FIA/*/postprocess.sql: Use new functions.create_if_not_exists() to allow re-running postprocess.sql idempotently
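The idempotent pattern behind create_if_not_exists() might look roughly like this. The sketch swaps in Python with an in-memory SQLite database for the project's PostgreSQL function, so the error matching is SQLite-specific and the table, column, and index names are made up for illustration.

```python
import sqlite3


def create_if_not_exists(conn, sql):
    """Run a CREATE/ALTER statement, ignoring "already exists" errors.

    Hypothetical Python/SQLite stand-in, not the function in
    schemas/functions.sql. Returns True if the statement created
    something, False if the object was already there.
    """
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError as e:
        msg = str(e)
        # SQLite signals duplicates only through the message text.
        if 'already exists' in msg or 'duplicate column name' in msg:
            return False
        raise  # any other error is a real failure


conn = sqlite3.connect(':memory:')
statements = [
    'CREATE TABLE t (a TEXT)',
    'ALTER TABLE t ADD COLUMN b INTEGER',
    'CREATE INDEX t_a_idx ON t (a)',
]
first = [create_if_not_exists(conn, s) for s in statements]
second = [create_if_not_exists(conn, s) for s in statements]  # re-run
```

The second pass changes nothing and raises nothing, which is the property that lets a script be re-run safely.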

8091 03/19/2013 11:48 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: Staging tables installation: %/install: Use new %.sql/run to run postprocess.sql

8090 03/19/2013 11:47 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: Staging tables installation: Added %.sql/run to run postprocess.sql, etc. separately from the install targets they are a part of

8089 03/19/2013 11:47 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: Staging tables installation: Added %.sql/run to run postprocess.sql, etc. separately from the install targets they are a part of

8088 03/19/2013 10:43 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added to_global_col_names()

8087 03/19/2013 10:22 PM Aaron Marcuse-Kubitza

schemas/functions.sql: col_global_names(): Use new functions.ensure_prefix() to only add the table name prefix if it doesn't already exist. This makes the function idempotent.

8086 03/19/2013 10:19 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added ensure_prefix()

8085 03/19/2013 10:17 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added has_prefix()

8084 03/19/2013 10:09 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added col_global_names()

8083 03/19/2013 09:59 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added name(regtype)

8082 03/19/2013 09:43 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added col_names()

8081 03/19/2013 09:27 PM Aaron Marcuse-Kubitza

root Makefile: Installation: Fixed bug where schemas/public/install needs to be run separately, because schemas/install installs only the util schemas

8080 03/19/2013 09:26 PM Aaron Marcuse-Kubitza

root Makefile: Installation: install util schemas (temp functions py_functions) before inputs, so that inputs can use util functions in their postprocess.sql or create.sql scripts. (However, they must not use util functions in views or index functions, because these would be cascadingly deleted whenever the util schemas are reinstalled before an import.)

8079 03/19/2013 08:07 PM Aaron Marcuse-Kubitza

README.TXT: Single datasource import: Added by_col=1 to all commands

8078 03/19/2013 02:28 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: locationRemarks: Remapped to locationnarrative because location.notespublic is a boolean field

8077 03/19/2013 02:05 AM Aaron Marcuse-Kubitza

lib/sql_io.py: mk_errors_table(): Create a unique index on the MD5 of the value and error instead of on the values directly, because some strings are too long to index (e.g. row 2537268 of MO.Specimen causes an error "index row size 3032 exceeds maximum 2712 for index [...] Values larger than 1/3 of a buffer page cannot be indexed")
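The idea of keying uniqueness on a hash rather than on the raw strings can be sketched in Python. This is not the code in lib/sql_io.py: `error_key` and `record_error` are hypothetical names, and an in-memory set stands in for the unique index.

```python
import hashlib


def error_key(value, error):
    """Fixed-length key for a (value, error) pair (hypothetical helper)."""
    # Length-prefix the value so that distinct pairs cannot concatenate
    # to the same string (e.g. ('ab', 'c') vs. ('a', 'bc')).
    data = '%d:%s%s' % (len(value), value, error)
    return hashlib.md5(data.encode('utf-8')).hexdigest()


seen = set()  # stands in for the unique index


def record_error(value, error):
    """Return True if the pair is new, False if it is a duplicate."""
    key = error_key(value, error)
    if key in seen:
        return False
    seen.add(key)
    return True
```

The key is always 32 hex characters, however long the value is, so it stays well under any per-entry index size limit.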

8076 03/19/2013 12:49 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

8075 03/16/2013 02:16 PM Aaron Marcuse-Kubitza

bin/map: No mappings warning: Added explanation that this could also be due to no column name matches, and hint to check if you are importing the correct input table

8074 03/16/2013 01:45 PM Aaron Marcuse-Kubitza

inputs/MO/: Renamed Specimen.2/ to the now-available Specimen/

8073 03/16/2013 01:42 PM Aaron Marcuse-Kubitza

inputs/MO/: Removed old import in Specimen/

8072 03/16/2013 01:33 PM Aaron Marcuse-Kubitza

Refreshed MO

8071 03/16/2013 12:44 PM Aaron Marcuse-Kubitza

csvs.py: TsvReader.next(): Fixed bug where empty line needs to be separately returned as [], because csv.reader would interpret it as EOF since the line ending has already been removed
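The distinction this fix depends on can be shown with a minimal Python sketch. It is not the TsvReader code: `read_rows` is a hypothetical function that splits on tabs directly instead of using csv.reader.

```python
import io


def read_rows(stream):
    """Yield one list of fields per line (hypothetical sketch)."""
    line = stream.readline()
    while line != '':  # readline() returns '' only at true EOF
        # Test for EOF before stripping: once the line ending is gone,
        # a blank line looks exactly like EOF.
        line = line.rstrip('\r\n')
        # A blank line is returned as an empty row, not treated as EOF.
        yield line.split('\t') if line else []
        line = stream.readline()


rows = list(read_rows(io.StringIO('a\tb\n\nc\n')))
```

Here the blank second line comes back as `[]` and reading continues to the third line.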

8070 03/16/2013 12:25 PM Aaron Marcuse-Kubitza

csvs.py: sniff(): TSVs: Turn off quoting because TSVs use \-escapes instead of quotes to escape delimiters, newlines, etc.
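With Python's standard csv module, a dialect of this kind can be approximated as below. The sample lines are made up, and the settings that sniff() actually returns in csvs.py may differ.

```python
import csv

# Two sample TSV lines (illustrative data only):
#  - the first contains literal double quotes
#  - the second contains a backslash-escaped tab inside a field
lines = ['say "hi"\tx\n', 'a\\\tb\tc\n']

reader = csv.reader(
    lines,
    delimiter='\t',
    quoting=csv.QUOTE_NONE,  # quote characters are ordinary data
    escapechar='\\',         # backslash makes the next character literal
)
rows = list(reader)
```

With quoting off, the double quotes in the first line survive as data, and the escaped tab in the second line stays inside its field instead of splitting it.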

8069 03/16/2013 11:49 AM Aaron Marcuse-Kubitza

csvs.py: InputRewriter.readline(): Surround function in a try block that prints all exceptions, so that debugging information is available if an error occurs when this stream is used as input for psycopg's copy_expert() (COPY FROM)

8068 03/16/2013 06:56 AM Aaron Marcuse-Kubitza

Populated inputs/MO/import_order.txt

8067 03/16/2013 06:46 AM Aaron Marcuse-Kubitza

Refreshed SALVIAS

8066 03/16/2013 06:33 AM Aaron Marcuse-Kubitza

Added web/main/CTFS/

8065 03/16/2013 06:21 AM Aaron Marcuse-Kubitza

inputs/SALVIAS/: Regenerated salvias_*.schema.sql from the MySQL version, to take advantage of my2pg improvements. The placeholder *_index columns which take the place of MySQL's inline index definitions have now been replaced by no-op CHECK constraints, so that there are no longer lots of dummy *_index columns in the map spreadsheets.

8064 03/16/2013 05:52 AM Aaron Marcuse-Kubitza

Added web/main/Redmine/ alias to VegBIEN/Redmine/

8063 03/16/2013 05:51 AM Aaron Marcuse-Kubitza

Added web/main/VegBIEN/Redmine/

8062 03/16/2013 05:48 AM Aaron Marcuse-Kubitza

web/main/VegBIEN/.htaccess: Forward to new db/ subdir

8061 03/16/2013 05:47 AM Aaron Marcuse-Kubitza

Added web/main/VegBIEN/db/

8060 03/16/2013 05:45 AM Aaron Marcuse-Kubitza

web/main/**/.htaccess: Removed RewriteCond -l tests because one of the -d or -f tests will always also pass, making the -l test unnecessary

8059 03/16/2013 05:38 AM Aaron Marcuse-Kubitza

web/main.conf: Added tolower RewriteMap

8058 03/16/2013 05:19 AM Aaron Marcuse-Kubitza

web/main/.htaccess: use separate lowercase version when available: Also support input strings in mixed case which is not the default capitalization, in addition to all-lowercase strings

8057 03/16/2013 05:18 AM Aaron Marcuse-Kubitza

web/main/.htaccess: use separate lowercase version when available: Generate the new dirname with a separate RewriteCond so its value can be used both in the -d test and in the replacement string, rather than separately for each

8056 03/16/2013 05:03 AM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Allow an unescaped . at the beginning of a filename, because this will never be a . separator. This adds support for hidden files in dir paths, which now won't be interpreted as dotpaths. However, regular files with extensions still need to have the filename escaped because it will otherwise be interpreted as a dotpath.

8055 03/16/2013 04:53 AM Aaron Marcuse-Kubitza

web/main/.htaccess: Set Options +FollowSymLinks. It should be on by default ("All options except for MultiViews. This is the default setting." <http://httpd.apache.org/docs/2.2/mod/core.html#options>), but this makes sure it will always be enabled.

8054 03/16/2013 03:32 AM Aaron Marcuse-Kubitza

web/main/.htaccess: Name the lowercased versions of dirs with a leading . (to make them hidden) instead of a trailing _ , to avoid having each dir listed twice in a row in the dir index

8053 03/16/2013 03:21 AM Aaron Marcuse-Kubitza

Added web/main/TNRS/

8052 03/16/2013 03:15 AM Aaron Marcuse-Kubitza

Added web/main/VegBank/

8051 03/16/2013 02:57 AM Aaron Marcuse-Kubitza

Added web/main/BIEN2/

8050 03/16/2013 02:56 AM Aaron Marcuse-Kubitza

web/main/index.php: Replaced - with . in namespaces to conform to new dotpath naming convention, which allows nesting of namespaces

8049 03/16/2013 02:35 AM Aaron Marcuse-Kubitza

web/main/SALVIAS/.htaccess: Forward to new dd/ subdir

8048 03/16/2013 02:34 AM Aaron Marcuse-Kubitza

Added web/main/SALVIAS/dd/

8047 03/16/2013 02:28 AM Aaron Marcuse-Kubitza

web/main/DwC/.htaccess: Forward to new terms/ subdir

8046 03/16/2013 02:27 AM Aaron Marcuse-Kubitza

Added web/main/DwC/terms/

8045 03/16/2013 01:51 AM Aaron Marcuse-Kubitza

web/main/**/.htaccess: don't redirect subdir paths: Fixed bug where can only match non-empty string, because otherwise the rule would match this directory, which should still have its redirects processed

8044 03/16/2013 01:43 AM Aaron Marcuse-Kubitza

web/main/index.php: Added back smaller spacing between the table columns

8043 03/16/2013 01:40 AM Aaron Marcuse-Kubitza

web/main/main.css: blockquote: Removed right margin so there isn't a big space between the table columns in index.php, which results from nesting right-padded blockquotes inside one another

8042 03/16/2013 01:39 AM Aaron Marcuse-Kubitza

web/main/index.php: Changed Brad-Boyle to just Brad because people's names only have to be unique within VegPath

8041 03/16/2013 01:32 AM Aaron Marcuse-Kubitza

web/main/.htaccess: Added fallback redirect to VegCore for paths without a namespace. This can be used to link to specific VegCore terms without needing to include the VegCore namespace.

8040 03/16/2013 01:13 AM Aaron Marcuse-Kubitza

Added web/main/VegBIEN/

8039 03/16/2013 01:13 AM Aaron Marcuse-Kubitza

Added web/main/servers/vegbiendev/

8038 03/16/2013 01:13 AM Aaron Marcuse-Kubitza

Added web/main/.phpPgAdmin/

8037 03/16/2013 01:12 AM Aaron Marcuse-Kubitza

web/main/.phpMyAdmin/.htaccess: Set [redirect] flag in case the dest server is on the same machine as VegPath itself

8036 03/16/2013 12:25 AM Aaron Marcuse-Kubitza

web/main/*/ lowercase versions: Renamed with _ suffix to avoid svn conflicts on case-insensitive filesystems such as Mac HFS+

8035 03/16/2013 12:24 AM Aaron Marcuse-Kubitza

web/main/.htaccess: Support lowercase versions of mixed-case dirnames without breaking case-insensitive filesystems such as Mac HFS+

8034 03/16/2013 12:10 AM Aaron Marcuse-Kubitza

web/main/: Added lowercase symlinks for mixed-case dirs to work with subdomain translation, which uses subdomains lowercased by DNS

8033 03/16/2013 12:08 AM Aaron Marcuse-Kubitza

web/main/index.php: Use absolute URLs for dependencies to work with subdomain translation, which adds components to the URL path

8032 03/15/2013 11:30 PM Aaron Marcuse-Kubitza

web/main/**/.htaccess: Use RewriteRule instead of RedirectMatch to handle incremental redirects internally instead of issuing a (much slower) redirect to the web browser each time. This also handles edge cases better, as [last] RewriteRules can be used to control when to forward control to a subdir, and doesn't require prepending the path to the dir the .htaccess file is in. Note that this requires all gateway dirs (dirs with subdirs) to contain special RewriteRules to avoid redirecting subdir paths and handle DirectoryIndex; see web/main/DwC/.htaccess. Also note that the regexp of a catch-all RewriteRule must exactly follow the template for internal or external redirects; see web/main/SALVIAS/db/.htaccess for internal redirects and web/main/DwC/history/.htaccess for external redirects.

8031 03/15/2013 07:31 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Allow the part before the [] escape to contain [], to support labels that end in [] (like PHP array vars in the query string) and labels with a simple array-subscript syntax (a[b]). This also shortens the regexp and makes it more readable without the \[\] in [^.\[\]/] . Note that this also allows invalid combinations of [] exprs (e.g. more than one per level or unbalanced []), which will still be translated but will probably not have the desired result.

8030 03/15/2013 07:16 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Inline the [] escape regexp into the main regexp, because it is now approximately the same length as the []-matching portion of the main regexp and this greatly simplifies the code by removing the extra RewriteCond. Note that the translation rule is now a plain regexp (run repeatedly until no match), which can be used in any programming language that supports Perl-compatible regular expressions, not just mod_rewrite.
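The "plain regexp, run repeatedly until no match" approach can be illustrated in Python. The pattern below is a hypothetical, much-simplified rule with no [] escapes and is not the regexp in web/main/.htaccess. It assumes that each pass turns the head after the last / into a /-terminated level, and that a path ending in / is sealed.

```python
import re

# Hypothetical simplified rule (assumption, no [] escape support):
#   group 1: everything up to and including the last "/" before the head
#   group 2: the head, i.e. at least one character that is neither "." nor "/"
#   then:    a "." separator or the end of the path
RULE = re.compile(r'^((?:.*/)?)([^./]+)(?:\.|$)')


def translate(path):
    """Apply the rule until it no longer matches."""
    while True:
        path, n = RULE.subn(r'\1\2/', path, count=1)
        if n == 0:
            return path


result = translate('DwC.terms.foo')
```

Each pass either replaces a . with / or appends a closing /, so the loop always ends. A path that already ends in / never matches and is returned unchanged.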

8029 03/15/2013 07:06 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: discardpath explanation: Clarified that the infinite loop resulted from reappending PATH_INFO (the Apache-matched filename)

8028 03/15/2013 07:02 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Require any [] escape to have the ] at the end of the level, to simplify the [] regexp

8027 03/15/2013 06:54 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Use a lookahead assertion to ensure that at least one character is matched as the head of the dotpath. This ensures that (.*/)? + the rest of the regexp does not match a path with a trailing /, which is a sealed /-path and not subject to dotpath translation.

8026 03/15/2013 06:46 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Only support one [] escape per dot-level to (greatly) shorten the [] regexp. This does not pose a problem for encoding . because the entire level can simply be enclosed in [].

8025 03/15/2013 06:17 PM Aaron Marcuse-Kubitza

web/main/.htaccess: Uncommented ErrorDocument

8024 03/15/2013 06:15 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Removed separate sealing of the /-path, which is now performed by the main RewriteRule because it appends a / even if there is no . suffix. This does not cause an infinite loop because a character is always added (/), which prevents the previously-matched head (after the last / but before any .) from being matched again.

8023 03/15/2013 06:05 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate dotpaths: Fixed bug where it's actually the portion before the . (but after the last /) that should be subject to []-unescaping, rather than the portion after the . . Fixed bug where [] escapes were not being unescaped because the wildcard .* group matched the whole head portion instead of allowing the []-captures to match.

8022 03/15/2013 03:17 PM Aaron Marcuse-Kubitza

web/main/.htaccess: translate subdomain to path: Don't use expr RewriteConds because they are not supported by Apache 2.2. Instead issue an external redirect with the subdomain part of the hostname removed, for the purpose of changing HTTP_HOST so that the replacement is not performed again if the mod_rewrite rules are run more than once.

8021 03/15/2013 02:46 PM Aaron Marcuse-Kubitza

web/main/**/.htaccess: Use RewriteEngine, inheriting the web/main/.htaccess rules, in order to translate dotpaths that follow /-paths to existing dirs

8020 03/15/2013 02:45 PM Aaron Marcuse-Kubitza

web/main/DwC/.htaccess: Moved DwC.history redirect to web/main/DwC/history/.htaccess

8019 03/15/2013 02:40 PM Aaron Marcuse-Kubitza

web/main/SALVIAS/db/.htaccess: Removed trailing / because for DB redirects, this is apparently necessary

8018 03/15/2013 02:30 PM Aaron Marcuse-Kubitza

web/main/.phpMyAdmin/.htaccess: Prepend http:// to the dest URL stem, instead of requiring the dest URL to provide the protocol, because the two // are replaced with one / by Apache when mod_rewrite is on, creating an invalid URL