Project

General

Profile

Aggregating validations refactoring

translate to Postgres

in validation/aggregating/*/{,*/}*.sql (the VegCore-input and VegBIEN-output queries):

  1. open in jEdit
  2. remove trailing whitespace: replace text
    
    
    with
    
    
    repeatedly, until no more replacements are made
  3. replace text
    REGEXP
    
    with
    ~
    
  4. for VegCore input queries:
    1. translate column names to VegCore:
      bin/in_place validation/aggregating/.../__.VegCore.sql env text=1 bin/repl inputs/__/__/map.csv
      
  5. enclose mixed-case identifiers in "": replace regexp
    ^(?!--)(.*)(?<![:.] )(?<!^ *)(?<!")\b([a-z]*[A-Z][a-z]\w*)\b(?!")
    
    with
    $1"$2" 
    
    repeatedly, until no more replacements are made
  6. reformat comments for diffing with schemas/vegbien.sql:
    1. for each -- commented section before a query, move just the notes in it (not the query name) to after the query and enclose them in
      /*
      ...
      */
      
      1. if applicable, move Check comments all at once: replace regexp
        ^-- (Check:.*)
        ((?s).*?;$)
        
        with
        $2
        /*
        $1
        */
        

prepend CREATE VIEW

replace vars below: <observation_type> (plots, specimens, traits)

in validation/aggregating/*/{,*/}*.sql (the VegCore-input and VegBIEN-output queries):

  1. open in jEdit
  2. replace regexp
    -- (\d+\..*)((?s).*?)(?=(?i)
    SELECT|
    
    )
    
    with BeanShell snippet
    _0+"\CREATE VIEW _<observation_type>_0"+_1.toLowerCase().replaceAll("\\W+","_")+" AS" 
    
  3. in all queries except the first, replace text
    _01
    
    with
    _1
    
  4. abbreviate view names longer than 63 chars to prevent them from being arbitrarily truncated
  5. remove no longer needed -- comments containing the query name: in everything after the header comments, replace regexp
    --.*
    
    
    with nothing
  6. change /* */ comments to COMMENT ON comments
    1. replace regexp
      ^(CREATE VIEW (.*?) AS
      (?s:.*?);)
      /\*
      
      with
      $1
      COMMENT ON VIEW $2 IS '
      
    2. replace text
      */
      
      with
      ';
      
    3. escape any ' inside '...' by doubling them

prep for pipeline

these changes do not get committed, but instead get performed right before adding finished statement(s) to the pipeline

in validation/aggregating/*/*.sql (the VegBIEN-output queries):

  1. open in jEdit
  2. replace text
    :datasource
    
    with
    "current_schema"()::text
    

remove DDL statements

after importing the finished statements to the DB, this is done as a final step so that running the query file does not alter the database

in validation/aggregating/*/{,*/}*.sql (the VegCore-input and VegBIEN-output queries):

  1. open in jEdit
  2. replace regexp
    ^CREATE VIEW (.*?) AS$
    
    with
    -- $1
    
  3. replace regexp
    ^COMMENT ON VIEW .*? IS '$
    
    with
    /*
    
  4. replace text
    ';
    
    with
    */
    
  5. replace text
    public.
    
    with nothing