Diff - BIEN 3 - NCEAS Projects


Revision 3661

Added by Aaron Marcuse-Kubitza over 12 years ago

Moved importing of col_defaults from db_xml.put_table() to bin/map, so that it also happens in row-based mode. Note that this causes a DB entry for the datasource to always be created, even if the datasource has no mappings or no rows.
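As a rough illustration of the change described above, the following is a minimal sketch and not the actual BIEN code. The names `put`, `resolve_col_defaults`, and `import_rows` are hypothetical stand-ins, and the "database" is just a Python list. It assumes that resolving a column default means inserting its node and keeping the returned ID, and it shows why doing that step before the column-based/row-based branch leaves one datasource row in the database even when the input has no rows.

```python
# Hypothetical stand-ins for illustration only; not the db_xml or bin/map API.

def put(db, node):
    """Pretend to insert `node` and return a fake 1-based row ID."""
    db.append(node)
    return len(db)

def resolve_col_defaults(db, col_defaults):
    """Replace each default's node with the ID of the row inserted for it."""
    return {col: put(db, node) for col, node in col_defaults.items()}

def import_rows(db, rows, col_defaults, by_col):
    # Resolve defaults once, before choosing a mode, so both modes get IDs.
    defaults = resolve_col_defaults(db, col_defaults)
    merged = [dict(defaults, **row) for row in rows]
    if by_col:
        db.extend(merged)      # stand-in for a single set-based insert
    else:
        for row in merged:     # stand-in for inserting one row at a time
            db.append(row)
    return defaults

if __name__ == "__main__":
    for by_col in (True, False):
        db = []
        import_rows(db, [], {"datasource_id": "party/organizationname=Madidi"}, by_col)
        print(by_col, len(db))  # one datasource row in either mode, even with no input rows
```

In this sketch the empty-input case ends with one row in either mode, which corresponds to the reference outputs changing from "Inserted 0 new rows" to "Inserted 1 new rows".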

inputs/Madidi/test/import.plots.xml.ref

Put template:

<VegBIEN><_ignore><inLabel>Madidi</inLabel></_ignore></VegBIEN>Inserted 0 new rows into database

<VegBIEN><_ignore><inLabel>Madidi</inLabel></_ignore></VegBIEN>Inserted 1 new rows into database

inputs/Madidi/test/import.organisms.xml.ref

Put template:

<VegBIEN><_ignore><inLabel>Madidi</inLabel></_ignore></VegBIEN>Inserted 0 new rows into database

<VegBIEN><_ignore><inLabel>Madidi</inLabel></_ignore></VegBIEN>Inserted 1 new rows into database

         db.autoanalyze = True # but don't do this in row-based import
         db.autoexplain = True # but don't do this in row-based import
         # Import col_defaults
         for col, node_ in col_defaults.items():
             col_defaults[col] = put(db, node_, row_ins_ct_ref, on_error)
         # Subset and partition in_table
         # OK to do even if table already the right size because it takes <1 sec.
         full_in_table = in_table

         pool = parallelproc.MultiProducerPool(cpus)
         log('Using '+str(pool.process_ct)+' parallel CPUs')
         # Set up DB access
         row_ins_ct_ref = [0]
         if out_is_db:
             out_db = connect_db(out_db_config)
             rel_funcs = set(sql.tables(out_db, schema_like='%',
                 table_like=r'\__%'))
         doc = xml_dom.create_doc()
         root = doc.documentElement
         out_is_xml_ref = [False]
-...
         def update_in_label():
             if in_label_ref[0] != None:
                 xpath.get(root, '/_ignore/inLabel="'+in_label_ref[0]+'"', True)
                 # TODO: Move this to the mappings as some kind of metadata
                 col_defaults['datasource_id'] = xpath.path2xml(
                     'party/organizationname="'+in_label_ref[0]+'"')
                 if out_is_db:
                     # TODO: Move this to the mappings as some kind of metadata
                     col_defaults['datasource_id'] = db_xml.put(out_db,
                         xpath.path2xml('party/organizationname="'+in_label_ref[0]
                             +'"'), row_ins_ct_ref)
         def prep_root():
             root.clear()
             update_in_label()
         prep_root()
         # Define before the out_is_db section because it's used by by_col
         row_ins_ct_ref = [0]
         if out_is_db:
             out_db = connect_db(out_db_config)
             rel_funcs = set(sql.tables(out_db, schema_like='%',
                 table_like=r'\__%'))
         def process_input(root, row_ready, map_path):
             '''Inputs datasource to XML tree, mapping if needed'''
             # Load map header
