/lib/csvs.py - Diff - BIEN 3 - NCEAS Projects


Revision 8202

Added by Aaron Marcuse-Kubitza over 11 years ago

lib/csvs.py: stream_info(): Fixed a bug where headers containing multiline columns were not supported, because only the first physical line (rather than the first complete multiline row) was sniffed for the dialect

         return dialect
     def has_unbalanced_quotes(str_): return str_.count('"') % 2 == 1 # odd # of "
     def has_multiline_column(str_): return has_unbalanced_quotes(str_)
     def stream_info(stream, parse_header=False):
         '''Automatically detects the dialect based on the header line.
         Uses the Excel dialect if the CSV file is empty.
         @return NamedTuple {header_line, header, dialect}'''
         info = util.NamedTuple()
         info.header_line = stream.readline()
         if has_multiline_column(info.header_line): # 1st line not full header
             # assume it's a header-only csv with multiline columns
             info.header_line += ''.join(stream.readlines()) # use entire file
         info.header = None
         if info.header_line != '':
             info.dialect = sniff(info.header_line)
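The fix above can be illustrated with a minimal, self-contained sketch. It is not the project's code: `read_header_text` and `sniff_header` are hypothetical names, the standard library's `csv.Sniffer` stands in for the project's own `sniff()` helper, and a plain tuple replaces `util.NamedTuple`. Like the diff, it assumes that an unbalanced quote on the first line means the file is a header-only CSV whose header spans several lines.

```python
import csv
import io


def has_unbalanced_quotes(text):
    """Return True if text contains an odd number of double quotes."""
    return text.count('"') % 2 == 1


def read_header_text(stream):
    """Read the first line, plus the rest of the stream if a quoted
    field is still open at the end of that line (same assumption as
    the diff: a header-only CSV with multiline columns)."""
    header_text = stream.readline()
    if has_unbalanced_quotes(header_text):
        header_text += stream.read()
    return header_text


def sniff_header(stream):
    """Hypothetical stand-in for stream_info(): returns
    (header_text, header, dialect). Falls back to the Excel dialect
    for an empty stream, as the original docstring describes."""
    header_text = read_header_text(stream)
    if header_text == '':
        return header_text, None, csv.excel
    # csv.Sniffer replaces the project's sniff(); the candidate
    # delimiters listed here are an assumption for this sketch.
    dialect = csv.Sniffer().sniff(header_text, delimiters=',\t;|')
    header = next(csv.reader(io.StringIO(header_text), dialect))
    return header_text, header, dialect


if __name__ == '__main__':
    sample = io.StringIO('id,"long\nname",value\n')
    text, header, dialect = sniff_header(sample)
    print(repr(text))
    print(header)
    print(repr(dialect.delimiter))
```

Sniffing only the first physical line of this sample would hand the sniffer `id,"long`, which ends inside an open quoted field; reading on until the quotes balance gives it the complete header row.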
