Revision 1718

bin/map: process_rows(): When iterating over the rows, only retrieve the next row if the end (the row limit) has not yet been reached. Previously, the next row was fetched even after the limit had been reached, which could cause an entire additional consecutive XML document to be parsed. This is primarily useful for XML inputs with a ".0.top" segment prepended before the other documents. That segment contains just the first two nodes, so this smaller XML document can be parsed quickly when only the first two nodes are needed for testing. Without this fix, the ".0.top" segment would have needed to contain the first three nodes instead.
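The difference in fetch behavior described above can be illustrated with a minimal, self-contained sketch. It is not the code from bin/map: the function signatures, the `counting_rows` helper, and the name `process_rows_eager` are hypothetical and exist only for this illustration, and the builtin `next(rows)` is used in place of `rows.next()` so the sketch runs on both Python 2.6+ and Python 3. The sketch also advances the counter for skipped rows so that a nonzero `start` still reaches the start row, and clamps the returned count at zero; both are assumptions made here rather than behavior taken from the diff.

```python
def process_rows(process_row, rows, start=0, end=None):
    '''Sketch of the new loop: check the limit BEFORE fetching a row.'''
    rows = iter(rows)
    i = 0
    while end is None or i < end:
        try:
            row = next(rows)  # only reached if the limit has not been hit
        except StopIteration:
            break  # no more rows
        if i >= start:  # rows before the start row are fetched but skipped
            process_row(row, i)
        i += 1
    return max(i - start, 0)  # number of rows processed (clamped: assumption)


def process_rows_eager(process_row, rows, start=0, end=None):
    '''Sketch of the old loop: the row is fetched before the limit check.'''
    for i, row in enumerate(rows):
        if i < start:
            continue
        if end is not None and i >= end:
            break  # too late: row number `end` has already been fetched
        process_row(row, i)


def counting_rows(n, fetched):
    '''Hypothetical row source that records every row index it hands out.'''
    for k in range(n):
        fetched.append(k)
        yield 'row%d' % k


lazy_fetched, eager_fetched = [], []
process_rows(lambda row, i: None, counting_rows(5, lazy_fetched), end=2)
process_rows_eager(lambda row, i: None, counting_rows(5, eager_fetched), end=2)
print(lazy_fetched)   # [0, 1]: only the two requested rows were fetched
print(eager_fetched)  # [0, 1, 2]: a third row was fetched and then discarded
```

With a limit of two rows, the eager loop pulls a third row from the source before noticing the limit. If that third row lives in the next XML document, fetching it is what would trigger the extra parse that this revision avoids.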

View differences:

map
@@ -165,13 +165,16 @@
             '''Processes input rows
             @param process_row(in_row, i)
             '''
-            i = -1 # in case for loop does not execute
-            for i, row in enumerate(rows):
-                if i < start: continue
-                if end != None and i >= end: break
+            i = 0
+            while end == None or i < end:
+                try: row = rows.next()
+                except StopIteration: break # no more rows
+                if i < start: continue # not at start row yet
+
                 process_row(row, i)
                 row_ready(i, row)
-            row_ct = i-start+1
+                i += 1
+            row_ct = i-start
             return row_ct
 
         def map_rows(get_value, rows):
