Revision 1718
Added by Aaron Marcuse-Kubitza almost 13 years ago
| Old | New | map |
|---|---|---|
| 165 | 165 | `'''Processes input rows` |
| 166 | 166 | `@param process_row(in_row, i)` |
| 167 | 167 | `'''` |
| 168 | | `i = -1 # in case for loop does not execute` |
| 169 | | `for i, row in enumerate(rows):` |
| 170 | | `if i < start: continue` |
| 171 | | `if end != None and i >= end: break` |
| | 168 | `i = 0` |
| | 169 | `while end == None or i < end:` |
| | 170 | `try: row = rows.next()` |
| | 171 | `except StopIteration: break # no more rows` |
| | 172 | `if i < start: continue # not at start row yet` |
| | 173 | |
| 172 | 174 | `process_row(row, i)` |
| 173 | 175 | `row_ready(i, row)` |
| 174 | | `row_ct = i-start+1` |
| | 176 | `i += 1` |
| | 177 | `row_ct = i-start` |
| 175 | 178 | `return row_ct` |
| 176 | 179 | |
| 177 | 180 | `def map_rows(get_value, rows):` |

bin/map: process_rows(): When iterating over the rows, only retrieve the next row if the end (the limit on the number of rows) has not been reached. Previously, the next row was fetched even when the limit had already been reached, which could cause an entire additional consecutive XML document to be parsed. This is primarily useful for XML inputs with a ".0.top" segment prepended before the other documents. That segment contains just the first two nodes, so that this smaller XML document parses quickly when only the first two nodes are needed for testing. Without this fix, the ".0.top" segment would have needed to contain the first three nodes instead.
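
The difference between the two loop shapes can be seen by counting how many rows are actually pulled from the input. The following is a minimal Python 3 sketch, not the code in bin/map: `CountingIterator`, `process_rows_eager`, and `process_rows_lazy` are hypothetical names introduced for illustration, `next(rows)` stands in for the Python 2 `rows.next()` call in the diff, and the `row_ready()` call is omitted. Note also that this sketch advances `i` for rows skipped before `start`, which the `continue` in the diff as shown does not appear to do when `start > 0`.

```python
class CountingIterator:
    """Hypothetical helper: wraps an iterable and counts fetched items."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.fetched = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates to the caller
        self.fetched += 1
        return item


def process_rows_eager(process_row, rows, start=0, end=None):
    """Pre-change shape: each row is fetched before the limit is checked."""
    row_ct = 0
    for i, row in enumerate(rows):
        if i < start:
            continue  # not at start row yet
        if end is not None and i >= end:
            break  # too late: row number `end` has already been fetched
        process_row(row, i)
        row_ct += 1
    return row_ct


def process_rows_lazy(process_row, rows, start=0, end=None):
    """Post-change shape: the limit is checked before the next fetch."""
    i = 0
    row_ct = 0
    while end is None or i < end:
        try:
            row = next(rows)
        except StopIteration:
            break  # no more rows
        if i >= start:
            process_row(row, i)
            row_ct += 1
        i += 1  # assumption: advance even for rows skipped before `start`
    return row_ct
```

With a four-row input and `end=2`, both functions process two rows, but the eager version pulls a third row from the iterator before it notices the limit, while the lazy version stops after two. When fetching a row can trigger parsing of the next XML document, that extra fetch is the cost the commit avoids.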