| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| _archive | | 1598 | almost 13 years | Aaron Marcuse-Kubitza | Moved _archive/tapir2flatClient/trunk/client/ t... |
| bin | | 1719 | almost 13 years | Aaron Marcuse-Kubitza | Parser.py: Renamed SyntaxException to SyntaxErr... |
| config | | 272 | about 13 years | Aaron Marcuse-Kubitza | Moved bien_password to new config dir |
| inputs | | 1717 | almost 13 years | Aaron Marcuse-Kubitza | inputs/XAL: Accepted initial test outputs |
| lib | | 1720 | almost 13 years | Aaron Marcuse-Kubitza | xml_dom.py: Don't print whitespace from parsed ... |
| mappings | | 1625 | almost 13 years | Aaron Marcuse-Kubitza | mappings/DwC2-VegBIEN.specimens.csv: minimumEle... |
| schemas | | 1722 | almost 13 years | Aaron Marcuse-Kubitza | Added schemas/postgresql.Mac.conf (for tuning d... |
| to_do | | 811 | almost 13 years | Aaron Marcuse-Kubitza | Added to_do/milestones.doc |
| Makefile | 7.63 KB | 1665 | almost 13 years | Aaron Marcuse-Kubitza | main Makefile: php-Darwin: Added instruction to... |
| README.TXT | 1.8 KB | 1556 | almost 13 years | Aaron Marcuse-Kubitza | README.TXT: Added instructions how to stop all ... |
| map | 867 Bytes | 1299 | almost 13 years | Aaron Marcuse-Kubitza | map: On nimoy, use bien2_staging unless otherwi... |

Latest revisions

# Date Author Comment
1722 04/02/2012 09:43 AM Aaron Marcuse-Kubitza

Added schemas/postgresql.Mac.conf (for tuning developers' local testing DBs)

1721 04/02/2012 09:42 AM Aaron Marcuse-Kubitza

schemas/postgresql*.conf: Increased checkpoint_segments and checkpoint_completion_target so that checkpoints (performance intensive) are written less often and load-balanced better

1720 04/02/2012 08:35 AM Aaron Marcuse-Kubitza

xml_dom.py: Don't print whitespace from parsed XML document when pretty-printing XML. minidom modifications section: Added subsection labels for the class each modification applies to.
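
The idea in r1720 can be sketched roughly as follows. This is not the project's xml_dom.py; it is a minimal stand-alone illustration that uses only the standard library's minidom, and the names `strip_whitespace_nodes` and `pretty_print` are hypothetical. It assumes whitespace-only text nodes carry no meaning in the input, which may not hold for mixed-content XML.

```python
from xml.dom import minidom


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace.

    Hypothetical helper: assumes such nodes are just indentation left over
    from the parsed source and can safely be dropped.
    """
    # Copy the child list first, because removeChild() mutates it.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


def pretty_print(xml_text):
    """Parse xml_text and return it re-indented, without the source's own whitespace."""
    doc = minidom.parseString(xml_text)
    strip_whitespace_nodes(doc)
    return doc.toprettyxml(indent="  ")


if __name__ == "__main__":
    print(pretty_print("<a>\n  <b>x</b>\n\n  <c/>\n</a>"))
```

Without the stripping step, minidom re-emits the original indentation as text nodes alongside its own, which is what produces the stray blank lines in pretty-printed output.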

1719 04/02/2012 08:20 AM Aaron Marcuse-Kubitza

Parser.py: Renamed SyntaxException to SyntaxError because it's an unexpected condition that should exit the program, a.k.a. an error

1718 04/02/2012 08:05 AM Aaron Marcuse-Kubitza

bin/map: process_rows(): When iterating over each row, only retrieve the next row if the end (limit of # of rows) has not been reached. This prevents the next row from being fetched, possibly causing an entire additional consecutive XML document to be parsed, if the limit has already been reached. This is primarily useful for XML inputs with a ".0.top" segment prepended before the other documents, which contains just the first two nodes for fast parsing of this smaller XML document when only the first two nodes are needed for testing. Without this fix, the ".0.top" segment would have needed to contain the first three nodes instead.
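
A minimal sketch of the check-before-fetch pattern described in r1718. The signature below is an assumption for illustration and is not taken from bin/map: `process_row` is any callable, `rows` is any iterable, and `limit` is the maximum number of rows to handle.

```python
def process_rows(process_row, rows, limit=None):
    """Process at most `limit` rows, never fetching a row that will not be used.

    Hypothetical signature, for illustration only.
    Returns the number of rows processed.
    """
    rows = iter(rows)
    count = 0
    # Test the limit BEFORE asking the iterator for another row. With a lazy
    # source, fetching one row too many could trigger parsing of an entire
    # additional XML document.
    while limit is None or count < limit:
        try:
            row = next(rows)
        except StopIteration:
            break  # Input exhausted before the limit was reached
        process_row(row)
        count += 1
    return count
```

With a limit of 2 and a lazy row source, the iterator is advanced exactly twice, so a ".0.top" segment holding only the first two nodes is enough.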

1717 04/02/2012 07:55 AM Aaron Marcuse-Kubitza

inputs/XAL: Accepted initial test outputs

1716 04/02/2012 07:54 AM Aaron Marcuse-Kubitza

inputs/XAL: Added maps

1715 04/02/2012 07:52 AM Aaron Marcuse-Kubitza

bin/map: Extended consecutive XML document support to direct-XML inputs (without a map spreadsheet). Factored out consecutive XML document row-iteration code into helper method get_rows() which does the iters.flatten() and itertools.imap() calls.

1714 04/02/2012 07:37 AM Aaron Marcuse-Kubitza

bin/map: Fixed bug in iteration over consecutive XML documents where only the first element of the first document was processed. Use of iters.flatten() and itertools.imap() fixes this problem so that the consecutive XML documents are regarded as a continuous stream of rows.
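
The fix in r1714 treats a sequence of documents as one continuous stream of rows. A rough Python 3 sketch of that idea follows. The project's own `iters.flatten()` is not shown on this page, so `itertools.chain.from_iterable()` stands in for it here as an assumption, and the built-in lazy `map()` replaces Python 2's `itertools.imap()`. The names `rows_from_docs` and `doc_rows` are hypothetical.

```python
import itertools


def rows_from_docs(docs, doc_rows):
    """Return one lazy stream of rows drawn from every document in `docs`.

    `doc_rows` is a hypothetical callable that maps one document to an
    iterable of its rows. Documents are only expanded when the stream
    actually reaches them.
    """
    # map() lazily turns each document into its row iterable;
    # chain.from_iterable() lazily concatenates those iterables.
    return itertools.chain.from_iterable(map(doc_rows, docs))


if __name__ == "__main__":
    docs = [["row1", "row2"], ["row3"], []]
    for row in rows_from_docs(docs, iter):
        print(row)
```

Because both steps are lazy, a consumer that stops early never touches the later documents.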

1713 04/02/2012 07:16 AM Aaron Marcuse-Kubitza

bin/map: Use new xml_parse.docs_iter() to iterate over each consecutive XML document in stdin
