Name                Size       Revision  Age              Author                 Comment
_archive                       1598      over 12 years    Aaron Marcuse-Kubitza  Moved _archive/tapir2flatClient/trunk/client/ t...
analysis                       3076      over 12 years    Aaron Marcuse-Kubitza  Added top-level analysis dir for range modeling
backups                        4751      about 12 years   Aaron Marcuse-Kubitza  backups/Makefile: Backups: Full DB: Specify the...
bin                            5591      about 12 years   Aaron Marcuse-Kubitza  sql_io.py: import_csv(): Take a reader and head...
config                         272       almost 13 years  Aaron Marcuse-Kubitza  Moved bien_password to new config dir
inputs                         5600      about 12 years   Aaron Marcuse-Kubitza  mappings/VegCore-VegBIEN.csv: Remapped people's...
lib                            5594      about 12 years   Aaron Marcuse-Kubitza  sql_io.py: import_csv(): Add a row_num column a...
mappings                       5600      about 12 years   Aaron Marcuse-Kubitza  mappings/VegCore-VegBIEN.csv: Remapped people's...
schemas                        5599      about 12 years   Aaron Marcuse-Kubitza  schemas/vegbien.sql: party: Added fullname
to_do                          4524      about 12 years   Aaron Marcuse-Kubitza  to_do/timeline.doc: Updated to reflect addition...
validation                     4523      about 12 years   Aaron Marcuse-Kubitza  Added validation/
Makefile            9.86 KB    5459      about 12 years   Aaron Marcuse-Kubitza  Makefile: Moved setting of $(root) before inclu...
README.TXT          12.9 KB    5563      about 12 years   Aaron Marcuse-Kubitza  README.TXT: Data import: import_all: Added NCBI...
map                 989 Bytes  5158      about 12 years   Aaron Marcuse-Kubitza  root map: Removed no longer needed public schem...
new_terms.csv       30.4 KB    4887      about 12 years   Aaron Marcuse-Kubitza  Regenerated root unmapped_terms.csv, new_terms.csv
unmapped_terms.csv  5.8 KB     4887      about 12 years   Aaron Marcuse-Kubitza  Regenerated root unmapped_terms.csv, new_terms.csv

Latest revisions

# Date Author Comment
5600 10/17/2012 01:12 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Remapped people's names split apart into name components in party to new party.fullname, which does not require splitting or make assumptions about the number of people who may be listed in a particular name field and which components of their name(s) are present

5599 10/17/2012 01:02 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: party: Added fullname

5598 10/17/2012 12:55 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Added accordingTo

5597 10/17/2012 12:47 PM Aaron Marcuse-Kubitza

inputs/.TNRS/tnrs/map.csv: Mapped Name_matched_url to scientificNameID, since the URL uniquely identifies the matched taxonconcept

5596 10/17/2012 12:43 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonconcept: Renamed taxonname to taxonepithet for clarity and to be consistent with TCS's use of "epithet" to denote what the taxonname was intended to be (http://www.tdwg.org/standards/117/download/#/UserGuidev_1.3.pdf)

5595 10/17/2012 12:18 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonconcept.creator_id: Documented that this is the concept reference for a taxon concept with an "according to", or the identifier's name for a nominal concept, and is equivalent to "Name sec. x"

5594 10/17/2012 11:50 AM Aaron Marcuse-Kubitza

sql_io.py: import_csv(): Add a row_num column at the beginning of the table, which is autopopulated by csvs.RowNumFilter (it cannot be autopopulated by the serial datatype, because this does not support COPY FROM with a NULL-equivalent value in the serial field). This fixes a bug in csv2db where rows would not stay in inserted order upon querying the table, and would be returned in a different order each query, which prevented LIMIT/OFFSET based subsetting from returning consistent, nonoverlapping results. This occurs because PostgreSQL unfortunately does not return rows in inserted order (or any stable order: "If sorting is not chosen, the rows will be returned in an unspecified order [which] must not be relied on" <http://www.postgresql.org/docs/8.3/static/queries-order.html>), so an explicit ORDER BY is always needed to ensure staging table rows are retrievable in the order they were inserted.
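The paging behavior described in revision 5594 can be illustrated with a minimal sketch. It is not the project's code: it uses Python 3 and an in-memory SQLite database as a stand-in so that it runs without a PostgreSQL server, and the table name `staging`, the column `name`, and the helpers `load_staging()` and `fetch_page()` are hypothetical. Only the idea of an explicit `row_num` column combined with `ORDER BY` comes from the commit message.

```python
import sqlite3


def load_staging(conn, names):
    """Create a hypothetical staging table and store an explicit row number per row."""
    conn.execute("CREATE TABLE staging (row_num INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO staging (row_num, name) VALUES (?, ?)",
        [(i, name) for i, name in enumerate(names, start=1)],
    )


def fetch_page(conn, limit, offset):
    """Return one page of rows; ORDER BY row_num makes the pages stable and nonoverlapping."""
    cur = conn.execute(
        "SELECT row_num, name FROM staging ORDER BY row_num LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_staging(conn, ["a", "b", "c", "d", "e"])
# Three consecutive pages of size 2 cover every row exactly once, in inserted order.
pages = [fetch_page(conn, 2, offset) for offset in (0, 2, 4)]
```

Without the `ORDER BY row_num` clause, the database would be free to return each page in any order, which is the situation the commit message describes for PostgreSQL.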

5593 10/17/2012 11:43 AM Aaron Marcuse-Kubitza

csvs.py: Added RowNumFilter, which adds a row # column at the beginning of each row
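A filter of this kind might look roughly like the sketch below. The class body is an assumption written for Python 3, not the contents of csvs.py; only the name `RowNumFilter` and its purpose (prepending a row number to each row) come from the commit message. The helper `number_rows()` is hypothetical and exists only to exercise the class.

```python
import csv
import io


class RowNumFilter:
    """Hypothetical sketch: wraps a row iterator and prepends a 1-based row number."""

    def __init__(self, reader):
        self.reader = iter(reader)
        self.row_num = 0  # becomes 1 when the first row is read

    def __iter__(self):
        return self

    def __next__(self):
        row = next(self.reader)  # StopIteration propagates at end of input
        self.row_num += 1
        return [self.row_num] + list(row)


def number_rows(text):
    """Parse CSV text and return its rows with a leading row number column."""
    return list(RowNumFilter(csv.reader(io.StringIO(text))))
```

For example, `number_rows("a,b\nc,d\n")` yields two rows whose first elements are 1 and 2, followed by the original cells.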

5592 10/17/2012 11:42 AM Aaron Marcuse-Kubitza

streams.py: LineCountStream, LineCountInputStream: Fixed bug where line_num was 1 too high because it started at 1 and was incremented before each line was returned. Line numbering now properly starts at 1: the initial line_num value is 0, which is incremented to 1 upon encountering the first line. The off-by-one behavior may have been needed for code that associates an error message with a line #, but such code should add 1 to the line_num to get the line # of the error if the error prevents the next line from being read by the LineCount*Stream.
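The corrected counting scheme can be sketched as follows. This is an illustration under assumptions, not the code in streams.py: the class body, the `readline()` interface, the choice to increment only when a line was actually read, and the helper `numbered_lines()` are all hypothetical. Only the class name `LineCountStream`, the `line_num` attribute, and the initial value of 0 come from the commit message.

```python
import io


class LineCountStream:
    """Hypothetical sketch: wraps a text stream and tracks the number of the last line read."""

    def __init__(self, stream):
        self.stream = stream
        self.line_num = 0  # no line read yet; the first line will be line 1

    def readline(self):
        line = self.stream.readline()
        if line:  # an empty string signals end of input, so do not count it
            self.line_num += 1
        return line


def numbered_lines(text):
    """Return (line_num, line) pairs as reported by the wrapper after each read."""
    wrapped = LineCountStream(io.StringIO(text))
    result = []
    while True:
        line = wrapped.readline()
        if not line:
            break
        result.append((wrapped.line_num, line))
    return result
```

With this scheme, a caller that fails while trying to read the next line would report `line_num + 1` as the location of the error, as the commit message suggests.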

5591 10/17/2012 11:04 AM Aaron Marcuse-Kubitza

sql_io.py: import_csv(): Take a reader and header rather than a stream to allow callers to pass in a wrapped CSV reader for filtering, etc.
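The benefit of accepting a reader and header instead of a raw stream can be shown with a small sketch. Everything here is hypothetical and written for Python 3: `import_rows()` is a stand-in that only collects rows (the real import_csv() loads a database table and its signature is not shown in this log), and `skip_blank_rows()` is an example of the kind of wrapper a caller could pass in.

```python
import csv
import io


def import_rows(reader, header):
    """Hypothetical stand-in for an import function: pairs each row with the header."""
    return [dict(zip(header, row)) for row in reader]


def skip_blank_rows(reader):
    """Example wrapper: drops rows whose cells are all empty."""
    for row in reader:
        if any(cell.strip() for cell in row):
            yield row


def demo(text):
    """Read the header, wrap the remaining rows in a filter, and import them."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return import_rows(skip_blank_rows(reader), header)
```

Because the import function only iterates over whatever it is given, the caller decides how rows are filtered or augmented before they are imported.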
