Revision 5903

sql.py: distinct_table(): Use DISTINCT ON to remove duplicate rows, instead of a unique index combined with insert_select()'s ignore mode. This lets PostgreSQL use whichever sorting method it deems fastest, rather than requiring a B-tree index. Most of the slower operations in the TNRS import are distinct_table() calls, so this should speed up that import. The TNRS import is a bottleneck for the DB import as a whole, because it must complete before other datasources can be imported.
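
For illustration only, the kind of statement this change relies on might be built as in the sketch below. This is not the actual distinct_table() code: the helper names (quote_ident, build_distinct_on_sql) and the table and column names are hypothetical. It assumes the deduplicated rows go into a new table via CREATE TABLE ... AS. No ORDER BY is given, so PostgreSQL keeps an arbitrary row from each group of duplicates.

```python
def quote_ident(name):
    """Double-quote a PostgreSQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_distinct_on_sql(src_table, dest_table, distinct_cols):
    """Return SQL that copies src_table into dest_table, keeping one row
    per distinct combination of distinct_cols.

    Hypothetical helper for illustration; not part of sql.py.
    """
    if not distinct_cols:
        raise ValueError("distinct_cols must not be empty")
    cols = ", ".join(quote_ident(c) for c in distinct_cols)
    # DISTINCT ON (cols) keeps the first row of each group. With no
    # ORDER BY, which row counts as "first" is up to the planner.
    return "CREATE TABLE {dest} AS SELECT DISTINCT ON ({cols}) * FROM {src}".format(
        dest=quote_ident(dest_table), cols=cols, src=quote_ident(src_table)
    )


# Hypothetical table and column names.
sql = build_distinct_on_sql("tnrs_raw", "tnrs_distinct", ["Name_submitted"])
print(sql)
```

If a specific row must survive from each group, an ORDER BY starting with the same columns as the DISTINCT ON list would be needed, at the cost of forcing a sort.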
