Import issues¶
Merging duplicates with children¶
Problem: When a new table entry duplicates an existing entry, the 1:1 tables of that table and those tables' children are not merged, causing them to become orphaned
Affected tables:
- specimenreplicate->taxonoccurrence->locationevent->location
Solutions:
Use inheritance and unique columns where this is an issue¶
- Probably the simplest solution because it doesn't require major changes to the import algorithm
- Only works where inheritance makes sense, but this is usually true where this is a problem
- In PostgreSQL, need to translate some foreign keys to triggers because "foreign key constraints only apply to single tables, not to their inheritance children" _1
- Where inheritance doesn't make sense because the relationship isn't always 1:1, use the parent table's unique columns to uniquify the child table
- e.g. specimenreplicate->locationevent
1 http://www.postgresql.org/docs/9.1/static/ddl-inherit.html#DDL-INHERIT-CAVEATS
Avoid creating duplicate descendants by adding the descendants after their parents are inserted/uniquified¶
- For non-nullable descendants, do one of the following
- Make them nullable and check the not-null constraint later
- Insert 0 in the field and defer the foreign key constraint
- Requires running the entire subtree in one transaction
- Add a trigger that auto-adds an empty descendant, so there is something in the field
Merge descendants when duplicate parents are detected¶
- Requires determining what the descendants are, using DB introspection of the foreign keys
- Use deferred constraints to temporarily set the pkeys of the two children to the same value
- Allow foreign keys' on update actions to propagate the change to tables that point to the child
- Before/after doing that, merge all the descendants of the child
- Delete the not yet attached child, specifying it using its oid