Revision 10255
Added by Aaron Marcuse-Kubitza over 11 years ago
| Old | New | lib/maps.py |
|---|---|---|
| 48 | 48 | |
| 49 | 49 | `##### Merging` |
| 50 | 50 | |
| 51 | | `def simplify(str_): return re.sub(r'[\W_]+', r'', str_.lower())` |
| | 51 | `def simplify(str_): return re.sub(r'#.*$\|[\W_]+', r'', str_.lower())` |
| 52 | 52 | |
| 53 | 53 | `def is_nonexplicit_empty_mapping(row):` |
| 54 | 54 | `    return reduce(util.and_, (v == '' for v in row[1:]))` |
| Old | New | bin/filter_out_ci |
|---|---|---|
| 7 | 7 | `import re` |
| 8 | 8 | `import sys` |
| 9 | 9 | |
| 10 | | `def simplify(str_): return re.sub(r'[\W_]+', r'', str_.lower())` |
| | 10 | `def simplify(str_): return re.sub(r'#.*$\|[\W_]+', r'', str_.lower())` |
| 11 | 11 | |
| 12 | 12 | `def main():` |
| 13 | 13 | `    try: _prog_name, col_num, vocab_path = sys.argv` |
bin/filter_out_ci, lib/maps.py: simplify(): Also remove the distinguishing #... suffix from terms (e.g. UNUSED#institutionID). This supports mapping multiple columns to the special terms OMIT, PRIVATE, and UNUSED (VegCore.vegpath.org#Special-terms) without creating a collision in the staging table renaming. Note that this change must not be made to bin/canon, because it would cause suffixed terms to be autorenamed to their *un*suffixed VegCore versions.
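
The effect of the regex change in simplify() can be checked with a small standalone sketch. The names `simplify_old` and `simplify_new` are labels used only for this illustration and do not exist in the repository. The second suffixed term, `UNUSED#collectionCode`, is a made-up example in the same style as the one in the commit message.

```python
import re

# Pre-change behavior: lowercase, then strip non-word characters and underscores.
def simplify_old(str_):
    return re.sub(r'[\W_]+', r'', str_.lower())

# Post-change behavior: additionally drop everything from the first '#' to the
# end of the string, so a distinguishing suffix no longer affects the result.
def simplify_new(str_):
    return re.sub(r'#.*$|[\W_]+', r'', str_.lower())

# Illustrative terms only; the second suffixed term is hypothetical.
terms = ['UNUSED#institutionID', 'UNUSED#collectionCode', 'UNUSED']

for term in terms:
    print(term, '->', simplify_old(term), '|', simplify_new(term))
```

With the old pattern, each suffixed term simplifies to a different string (for example `unusedinstitutionid`), so it would not compare equal to the plain special term. With the new pattern, all three simplify to `unused`, while the original suffixed names remain distinct from one another.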