TNRS unsafe chars

The following characters are unsafe to run through TNRS and need to be encoded:

unsafe char      encoding [1]   why it needs encoding
\t               ' !tab '       TNRS replaces it with " "
\n               ' !nl '        used to separate multiple names
\r               ' !cr '        used to separate multiple names
"                ' !quo '       TNRS removes it when at the beginning or end
%                ' !pct '       TNRS URL-decodes it in matched fields
'                ' !apo '       TNRS removes it when at the beginning or end
;                ' !sem '       changes the TNRS response format
\                ' !bsl '       TNRS removes it
_                ' !und '       TNRS replaces it with " "
­                ' !sub '       TNRS removes it
×                ' !mul '       TNRS replaces it with "x"
(empty string)   ' !pad '       prepended to empty and whitespace-only strings
!                ' !exc '       our escape char

[1] The ' characters are not part of the encoded value; they only delimit it to show the whitespace padding.
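
A minimal sketch of how the table above could be applied, in Python. The names `CODES` and `encode_name` are hypothetical and not taken from the project. It assumes that each code is inserted with one space on either side (as the quoted notation suggests), that the pad is decided by looking at the raw input, and that the character in the `!sub` row is U+00AD as it appears on this page.

```python
# Hypothetical mapping built from the table: unsafe char -> code name.
CODES = {
    "\t": "tab",
    "\n": "nl",
    "\r": "cr",
    '"': "quo",
    "%": "pct",
    "'": "apo",
    ";": "sem",
    "\\": "bsl",
    "_": "und",
    "\u00ad": "sub",  # assumption: the character shown in the !sub row
    "\u00d7": "mul",  # multiplication sign
    "!": "exc",       # the escape char itself
}


def encode_name(name: str) -> str:
    """Replace every unsafe character with its space-padded '!code'."""
    # One pass over the input, so the '!' inside an inserted code
    # is never encoded a second time.
    encoded = "".join(
        f" !{CODES[ch]} " if ch in CODES else ch for ch in name
    )
    # Assumption: "empty or whitespace-only" is tested on the raw input.
    if not name.strip():
        encoded = " !pad " + encoded
    return encoded


print(encode_name("Quercus_alba"))  # Quercus !und alba
```

Encoding in a single pass, rather than with a chain of `str.replace` calls, avoids having to order the replacements so that `!` is handled before the other codes are inserted.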