TNRS unsafe chars
The following characters are unsafe to run through TNRS and need to be encoded:
unsafe char | encoding [1] | why needs encoding |
\t | ' !tab ' | TNRS replaces with " " |
\n | ' !nl ' | used to separate multiple names |
\r | ' !cr ' | used to separate multiple names |
" | ' !quo ' | TNRS removes it when at the beginning or end |
% | ' !pct ' | TNRS URL-decodes it in matched fields |
' | ' !apo ' | TNRS removes it when at the beginning or end |
; | ' !sem ' | changes TNRS response format |
\ | ' !bsl ' | TNRS removes it |
_ | ' !und ' | TNRS replaces with " " |
­ | ' !sub ' | TNRS removes it |
× | ' !mul ' | TNRS replaces with "x" |
(empty str) | ' !pad ' | prepend to empty and whitespace-only strings |
! | ' !exc ' | our escape char |
[1] The ' marks are not part of the encoded value; they are there to show the whitespace padding.
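
The table above could be applied along these lines. This is only a minimal sketch, not the actual implementation: the names `ENCODINGS`, `PAD`, `encode_name` and `decode_name` are hypothetical, the `!sub` row is left out because its character is non-printing, and the whitespace-only check is assumed to apply to the string before encoding. Everything is replaced in a single pass so that the `!` inside an inserted token is not escaped a second time.

```python
import re

# Mapping copied from the table; every token is padded with one space on each side.
# The "!sub" row is omitted here (non-printing character); add it as needed.
ENCODINGS = {
    "\t": " !tab ",
    "\n": " !nl ",
    "\r": " !cr ",
    '"': " !quo ",
    "%": " !pct ",
    "'": " !apo ",
    ";": " !sem ",
    "\\": " !bsl ",
    "_": " !und ",
    "\u00d7": " !mul ",  # multiplication sign
    "!": " !exc ",       # the escape char itself
}
PAD = " !pad "  # prepended to empty and whitespace-only strings

_TABLE = str.maketrans(ENCODINGS)
_DECODINGS = {token: char for char, token in ENCODINGS.items()}
_DECODINGS[PAD] = ""
_TOKEN_RE = re.compile("|".join(re.escape(token) for token in _DECODINGS))


def encode_name(name: str) -> str:
    """Replace every unsafe char with its token in one pass."""
    encoded = name.translate(_TABLE)
    # Assumption: "whitespace-only" is judged on the original string.
    if not name.strip():
        encoded = PAD + encoded
    return encoded


def decode_name(encoded: str) -> str:
    """Reverse encode_name, again in one left-to-right pass."""
    return _TOKEN_RE.sub(lambda m: _DECODINGS[m.group(0)], encoded)
```

For example, `encode_name("Quercus_alba")` gives `"Quercus !und alba"`, and `decode_name` turns that back into the original string. The round trip only holds if the encoded text comes back unchanged, which is the point of replacing the characters TNRS would otherwise alter.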