Regex expressions used to find HTML tags in a text:
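- ( ) groups the pre-, main and post- parts of the expression
- (?<=...) is a lookbehind: the given characters must appear directly before the match (the beginning of the search) but are not part of it
- (?=...) is a lookahead: the given characters must appear directly after the match (the end of the search) but are not part of it
- | is OR
- ? after a quantifier makes it non-greedy
- (.*?) matches any characters in the expression, as few as possible (line breaks are excluded by default)
- [\\u10000-\\uEFFFF] is meant to find all UTF-16 encoded characters (includes a-z, the Chinese and Greek alphabets etc.). Note that \u takes exactly four hex digits in most regex engines, so the class is read as \u1000, the range 0 to \uEFFF, and F; in practice it covers U+0030 to U+EFFF, which is why a-z is included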
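
The constructs above can be tried out with a minimal sketch. It assumes Python's re module, which is not named in the notes (the doubled backslashes suggest the patterns originally lived in a Java or JavaScript string literal). The sample strings and variable names are hypothetical.

```python
import re

# Hypothetical sample text, not taken from the original notes.
text = "<b>bold</b> plain <i>italic</i>"

# (?<=<b>|<i>)   lookbehind: the match must directly follow <b> or <i>
# (.*?)          non-greedy group: as few characters as possible
# (?=</b>|</i>)  lookahead: the match must be directly followed by </b> or </i>
# Python's re requires fixed-width lookbehinds; <b> and <i> are both 3 characters.
tag_content = re.compile(r"(?<=<b>|<i>)(.*?)(?=</b>|</i>)")
inner = tag_content.findall(text)
print(inner)   # ['bold', 'italic']

# Greedy vs. non-greedy: .* runs to the last </b>, .*? stops at the first one.
sample = "<b>a</b> and <b>b</b>"
greedy = re.findall(r"<b>(.*)</b>", sample)
lazy = re.findall(r"<b>(.*?)</b>", sample)
print(greedy)  # ['a</b> and <b>b']
print(lazy)    # ['a', 'b']

# \u reads exactly four hex digits, so this class is parsed as
# \u1000, the range '0'-\uEFFF, and the literal 'F'.
char_class = re.compile(r"[\u10000-\uEFFFF]")
covered = [ch for ch in ["a", "α", "中", "!", " "] if char_class.fullmatch(ch)]
print(covered)  # ['a', 'α', '中']
```

The lookarounds keep the tags themselves out of the result, so only the text between them is returned. Engines differ on lookbehind rules: Python insists on a fixed width, so tags of different lengths would need separate patterns or a capturing group instead.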