Lou Burnard Consulting
Up to a point...
Documents might come in...
Good luck!
Regular expressions (regexps) allow you to specify patterns that match strings of characters and to manipulate the resulting matches:
- `xyz` : the string xyz
- `[xyz]` : any one of the characters x, y, z
- `\d+` : one or more digits
- `\p{Lu}+` : one or more uppercase Unicode characters

XPath is a standard syntax for matching parts of an XML tree in terms of its elements and attributes:
- `//p` : any `<p>` anywhere
- `//p[anchor]` : any `<p>` containing an `<anchor>`
- `//p[@rend]` : any `<p>` with a rend attribute
- `//p[starts-with(@rend,'Head')]` : any `<p>` whose rend attribute has a value starting with Head

Regular expressions and XPath are both built into oXygen.
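The same patterns can also be tried outside oXygen. The following is a minimal Python sketch using only the standard library; the sample string and the tiny XML fragment are invented for illustration. Two limitations are assumed and worked around: Python's built-in `re` module does not understand `\p{Lu}` (the third-party `regex` package does), and `xml.etree.ElementTree` implements only a subset of XPath without `starts-with()`, so that last test is emulated with an ordinary string check.

```python
import re
import xml.etree.ElementTree as ET

# --- Regular expressions ---
# Hypothetical sample text, made up for this example.
sample = "xyz costs 42 euros, says ZOE"

literal = re.findall(r"xyz", sample)    # the string xyz
any_of = re.findall(r"[xyz]", sample)   # any one of the characters x, y, z
digits = re.findall(r"\d+", sample)     # one or more digits

# Manipulating a match: capture the number and reuse it in the replacement.
rewritten = re.sub(r"(\d+) euros", r"EUR \1", sample)

# Note: r"\p{Lu}+" is not supported by the standard `re` module;
# it would need the third-party `regex` package instead.

# --- XPath (ElementTree's limited subset) ---
# Hypothetical XML fragment, made up for this example.
doc = ET.fromstring(
    "<body>"
    "<p rend='Heading1'>Title</p>"
    "<p>Plain text <anchor/> with an anchor</p>"
    "<p rend='Quote'>Quoted</p>"
    "</body>"
)

all_p = doc.findall(".//p")                # //p
with_anchor = doc.findall(".//p[anchor]")  # //p[anchor]
with_rend = doc.findall(".//p[@rend]")     # //p[@rend]

# ElementTree has no starts-with(), so emulate
# //p[starts-with(@rend,'Head')] with a plain string test.
heads = [p for p in with_rend if p.get("rend").startswith("Head")]

print(literal, any_of, digits)
print(rewritten)
print(len(all_p), len(with_anchor), len(with_rend), len(heads))
```

A full XPath processor, such as the one in oXygen, evaluates the `starts-with()` expression directly; the list comprehension here is only a stand-in for that step.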
If you have lots of texts in a specific arcane format, it is worth investing time and effort to translate them, using whatever tools are at your disposal.
A multi-stage path is usually easiest, e.g.
Remember: the computer should be doing the boring repetitive work, not you!