How can I extract the text that a user attaches in a Word document in which I have inserted a content control? Is it also possible to pick up the new text after the content control has been changed?
Extract File Attached With Word Document
Hi Chester,
You can try Apache POI. It provides basic text extraction and can also access document metadata, and it offers some support for rich-text extraction as well. For .doc files you can use org.apache.poi.hwpf.extractor.WordExtractor, which returns the text of your document. POI can also work on PowerPoint, Publisher and Visio files and on embedded objects. As for updates: the extractor reads the file as it was saved, so once the document has been finalized (or saved again after a change), run the extraction again to get the current text.
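If you only need the text inside the content controls rather than the whole document, here is a minimal sketch of one way to do it. Please note that it does not use Apache POI: it relies only on the JDK and assumes the file is saved as .docx, where the main text lives in word/document.xml and each content control is a w:sdt element whose content sits in w:sdtContent. The class and method names (ContentControlText, fromDocx, fromDocumentXml) are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Apache POI.
final class ContentControlText {
    // Namespace of the main WordprocessingML elements (the "w:" prefix).
    private static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Opens a .docx (a zip package) and reads the content controls of its main part. */
    static List<String> fromDocx(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    return fromDocumentXml(zip);
                }
            }
        }
        throw new IOException("word/document.xml not found; is this a .docx file?");
    }

    /** Returns one string per w:sdtContent element, in document order. */
    static List<String> fromDocumentXml(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        List<String> result = new ArrayList<>();
        NodeList controls = doc.getElementsByTagNameNS(W_NS, "sdtContent");
        for (int i = 0; i < controls.getLength(); i++) {
            // Join every w:t text run found below this content control.
            // A nested control's text is therefore also part of its parent's text.
            NodeList runs = ((Element) controls.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            result.add(text.toString());
        }
        return result;
    }
}
```

You would call it with something like ContentControlText.fromDocx(new FileInputStream("form.docx")), where form.docx is a placeholder file name. Like the POI extractors, it only sees what is saved on disk, so call it again after the user has changed the content control and saved the document.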