From: Karl-Heinz Zimmer (khz@stardivision.de)
Date: Tue Mar 30 1999 - 09:52:09 CEST
On 29.03.1999, 22:18:36, Michele Andreoli wrote:
> You work at StarDivision, so you know a lot about the MS Word
> format.
> In your opinion, is it possible to develop an "awk" script
> that strips the escape codes from a .doc document, converting it to
> plain text format?
Very sorry: it will not be possible to do that.
(This is not my opinion, but I am SURE of it!)
The way they store information in their files is very different from
normal encoding procedures: they use a so-called 'Storage' format
containing several 'Streams', which hold the data in a somewhat random
way. ;-)
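To make the container idea concrete, here is a minimal Python sketch. It assumes that the 'Storage' format meant here is the OLE2 compound file container, and that its header carries the usual 8-byte signature, the sector shift at byte offset 30 and the first directory sector at offset 48. The function name read_storage_header is hypothetical, chosen only for illustration.

```python
import struct

# Assumption: 8-byte signature that opens an OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def read_storage_header(data: bytes):
    """Return a few header fields, or None if data is not a compound file.

    Hypothetical helper: it only inspects the 512-byte header and does
    not follow any sector chain or read any stream.
    """
    if len(data) < 512 or data[:8] != OLE2_MAGIC:
        return None
    # Sector size is stored as a power of two (little-endian uint16).
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    # Sector number where the directory of storages and streams begins.
    (first_dir_sector,) = struct.unpack_from("<I", data, 48)
    return {
        "sector_size": 1 << sector_shift,
        "first_dir_sector": first_dir_sector,
    }
```

Calling read_storage_header(open("letter.doc", "rb").read()) on a hypothetical file would only tell you where the directory starts. The streams themselves are split into sectors that are chained through an allocation table, so the text does not sit in the file as one linear run that a line-oriented script could filter.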
True: as one can see in the file format documentation on their webpage
(only when going there with Internet Explorer), it will never be
possible for a script to extract content correctly from a WW97 doc.
Maybe you can get parts of the info out of a WW95 or WW6 doc, but
even that is not certain...
Sorry for the bad news,
      Karl-Heinz