DOCX2TROFF(1) DOCX2TROFF(1)
NAME
docx2troff, docx2txt, word2troff, word2txt - translate
Microsoft™ Office™ documents
SYNOPSIS
docx2troff [ file.docx ]
docx2txt [ file.docx ]
opc/word2troff
opc/word2txt
DESCRIPTION
Microsoft's newer format for Office documents, introduced
with Office 2007, is a zipped directory hierarchy containing
XML files. This format is known as the ``Open Packaging
Conventions'' or OPC.
Docx2txt is an rc(1) script that uses fs/zipfs(1) and
opc/word2txt to extract the printable text from the body of
a Microsoft Word docx document and write it on the standard
output. Typically this is then piped through fmt(1) to wrap
paragraphs.
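
The extraction described above can be illustrated outside Plan 9. The following is a minimal Python sketch, not the rc(1) script or opc/word2txt itself: it assumes the body text lives in word/document.xml inside the zip archive and that each w:p element is one paragraph. The names docx_text and make_sample are hypothetical helpers introduced only for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(data: bytes) -> str:
    """Return the body text of a docx package, one line per paragraph."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # Each w:t element holds one run of printable text.
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)


def make_sample() -> bytes:
    """Build a tiny in-memory package (a test fixture, not a complete docx)."""
    xml = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>'
        "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world.</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph.</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("word/document.xml", xml)
    return buf.getvalue()


if __name__ == "__main__":
    print(docx_text(make_sample()))
```

The sketch writes one unwrapped line per paragraph, which is why a pass through a formatter such as fmt(1) is useful afterwards.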
Docx2troff is similar, but emits troff source corresponding
to the document. If the document contains tables, additional
commands are emitted for tbl(1). Opc/word2troff does not
attempt to produce an exact facsimile of the source layout,
but rather a reasonable-looking troff version of the
document.
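
To show what emitting commands for tbl(1) might involve, here is a hedged Python sketch that turns one WordprocessingML table element into tbl input. It is an illustration under stated assumptions, not the logic of opc/word2troff: it assumes a simple table with no nested tables or merged cells, left-aligns every column, and picks the allbox option arbitrarily. The function name table_to_tbl and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + NS + "}"


def table_to_tbl(tbl: ET.Element) -> str:
    """Render a simple w:tbl element as tbl(1) input between .TS and .TE."""
    rows = []
    for tr in tbl.iter(W + "tr"):
        cells = []
        for tc in tr.iter(W + "tc"):
            # Join every text run in the cell into one string.
            cells.append("".join(t.text or "" for t in tc.iter(W + "t")))
        rows.append(cells)
    ncols = max((len(r) for r in rows), default=0)
    # Options line ends with ';', format line ends with '.'.
    out = [".TS", "allbox;", " ".join(["l"] * ncols) + "."]
    # Data lines use tab, the default tbl column separator.
    out += ["\t".join(r) for r in rows]
    out.append(".TE")
    return "\n".join(out)


def cell(text: str) -> str:
    """Build one table cell of sample XML (fixture helper)."""
    return f"<w:tc><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:tc>"


SAMPLE = (
    f'<w:tbl xmlns:w="{NS}">'
    f'<w:tr>{cell("Name")}{cell("Size")}</w:tr>'
    f'<w:tr>{cell("a.docx")}{cell("12k")}</w:tr>'
    "</w:tbl>"
)

if __name__ == "__main__":
    print(table_to_tbl(ET.fromstring(SAMPLE)))
```

A real converter would also need to escape cell text that contains tabs or begins with a period, which this sketch does not attempt.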
SOURCE
/sys/src/cmd/opc
SEE ALSO
xlsx2txt(1)
libxml(2)
``2007 Office Document: Open XML Markup Explained'',
http://www.microsoft.com/en-us/download/details.aspx?id=15359
Page 1 Plan 9 (printed 4/16/26)