Microsoft Word 2008 Inappropriate for Education

I received my first batch of student papers this calendar year, which means I am encountering the Microsoft .docx format for the first time. The Microsoft Word 2008 .docx format is incompatible with earlier versions of Word, which shouldn't really matter, since the .docx format is XML. However, opening a .docx file in a text editor reveals this:

screen shot of MS Word 2008 .docx file opened in BBEdit

Calling that XML is like calling hamburger tofu.1 The innards of the .docx format are binary. I defy anyone to copy-and-paste the text content of a .docx file into a new file without using Microsoft Word. XML is supposed to be human-readable to facilitate precisely such interchange. I couldn't even write a Perl script to convert that mess.2

Ideally (and in practice until Microsoft got around to making an XML DTD of their own), XML files are human-readable, like so:

screen shot of Tinderbox 4.2.1 .tbx file opened in BBEdit

That's XML.

I don't use Microsoft software except for document compatibility with publishers, colleagues, and students.

Upgrading to Microsoft Office 2008 would cost me out-of-pocket something in the neighborhood of $20 (US). Not much money. On the other hand, Microsoft has removed Visual Basic from Office 2008 for Mac OS, and this means the custom macros I use (colored text and automated insertion as I explain in the "understanding comments" document I provide my students) would not be available to me. It's not just that there is no incentive for me to upgrade; there is disincentive.

Mac users are up in arms about the crippling of Office 2008 for Mac OS. I'm frankly surprised MacWorld gave Word 2008 higher than two out of five. This seems to be of a piece with Microsoft's failures of late, including the ongoing debacle of Vista, which Randall Stross covers in today's New York Times, thirteen months after Vista was released.

Furthermore, it is imperative that documents created by a publicly-funded institution such as Ohio University (where I presently teach) be accessible to people who use non-proprietary software. All documents produced by public institutions of higher education should be in open formats, which Microsoft's formats have never been. The pressure Microsoft is placing upon users of older versions of Microsoft Word, upon institutions of higher learning, and upon taxpayer-funded government bodies should be the last warning anyone needs before abandoning closed formats.

Here's what I'm doing.

For the first time in the five years I have required students to submit papers electronically, I will disallow Microsoft Word documents. Starting with Spring quarter 2008, I will require submissions to be in RTF only.

1 As a point of comparison, below is a screenshot of a Microsoft Word 2004-compatible document open in a text editor. You will notice that, unlike files produced by Microsoft Word 2008, it contains human-readable text. This human-readable text does come quite a ways into the file, as indicated by the position of the scroll bar. The Microsoft Word 2008 .docx file in the screenshot at the top of this entry contains unreadable binary data throughout.
screen shot of MS Word 2004-compatible .doc file opened in BBEdit
2 I told my students I could handle Microsoft Word documents. Several of my students submitted .docx files and, to my chagrin, I had to ask them to send RTF files in their place. So much for compatibility and accessibility.



Word docx files are zip compressed; there is XML under the compression. Try using a zip utility and then reading the XML. Microsoft also has a converter on its website for use with older versions of Word.

Interestingly enough, Word 2008 docx files are far bigger in size despite the compression.
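
The unzip-then-read-the-XML suggestion above can be sketched in a few lines of Python. This is only a rough illustration, not a full converter: it assumes the body text sits in the archive member word/document.xml and uses the standard WordprocessingML namespace, and the function name docx_to_text is made up for this example. It ignores tabs, manual line breaks, footnotes, and headers, and text inside nested structures such as text boxes may be repeated.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the plain text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    with zipfile.ZipFile(source) as archive:
        # Assumption: the body text lives in this archive member.
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Visible text sits in <w:t> elements inside each paragraph's runs.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling docx_to_text("paper.docx") (a placeholder file name) should return the paragraphs as plain text, which could then be written to an ordinary text file without opening Word.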


To Johnnie:
Would you mind sharing offline some ways you are using TB in your work? I am starting an Ed.D. in educational leadership and social justice and would love to manage my research with it. I am already using NVivo via VMware Fusion as well as Word and EndNote on the Mac side.