OOXML has been accused of being rushed through not even the writing itself but also certification in both ECMA and ISO. It's a quick accusation to make but sometimes it can be really tricky to figure out if a statement is true or false. But you know, sometimes you stumple over something that really shows you that the specification was rushed through not only preliminary editing but also certification in ISO.
The one thing I noticed in was password hashing. As with other document formats, document protection can be defined in multiple ways. There is of course protection of the document itself but most document formats also allow protection of specific parts of the document or even read-only protection of the document. The way it's usually done is to ask the user for a password, hash it and store it in the document. When the document is opened the next time, the user is prompted for a password, and if it matches the stored value - the protection of the document (or parts of it) is released.
Now, this is defined, amongst other places, in section 4.4.1 (Section attributes) where it deals with protection of sections. The text says:
A section can be protected, which means that a user can not edit the section. The text:protected attribute indicates whether or not a section is protected. The user interface must enforce the protection attribute if it is enabled.
This is more or less what I wrote above. It also says:
A user can use the user interface to reset the protection flag, unless the section is further protected by a password. In this case, the user must know the password in order to reset the protection flag. The text:protection-key attribute specifies the password that protects the section. To avoid saving the password directly into the XML file, only a hash value of the password is stored.
And that's it.
WTF? Nothing more? Nothing about how to specify the hashing algorithm? Nothing about how to specify initialization vectors, prepending of zeroes ... nothing?
But wait - what if we look in the schema itself - maybe it's just the descriptive text that is a bit ... ahem ... limited. Ok - the schema says:
<define name="sectionAttr" combine="interleave">
Dammit - nothing here either. Notice also that it is not possible to store the way the hash-value is persisted. Is it a bit-sequence? A Hex'ed bit-sequence? A Base64-sequence? Nothing!
But wait again - let's look into the file of an actual document with read-only protection. Let's see what is stored in the document. Well, the XML-fragment lists as:
Any clever suggestions for me as an ocument consumer to what to do with this value? This is truly amazing. One one hand the authors talk about their document format being able to provide true and pure interoperability ... but they haven''t specified something as common as document protection. I wonder how they can claim this with a straight face. Interoperability is certainly not enabled by limiting the details of the specification to as little as this ... but maybe they just hope noone will use this feature and thereby have "interoperability by rejection".
I cannot help to wonder: who in their right mind would put up a suggestion for standardisation of a document format that was unspecified in such a central feature as "document protection". This must be one of those places where
Ratification trumphs perfection
Yeah, well ...