a 'mooh' point

clearly an IBM drone

DII-workshop in Redmond - round-table discussions

... continued from DII ODF workshop catchup.

The last part of the afternoon in Redmond was a round-table discussion of standards in general; what to do with them and how to work with them in terms of handling interop with other vendors implementing the same standard. It was really interesting and it was clear that Microsoft wanted to hear our input. Everyone in the Microsoft Office "Who's who"-book was there to participate and we had a good couple of hours debating the issues at hand.

One of the really interesting guys I met there was John Head aka "Starfish". He is a Microsoft partner as well as an IBM business partner, and he really grilled Microsoft with respect to some of the decisions they had made around how the UI behaved. You should check out his thoughts on his own blog. It was clear that he had some leverage in relation to Microsoft - even though I did not agree with everything he said. 

An interesting topic was application interop. If you ask me, interop is based on standards but carried out by applications - in other words, standards do not give good interop simply by themselves. This idea was really confirmed when we talked about a thing John also mentions - how do I handle bugs in other applications? I think it was Peter Amstein who noted that an example of this was the 1900-leap year problem, where a decision made 20 years ago still haunts them. I couldn't agree more. But a similar example is application-specific extensions. ODF has this wonderful (read: awful) concept of "configuration item sets". These are specified in section 2.4 of ODF 1.0 and are intended to store various application-specific settings. The problem with these elements is that there are really no restrictions on how to use them. So you will end up with an application like OpenOffice.org 2.4 that puts data like this in the section:

[code=xml]<config:config-item config:name="AddParaTableSpacing" config:type="boolean">true</config:config-item>
<config:config-item config:name="AddParaSpacingToTableCells" config:type="boolean">true</config:config-item>
<config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>
<config:config-item config:name="AddParaTableSpacingAtStart" config:type="boolean">true</config:config-item>
<config:config-item config:name="UseFormerTextWrapping" config:type="boolean">false</config:config-item>
<config:config-item config:name="UseFormerObjectPositioning" config:type="boolean">false</config:config-item>
<config:config-item config:name="UseOldNumbering" config:type="boolean">false</config:config-item>[/code]

 

Lotus Symphony 1 even puts binary blobs into the configuration items to hold application-specific printer settings:

[code=xml]<config:config-item config:name="PrinterSetup" config:type="base64Binary">
  ugD+/wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  AAAAAAAAAAAAAAAAAAWAAAAAAAAAAAAAAAIAAAAAAAAAAAA
</config:config-item>[/code]

So you now have OOo (and also Lotus Symphony, but to a lesser degree) putting in all these settings that not only directly affect the visual layout of the document but - in the case of e.g. "UseFormerLineSpacing" - specify that an application should behave as OOo 1.1. These are really "OOo Compat-elements".
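To make the problem a bit more tangible, here is a minimal Python sketch of what another implementation has to deal with when it reads such a configuration item set. It only lists the items and flags the ones a foreign application cannot interpret at all. The wrapper element's `config:name` value, the shortened `PrinterSetup` placeholder value, and the helper names `list_config_items` and `opaque_items` are my own assumptions for illustration; only the item names and types are taken from the examples above.

```python
import xml.etree.ElementTree as ET

# Namespace of the ODF "config" vocabulary.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"

# Trimmed, illustrative fragment. The set name and the base64 value are placeholders.
SAMPLE = """<config:config-item-set
    xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
    config:name="example-settings">
  <config:config-item config:name="AddParaTableSpacing" config:type="boolean">true</config:config-item>
  <config:config-item config:name="UseFormerLineSpacing" config:type="boolean">false</config:config-item>
  <config:config-item config:name="PrinterSetup" config:type="base64Binary">ugD+/wAAAAA=</config:config-item>
</config:config-item-set>"""


def list_config_items(xml_text):
    """Return (name, type, value) for every config:config-item in the fragment."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter(f"{{{CONFIG_NS}}}config-item"):
        name = item.get(f"{{{CONFIG_NS}}}name")
        type_ = item.get(f"{{{CONFIG_NS}}}type")
        value = (item.text or "").strip()
        items.append((name, type_, value))
    return items


def opaque_items(items):
    """Names of items whose value is a binary blob only the producer understands."""
    return [name for name, type_, _ in items if type_ == "base64Binary"]


if __name__ == "__main__":
    found = list_config_items(SAMPLE)
    for name, type_, value in found:
        print(f"{name} ({type_}) = {value}")
    print("opaque:", opaque_items(found))
```

Note that reading the items is the easy part. Nothing in the markup tells a consumer what "UseFormerLineSpacing" is supposed to do to the layout, and that is exactly the issue.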

Now, the question is, what should other vendors do with these "extensions"? Well, Microsoft seems to be under a lot of pressure from organisations like the European Union to implement ODF strictly by the book, so they have chosen to ignore them (and other knowledge of bugs) completely. If you look at the settings.xml-file, they actually strip it of all content and do not use it themselves. Another example is mathematical content in text documents. As I documented some time ago, OOo has a bug requiring the MathML-fragment to include a !DOCTYPE-declaration - otherwise OOo will not display the math content. The result is that ODF with math generated by Microsoft Office will not load the math in OOo due to this OOo-bug. Is the approach chosen by Microsoft the right one? I think so, for the following reasons:

  1. Otherwise the result will be an endless propagation of these settings where each implementation will need to support each and every setting from all other vendors
  2. I agree with John Head that it is good to put some pressure on OOo. It has for a long time been living relatively "low-key" in terms of criticism and market pressure, and it will be good for all of us to have the quality of the application improved.

 

Will this hurt interop? Yes, of course it will ... but I still think it is the right decision.

Another interesting thing we discussed was extensibility - how applications should/could extend a standard. This was one of the topics where it seemed that I was disagreeing with almost the entire room. What we talked about was: What does an application do with content it does not understand? Both ODF and OOXML have mechanisms to extend the document format with foreign namespaces etc, and I got the impression that most implementations simply remove content they do not understand when roundtripping documents. Microsoft has chosen the same approach and the argument they made was that it was imposed on them by their "Thrustworthy computing"-guys since preserving non-understood data could be used to hide sensitive information in documents. Even though I see the problem, I still think the argument is wrong. There are tons of other places and ways to hide information in a document and I'd prefer to have the unknown elements and attributes preserved when roundtripping.
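The two roundtripping policies can be sketched in a few lines of Python. This is not how any of the applications mentioned actually do it; it is a toy illustration under my own assumptions. The namespaces `urn:example:known` and `urn:example:vendor-extension`, the sample document, and the function names are all hypothetical, and the sketch only looks at elements, not attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the only namespace this toy "application" understands.
KNOWN_NS = {"urn:example:known"}

# Hypothetical document with one element from a foreign (vendor) namespace.
SAMPLE = """<doc xmlns="urn:example:known" xmlns:ext="urn:example:vendor-extension">
  <p>Visible text</p>
  <ext:note>vendor-specific data</ext:note>
</doc>"""


def namespace_of(tag):
    """Extract the namespace URI from an ElementTree '{uri}local' tag."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def strip_foreign(element):
    """Recursively remove child elements whose namespace is not in KNOWN_NS."""
    for child in list(element):
        if namespace_of(child.tag) not in KNOWN_NS:
            element.remove(child)
        else:
            strip_foreign(child)


def roundtrip(xml_text, preserve_unknown):
    """Load and re-save a document, either keeping or dropping foreign elements."""
    root = ET.fromstring(xml_text)
    if not preserve_unknown:
        strip_foreign(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(roundtrip(SAMPLE, preserve_unknown=True))
    print(roundtrip(SAMPLE, preserve_unknown=False))
```

With `preserve_unknown=False` the vendor data silently disappears on save, which is the behaviour I was arguing against. With `preserve_unknown=True` it survives even though the application never interpreted it.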

Comments (6) -

Bart Hanssens

Very interesting post. Some remarks:

- The concept itself of application specific configuration items is a good thing IMHO, although unfortunately it can be (and is) used beyond its intended purpose...

- Given a healthy competition, non-propagation of bugs will increase interoperability in the long run, since vendors will be pushed to solve these issues

- You're right on the foreign elements: you can hide sensitive data in hidden paragraphs or metadata, an extra file in the package or what not. If you worry about those things, you might want to consider using a product like 3BView.
The only issue I can think of is when signing a document, then you might want to remove unknown/unsupported elements and hidden paragraphs (then again, digital signatures aren't part of ODF 1.0/1.1, but will be part of ODF 1.2)

Thanks for the post, a very interesting reading.

I am not sure that it is a good idea to drop support for certain features just because they are not in the published standard, especially when they are part of actual documents available on the web. Consider the efforts of the WHATWG to innovate on HTML: although the elements are not yet mandated and some of the features originated in Safari, they are now being adopted as useful features across various browsers.

Doug Mahugh

Thanks for being there, Jesper, your straightforward feedback was much appreciated.

Miguel, you make a good point about how HTML extensions have evolved.  I think most people would agree that some good things have come out of extensions that have originated in specific browsers, and there have also been some messy interoperability issues created by that informal process.  So in the case of ODF (and OOXML, for that matter), the question is whether a similarly casual approach is viable; and we're all still debating that question and its many implications.  I think the fact these formats are editable raises the bar a bit on how formal the process needs to be (i.e., more things can go wrong because the documents are dynamic), but that's just one guy's opinion.

By the way, Jesper, it's "trustworthy" computing and not "thrustworthy" ... the mind reels. :)

Jesper - it was great to meet you as well. I felt like the odd man out at the event, being one of the few people there who did not care about the specs and file formats as much as how the apps work and what the user deals with. But I think having voices from all of the perspectives was good. I am even glad you disagreed with me on things ... everyone agreeing on everything is just so boring!

Bart,

I am not saying that application-specific behavior is undesirable or should be banned. Application-specific behavior is quite common, but it is a tool that should be used with caution. Remember the outcry on the subject of the "compat-elements" of OOXML? The difference between "AutoSpaceLikeWord95" and "UseFormerObjectPositioning" is quite subtle - even though OOo is an OSS-application and the code is available. Problem is, what do you do with an application that uses these elements "wrongly" but is not OSS - like Lotus Symphony (where they are actually used) or Microsoft Office?

About digital signatures - I don't think anything should be removed before applying a digital signature. The digital signature signs the existing parts of the package and that is really the point of a digital signature.

Miguel,

Again, it is a tool to be used with caution. If you look at the "acknowledgment"-section of HTML 5 at www.w3.org/.../acknowledgements.html, you will actually see Microsoft and Safari being thanked for implementing new stuff on top of HTML 4. So the discussion is not black/white even though I am sure none of us would like to relive the bad times in the 90's during the browser wars.

Doug,

Crap ... stupid English language. It's just ... ahem ... the speed of Microsoft development is so fast that it is almost "thrust" through the regular pace of software development.

(or not)

;)

John,

I think we all agree that the UI is as important as the underlying format. As I told you at the meeting, I agree with your assessment of the UI-problem with the "something-might-disappear-but-I-won't-tell-you-what"-dialog - I just didn't agree with your proposed solution. I talked a bit with Jean Paoli about it, and my take on a solution was more a "please show it less often"-approach where it was not shown if the visual content could be (almost completely) 1-1 mapped between the formats.

I think it would be best if the user was confronted with a dialog showing that information from the document could potentially be lost, and if, the first time it appears, he could set his defaults with some checkboxes.
Something like:
* Always warn if any information is lost on saving to OpenDocument
* Warn when human readable content is lost
* Warn when custom data is lost
* Warn when application-specific settings are lost
* Warn when MS Office only functionality is lost
However, there might be too many situations to cover, as I can imagine a lot of scenarios where the behaviour of the document could potentially be different after saving in another format.
