a 'mooh' point

clearly an IBM drone

Embrace and extend - SVG in ODF revisited

One of the attack vectors on OOXML has been its lack of reuse of existing standards. Specifically, this comes up in the discussions of DrawingML vs. SVG and OMML vs. MathML, both of which are interesting subjects. The question has been why Microsoft chose not to reuse SVG and created DrawingML instead - and likewise why it created OMML instead of reusing MathML.

Now, some of the arguments for reusing existing standards are:

  • Reuse of other people's code
    As a programmer, I love this - there is nothing more satisfying than being able to reuse something that others have made an effort to produce
  • Increase quality
    If something is an existing standard, someone else has probably reviewed it and the worst bugs have likely been removed.
  • Brain cycle reuse
    If you reuse work that is already defined, you will probably be able to find someone in your organization who has skills in this area - and you avoid the cost of re-educating them to use a new tool.

ODF has tried to reuse as many standards as possible, so mathematical content is done using MathML and vector graphics are supposedly done using SVG. Microsoft has chosen a different path and created new formats of its own, so mathematical content is done using OMML (Office Math Markup Language) and vector graphics are done using DrawingML.

A couple of weeks ago I heard some rumours that ODF had not simply used SVG as its vector graphics format but had also extended it beyond the standardized format. My initial response was that this had to be wrong. One of the cornerstones of ODF is that it reuses existing standards and that there is a "clean cut" between ODF and the standard it utilizes. This way I would be able to buy/acquire some library that supports SVG and simply incorporate it in my product implementing ODF. But if the referenced standard is extended, I will either get less functionality, because the extensions are not part of the standard, or I could see code crash when I pass the extended format to the external library - at least if it performs e.g. DTD/schema validation and finds invalid elements in the input.

So what did I do?

Basically I started by doing a text search in the ODF spec for occurrences of "[SVG]". One of the first things that caught my attention was the entry in section 1.3 Namespaces, Table 2, where it says:

Prefix: svg
Description: For elements and attributes that are compatible to elements or attributes defined in [SVG].
Namespace: urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0


The term "compatible to elements or attributes" seems quite odd to me, since it should not be necessary to say this if the referenced standard is not extended. I did another quick search and stumbled over these sections of the specification:

  • 14.14.2 SVG Gradients
  • 15.13.13 Line Join

Let me quickly walk through the contents of each section.

14.14.2 SVG Gradients

Section 14.14.2 says, among other things:

In addition to the gradients specified in section 14.14.1, gradient may be defined by the SVG gradient elements <linearGradient> and <radialGradient> as specified in §13.2 of [SVG].

Cool!

Now, the section goes on as

The following rules apply to SVG gradients if they are used in documents in OpenDocument format:

  • The gradients must get a name. It is specified by the draw:name attribute.
  • For <linearGradient>, only the attributes gradientTransform, x1, y1, x2, y2 and spreadMethod will be evaluated.
  • For <radialGradient>, only the attributes gradientTransform, cx, cy, r, fx, fy and spreadMethod will be evaluated.
  • The gradient will be calculated like having a gradientUnits of objectBoundingBox, regardless what the actual value of the attribute is.
  • The only child element that is evaluated is <stop>.
  • For <stop>, only the attributes offset, stop-color and stop-opacity will be evaluated.

So, to determine whether ODF is only referencing SVG, we need to look at section 13.2 of the SVG spec. For <linearGradient> it says:

<!ELEMENT %SVG.linearGradient.qname; %SVG.linearGradient.content; >
<!-- end of SVG.linearGradient.element -->]]>
<!ENTITY % SVG.linearGradient.attlist "INCLUDE" >
<![%SVG.linearGradient.attlist;[
<!ATTLIST %SVG.linearGradient.qname;
    %SVG.Core.attrib;
    %SVG.Style.attrib;
    %SVG.Color.attrib;
    %SVG.Gradient.attrib;
    %SVG.XLink.attrib;
    %SVG.External.attrib;
    x1 %Coordinate.datatype; #IMPLIED
    y1 %Coordinate.datatype; #IMPLIED
    x2 %Coordinate.datatype; #IMPLIED
    y2 %Coordinate.datatype; #IMPLIED
    gradientUnits ( userSpaceOnUse | objectBoundingBox ) #IMPLIED
    gradientTransform %TransformList.datatype; #IMPLIED
    spreadMethod ( pad | reflect | repeat ) #IMPLIED  
>

So it seems that at least the attribute gradientUnits is not used in the ODF-adapted version of SVG.

If we look at <radialGradient>, we need to cross-reference with the corresponding DTD in SVG. It says:

<!ENTITY % SVG.radialGradient.extra.content "" >
<!ENTITY % SVG.radialGradient.element "INCLUDE" >
<![%SVG.radialGradient.element;[
<!ENTITY % SVG.radialGradient.content
    "(( %SVG.Description.class; )*, ( %SVG.stop.qname; | %SVG.animate.qname;
    | %SVG.set.qname; | %SVG.animateTransform.qname;
    %SVG.radialGradient.extra.content; )*)"
>
<!ELEMENT %SVG.radialGradient.qname; %SVG.radialGradient.content; >
<!-- end of SVG.radialGradient.element -->]]>
<!ENTITY % SVG.radialGradient.attlist "INCLUDE" >
<![%SVG.radialGradient.attlist;[
<!ATTLIST %SVG.radialGradient.qname;
    %SVG.Core.attrib;
    %SVG.Style.attrib;
    %SVG.Color.attrib;
    %SVG.Gradient.attrib;
    %SVG.XLink.attrib;
    %SVG.External.attrib;
    cx %Coordinate.datatype; #IMPLIED
    cy %Coordinate.datatype; #IMPLIED
    r %Length.datatype; #IMPLIED
    fx %Coordinate.datatype; #IMPLIED
    fy %Coordinate.datatype; #IMPLIED
    gradientUnits ( userSpaceOnUse | objectBoundingBox ) #IMPLIED
    gradientTransform %TransformList.datatype; #IMPLIED
    spreadMethod ( pad | reflect | repeat ) #IMPLIED
>

So here the attribute gradientUnits is not used either.
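The comparison can be made mechanical. What follows is only a small sketch: the two dictionaries are transcribed by hand from the DTD fragments and the ODF rules quoted above, the shared attribute groups (%SVG.Core.attrib; and friends) are left out, and the helper name ignored_by_odf is my own invention.

```python
# Attributes listed explicitly in the SVG DTD fragments quoted above.
# The shared attribute groups (Core, Style, Color, ...) are left out.
SVG_DTD_ATTRS = {
    "linearGradient": {"x1", "y1", "x2", "y2",
                       "gradientUnits", "gradientTransform", "spreadMethod"},
    "radialGradient": {"cx", "cy", "r", "fx", "fy",
                       "gradientUnits", "gradientTransform", "spreadMethod"},
}

# Attributes that ODF section 14.14.2 says will be evaluated.
ODF_EVALUATED_ATTRS = {
    "linearGradient": {"gradientTransform", "x1", "y1", "x2", "y2",
                       "spreadMethod"},
    "radialGradient": {"gradientTransform", "cx", "cy", "r", "fx", "fy",
                       "spreadMethod"},
}


def ignored_by_odf(element):
    """Return the DTD attributes of `element` that ODF does not evaluate."""
    return sorted(SVG_DTD_ATTRS[element] - ODF_EVALUATED_ATTRS[element])


if __name__ == "__main__":
    for name in SVG_DTD_ATTRS:
        print(name, ignored_by_odf(name))
```

For both elements the only explicitly listed attribute that drops out is gradientUnits.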

But luckily the good guys at the ODF TC have solved this mystery for us - they have decided that the gradient is calculated as if the (non-existing) attribute gradientUnits had the value "objectBoundingBox", regardless of the value actually passed. It's a bit odd, but I suppose it has something to do with the way the SVG-fragments position themselves relative to the other objects in the document.
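To make the consequence concrete: under objectBoundingBox, the gradient coordinates are read as fractions of the bounding box of the shape being filled, not as absolute positions. The sketch below shows that mapping for a linear gradient. It ignores gradientTransform, and the BBox type and the function name are hypothetical helpers of mine, not something from either spec.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box of the shape being filled (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def resolve_linear_gradient(bbox, x1=0.0, y1=0.0, x2=1.0, y2=0.0):
    """Map fractional gradient coordinates onto the bounding box.

    The defaults mirror SVG's defaults for a linear gradient
    (0%, 0%, 100%, 0%). gradientTransform is deliberately ignored here.
    """
    start = (bbox.x + x1 * bbox.width, bbox.y + y1 * bbox.height)
    end = (bbox.x + x2 * bbox.width, bbox.y + y2 * bbox.height)
    return start, end


if __name__ == "__main__":
    box = BBox(x=2.0, y=3.0, width=10.0, height=4.0)
    print(resolve_linear_gradient(box))
    print(resolve_linear_gradient(box, x2=1.0, y2=1.0))
```

An SVG file authored with gradientUnits="userSpaceOnUse" expects its coordinates to be taken as absolute values, so a consumer that always applies the mapping above will place the gradient differently than the author intended.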

15.13.13 Line Join

Section 15.13.13 reads:

The attribute draw:stroke-linejoin specifies the shape at the corners of paths or other vector shapes, when they are stroked. The values are the same as for [SVG]'s strokelinejoin attribute, except that the attribute in addition to the values supported by SVG may have the value middle, which means that the mean value between the joints is used.

They have even been so kind to provide us with a schema fragment defining the possible usage of this feature in ODF:

<define name="style-graphic-properties-attlist" combine="interleave">
    <optional>
        <attribute name="draw:stroke-linejoin">
            <choice>
                <value>miter</value>
                <value>round</value>
                <value>bevel</value>
                <value>middle</value>
                <value>none</value>
                <value>inherit</value>
            </choice>
        </attribute>
    </optional>
</define>

Compare this with the DTD of SVG (Appendix A.1.7 Paint Attribute Model):

<!ENTITY % SVG.stroke-linejoin.attrib
    "stroke-linejoin ( miter | round | bevel | inherit ) #IMPLIED"
>

So the attribute value "middle" is indeed an addition to SVG.
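Here is a minimal sketch of what this means for someone handing ODF data to a stock SVG library. The two value sets are copied from the schema fragment and the DTD quoted above; the function names and the choice of "miter" as fallback (SVG's initial value for stroke-linejoin) are my own assumptions, not anything the ODF spec prescribes.

```python
# Values allowed by the ODF schema fragment quoted above.
ODF_LINEJOIN = {"miter", "round", "bevel", "middle", "none", "inherit"}

# Values allowed by the SVG DTD quoted above.
SVG_LINEJOIN = {"miter", "round", "bevel", "inherit"}


def odf_only_values():
    """Values the ODF schema accepts but the SVG DTD does not."""
    return sorted(ODF_LINEJOIN - SVG_LINEJOIN)


def to_svg_linejoin(odf_value, fallback="miter"):
    """Translate draw:stroke-linejoin into something an SVG library accepts.

    Values unknown to SVG are replaced by `fallback`, which is an arbitrary
    choice made for this sketch and loses the original intent.
    """
    if odf_value not in ODF_LINEJOIN:
        raise ValueError("not a valid draw:stroke-linejoin value: %r" % odf_value)
    return odf_value if odf_value in SVG_LINEJOIN else fallback


if __name__ == "__main__":
    print(odf_only_values())
    print(to_svg_linejoin("round"), to_svg_linejoin("middle"))
```

Going by the fragments as quoted, "none" falls outside the SVG value list as well, so a converter needs a fallback for more than just "middle".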

Conclusion 

You might be wondering whether a couple of additions to and exclusions from SVG are really worth an entire article, and you kind of have a point. However, the devil lies in the details.

The modifications to SVG (even if they are minor) are bad enough as they are, because they basically kill high-fidelity interoperability when using existing SVG-libraries. When you limit the usage of some component (ignoring the value of gradientUnits), you basically lose control over how existing data behaves. And when you enlarge a standard (adding the value middle to the stroke-linejoin attribute), you lose control over how your own data behaves when it is used in other scenarios. You know, this is exactly what Microsoft did when they enlarged not only CSS but JavaScript. Maybe the memory of the ODF-founders is not that great, but I certainly remember the loads of crap-work we had to do in the late nineties when creating web-pages for "IE5-compatible" browsers and "the rest". In fact - this nightmare still haunts us with the Microsoft additions to JavaScript. Maybe they just thought: "If Microsoft pulled it off, so can we". I think that's a bad choice.

Also, you should note that ODF does not use SVG "as such" at all. It uses fragments of SVG, i.e. elements with the same names and attributes, and fits them into the overall architecture of ODF. This is hardly "just referencing". As the paragraph quoted above (stroke-linejoin) says, the elements specifying this are not SVG-elements. They are similar to SVG-elements and even extended beyond them. I actually find it really hard to understand how the ODF TC can claim - with a straight face - that ODF only references SVG. I suppose that if I made my own JLSMarkup for document formats and used an element called <body>, I would also be able to claim that I was reusing W3C XHTML 1.0. I just don't find it the right thing to do.

My only surprise is that this has not surfaced until now. How anyone (pro-ODF or pro-choice) can sit down and read the ODF spec and not be at least a little confused by the claim of "just referencing existing standards" is a bit mind-boggling to me. I suppose ECMA could do the same with OOXML and claim "reuse of the HTML DOM in the OOXML-architecture", since a WordprocessingML-document contains both a <body>-element and a <p>-element.

Post scriptum

On his blog, in his last comment on the thread "Why all the secrecy?", Brian Jones speculated about whether you could take an existing SVG-drawing, put it into an ODF-document and expect it to work. Well, just like OOXML, ODF has no limitations on what kind of data you might want to put into it, so usage of SVG in an ODF-document is indeed possible from a technical/architectural point of view. It is not a format question but an implementation-specific question. However - will it work?

ODF has several ways to embed data into the document. The two relevant means are inclusion of an SVG-drawing as an image and inclusion of an SVG-image as an object. ODF supports two ways to embed an object, as stipulated in section 9.3.3:

A document in OpenDocument format can contain two types of objects, as follows:

  1. Objects that have an OpenDocument representation. These objects are:
    1. Formulas (represented as [MathML])
    2. Charts
    3. Spreadsheets
    4. Text documents
    5. Drawings
    6. Presentations
  2. Objects that do not have an XML representation. These objects only have a binary representation. An example of this kind of object is OLE objects (see [OLE]).

 

Well, SVG is clearly XML but it is not an "OpenDocument representation" - but then again, neither is MathML, so I'll opt for using these two methods when trying to embed an SVG-drawing into an ODT-document:

  • Insert the SVG-drawing as an image
  • Insert the SVG-drawing as an XML part using the <draw:object>-element as specified in section 9.3.3 of the ODF spec.

I'll use the latest and greatest release of OOo, OpenOffice 2.3.1 DA, to try to display the files. You can see the SVG-file here: ex.svg (482,00 bytes)

Insert SVG as an image

I have created a small ODT-document and added the SVG-file to it. I have added an SVG-image to content.xml as a regular image and put the SVG-file in a folder by itself. The XML-file content.xml is displayed here below.

<?xml version="1.0" encoding="UTF-8" ?>
<office:document-content
 xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
 xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
>
 <office:body>
  <office:text>
   <text:p>Test of insertion of SVG-image in ODT-document</text:p>
   <text:p>
    <draw:frame
     draw:style-name="fr1"
     draw:name="grafik1"
     text:anchor-type="paragraph"
     svg:width="17cm"
     svg:height="13cm"
     draw:z-index="0">
     <draw:image
      xlink:href="SVG/ex.svg"
      xlink:type="simple"
      xlink:show="embed"
      xlink:actuate="onLoad" />
    </draw:frame>
   </text:p>
  </office:text>
 </office:body>
</office:document-content>

As you can see, the SVG-image is simply added as a regular image using the ODF-modified version of SVG. The ODT-file is available here: test svg image.odt (1,48 kb). Anyone want to take a guess at what the result of opening this file will be?

 

 

Insert SVG as an "XML-object"

As noted above, ODF allows insertion of objects with an "XML-representation" as just a text file. The construction of the ODF-package is a bit more complicated, and I'd be happy if anyone could tell me whether I made a mistake - and what the correct way would be. As the basis for my file I have used an ODT-file with an embedded MathML formula, and so again I'll just show the contents of the content.xml-file here below.

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
  office:version="1.0">
  <office:body>
    <office:text>
      <text:p>Test of insertion of SVG in OOo</text:p>
      <text:p>
        <draw:frame
          draw:name="My SVG drawing [JLS]"
          text:anchor-type="as-char"
          svg:width="1.011cm"
          svg:height="0.467cm"
          draw:z-index="0"
        >
          <draw:object
            xlink:href="./SVG"
            xlink:type="simple"
            xlink:show="embed"
            xlink:actuate="onLoad"
           />
        </draw:frame>
      </text:p>
    </office:text>
  </office:body>
</office:document-content>

Again an xlink reference to the SVG-file is "simply" added to content.xml. The ODT-file is available here: test insert svg.odt (1,48 kb). Anyone want to take a guess at what the result of opening this file will be?

 

 

 

So it seems to recognize the SVG filetype - it just doesn't understand how to process it.
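Both content.xml files above point at the SVG through an xlink:href, once on <draw:image> and once on <draw:object>. If you want to check what a given content.xml actually links to, a small sketch with Python's standard library could look like this. The function name linked_parts is mine, and SAMPLE is a trimmed-down stand-in that combines the two frames from my test files.

```python
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Trimmed-down stand-in for the two content.xml files shown above.
SAMPLE = """
<office:document-content
 xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
 xmlns:xlink="http://www.w3.org/1999/xlink">
 <office:body>
  <office:text>
   <text:p>
    <draw:frame draw:name="grafik1">
     <draw:image xlink:href="SVG/ex.svg" xlink:type="simple" />
    </draw:frame>
    <draw:frame draw:name="My SVG drawing [JLS]">
     <draw:object xlink:href="./SVG" xlink:type="simple" />
    </draw:frame>
   </text:p>
  </office:text>
 </office:body>
</office:document-content>
"""


def linked_parts(content_xml):
    """List (element, href) pairs for draw:image and draw:object elements."""
    root = ET.fromstring(content_xml)
    href_attr = "{%s}href" % XLINK_NS
    found = []
    for tag in ("image", "object"):
        for element in root.iter("{%s}%s" % (DRAW_NS, tag)):
            found.append((tag, element.get(href_attr)))
    return found


if __name__ == "__main__":
    for tag, href in linked_parts(SAMPLE):
        print(tag, href)
```

This only tells you what the document refers to; whether the application knows what to do with the referenced part is a separate matter.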

I have a feeling that I might have made an error in the manifest, so I'll include it here and hopefully someone can point out the error if there is one:

<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/"/>
 <manifest:file-entry manifest:media-type="image/svg+xml" manifest:full-path="SVG/ex.svg"/>
 <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.image" manifest:full-path="SVG/"/>
</manifest:manifest>
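For anyone who wants to reproduce the package by hand, here is a rough sketch of how the zip file can be put together with Python's zipfile module. It assumes, as I read the packaging chapter of the ODF spec, that the "mimetype" entry should come first and be stored uncompressed. The function name build_odt and the placeholder part contents are made up for the example.

```python
import io
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_odt(parts):
    """Return the bytes of a zip package built from a {path: bytes} mapping.

    The 'mimetype' entry is written first and stored without compression;
    all other parts are deflated.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", ODT_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        for path, data in parts.items():
            package.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder contents; the real files are the ones shown in this article.
    odt = build_odt({
        "content.xml": b"<placeholder/>",
        "META-INF/manifest.xml": b"<placeholder/>",
        "SVG/ex.svg": b"<placeholder/>",
    })
    with zipfile.ZipFile(io.BytesIO(odt)) as package:
        print(package.namelist())
```

The sketch does not check that every part is listed in manifest.xml, which is exactly the part I am unsure about in my own files.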

OOo and SVG

I have said before that the devil lies in the details - but here it actually lies right up-front. You see - OpenOffice.org does not presently (version 2.3.1) support SVG. It doesn't support SVG as regular images and it does not support SVG as a source of vector graphics or "line art". You can import SVG-images with OOo, but they are converted to OpenDocument Draw, and OpenDocument Draw data can be exported to SVG. The import/export is not done by OOo itself but by a filter that converts the SVG into the internal ODF Draw format. SVG support is apparently the single most requested feature in OOo, so maybe it will soon be a part of OOo. Also take a look at the "General note" on the "Unsupported SVG features"-page of the filter:

SVG and what's named SVG-compatible in OpenDocument is really different. Therefore, the import filter can only approximate the SVG contents.

Ooh - and incidentally - the way ODF and OOo handle SVG is exactly the same way OOXML and Microsoft Office 2007 handle MathML.


Where did my line go?

When we started doing our tests in the lab and started thinking about what we expected to see, we had a very clear understanding that it would not all be blue-sky conversions and that we would identify problems - some more severe than others. We were also pretty aware that there would be areas where conversion was just not possible.

But - I am pretty sure I speak for the rest of the group - we were quite surprised to see which areas this concerned.

One area where absolutely nothing could be converted was ... lines. Not only line art, not only complex line drawings ... but simply - lines.

Lines are done in OOXML as either VML or DrawingML, and in ODF they are done using an SVG-derivative. The puzzling thing is that this area is apparently simply left out of all of the converters. We made some simple documents (line.docx 10,47 kb) and (line.odt 6,60 kb) [I have re-made these for this article]. When these files are converted using CleverAge 1.0 on Microsoft Office 2003 and 2007, Novell OOXML Translator (on Windows and SLED) or IBM Lotus Notes 8 (on SLED), the lines are simply removed. They are not altered, they are not just hidden, they are not moved to a different location in the document ... they are just removed.

This is another example of the overall observation from our tests ... the quality of the converters is simply not good enough today. If you look at the XML in either of the files above, you will see that even though they look different, they basically specify the same thing (start and end point of the line drawn), so technically it should pose no problem to do a better conversion.
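To illustrate why I think this is a converter problem and not a format problem, here is a rough sketch of the arithmetic involved. It rests on assumptions of mine rather than on the two files: that the ODF side stores the line as start and end coordinates in centimetres (x1, y1, x2, y2), and that the target wants an offset, an extent and two flip flags measured in EMU (360000 per centimetre), which is how I understand DrawingML to position shapes. All names in the code are made up for the example.

```python
EMU_PER_CM = 360000  # 914400 EMU per inch divided by 2.54 cm per inch


def cm_to_emu(length):
    """Convert a length such as '2.5cm' to EMU. Only 'cm' is handled here."""
    if not length.endswith("cm"):
        raise ValueError("this sketch only understands centimetres: %r" % length)
    return round(float(length[:-2]) * EMU_PER_CM)


def line_geometry(x1, y1, x2, y2):
    """Turn a start/end point pair into offset, extent and flip flags."""
    ax, ay, bx, by = (cm_to_emu(value) for value in (x1, y1, x2, y2))
    return {
        "off": (min(ax, bx), min(ay, by)),      # top-left corner of the box
        "ext": (abs(bx - ax), abs(by - ay)),    # width and height of the box
        "flip_h": bx < ax,                      # line runs right to left
        "flip_v": by < ay,                      # line runs bottom to top
    }


if __name__ == "__main__":
    print(line_geometry("2cm", "1cm", "7cm", "1cm"))
```

A real converter has far more to deal with (anchoring, styles, units other than centimetres), but the core geometry is a handful of subtractions.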

It is often said that the main problem with converting from ODF to OOXML (and vice versa) is incompatibilities between the formats. At first glance this example supports that argument, but if you dig a bit deeper into the technical details, it simply boils down to a problem with bad converters.

Conclusion: The world is seldom black/white ... even if people are trying to convince you so. More often, the world is grey and depressing as a rainy day.