Diff for Design for fixing chunking bugs

Thu, 2008-07-10 02:56 by xiaojun.feng | Thu, 2008-07-10 09:34 by xiaojun.feng
Changes to Body
Line 6 | Line 6
 
</p>
 
</p>
 
<p>
 
<p>
-
&lt;topicref&gt; can be nested, and each nested &lt;topicref&gt; can have its own @chunk, in this situation the nested &lt;topicref&gt; are chunked into its own file, not conflict with the <span style="font-size: x-small; font-family: Trebuchet MS">ancestors, </span><span style="font-size: x-small; font-family: Trebuchet MS"> this brings some difficulties in implementing this feature, fortunately  the same idea of nested &lt;topicref&gt; can be used to process the problem of duplicated topicIDs, that is recursion. </span>
+
&lt;topicref&gt; can be nested, and each nested &lt;topicref&gt; can have its own @chunk, in this situation the nested &lt;topicref&gt; are chunked into its own file, not conflict with the <span style="font-size: x-small; font-family: Trebuchet MS">ancestors, </span><span style="font-size: x-small; font-family: Trebuchet MS"> this brings some difficulties in implementing this feature, fortunately the same idea of nested &lt;topicref&gt; can be used to process the problem of duplicated topicIDs, that is recursion. </span>
 
</p>
 
</p>
 
<p>
 
<p>
Line 23 | Line 23
 
<h3>What user need will be met by this feature?</h3>
 
<h3>What user need will be met by this feature?</h3>
 
<p>
 
<p>
-
 Chunking to content won't reprocess maps and duplicates content.
+
Chunking to content won't reprocess maps and duplicates content.
 
</p>
 
</p>
 
<h3>What is the technical design for the change?</h3>
 
<h3>What is the technical design for the change?</h3>
 
<p>
 
<p>
-
Chunking module is processed after mapref, so before chunking there is already one complete map containg all the map files refrenced by the input ditamap,  there is no need to read all the map files. If doing this, it will duplicate content. There is an exception to this change, if the transtype is eclipsehelp and the dtd of ditmap if eclipsemap, mapref won't pull referenced map content into the input ditamap file; so it is necessary to read all the map files. 
+
Chunking module is processed after mapref, so before chunking there is already one complete map containg all the map files refrenced by the input ditamap, there is no need to read all the map files. If doing this, it will duplicate content. There is an exception to this change, if the transtype is eclipsehelp and the dtd of ditmap if eclipsemap, mapref won't pull referenced map content into the input ditamap file; so it is necessary to read all the map files.
 
</p>
 
</p>
 
The condition for the exception is like this
 
The condition for the exception is like this
 
<pre>
 
<pre>
-
element.getAttribute(Constants.ATTRIBUTE_NAME_CLASS).contains(&quot; eclipsemap/plugin &quot;) &amp;&amp; transtype.equals(Constants.INDEX_TYPE_ECLIPSEHELP) 
+
element.getAttribute(Constants.ATTRIBUTE_NAME_CLASS).contains(&quot; eclipsemap/plugin &quot;) &amp;&amp; transtype.equals(Constants.INDEX_TYPE_ECLIPSEHELP)
 
</pre>
 
</pre>
 
<h3>What sections of the toolkit will be impacted by the change?</h3>
 
<h3>What sections of the toolkit will be impacted by the change?</h3>
 
<p>
 
<p>
-
ChunkModule.java and Constants.java will be impacetd by this change. 
+
ChunkModule.java and Constants.java will be impacetd by this change.
  +
</p>
  +
<h3>See also</h3>
  +
<p>
  +
none
  +
</p>
  +
<h2><em><em>Design for bug #</em></em> 2008317</h2>
  +
<h3>what user need will be met by this fixing?</h3>
  +
<p>
  +
When a map references each individual topic in a ditafile, changed dita-ot will work more efficiently
  +
</p>
  +
<h3>What is the technical design for the change?</h3>
  +
<p>
  +
 There is a map like this
  +
</p>
  +
<p>
  +
&lt;map&gt;<br />
  +
&lt;topicref chunk=&quot;to-content select-topic&quot; copy-to=&quot;A.dita&quot;<br />
  +
href=&quot;test.dita#A&quot;&gt;<br />
  +
&lt;/topicref&gt;<br />
  +
&lt;topicref chunk=&quot;to-content select-topic&quot; copy-to=&quot;B.dita&quot;<br />
  +
href=&quot;test.dita#B&quot;&gt;&lt;/topicref&gt;<br />
  +
&lt;/map&gt;<br />
  +
test.dita is as follows
  +
</p>
  +
<p>
  +
&lt;dita&gt;<br />
  +
    &lt;topic id=&quot;A&quot;&gt;<br />
  +
        &lt;title&gt;a&lt;/title&gt;<br />
  +
        &lt;body&gt;<br />
  +
            &lt;p&gt;abc&lt;/p&gt;<br />
  +
        &lt;/body&gt;<br />
  +
    &lt;/topic&gt;<br />
  +
    &lt;topic id=&quot;B&quot;&gt;<br />
  +
        &lt;title&gt;b&lt;/title&gt;<br />
  +
        &lt;body&gt;<br />
  +
            &lt;p&gt;abc&lt;/p&gt;<br />
  +
        &lt;/body&gt;<br />
  +
    &lt;/topic&gt;<br />
  +
&lt;/dita&gt;
  +
</p>
  +
<p>
  +
GenList module will copy the entire test.dita to A.dita and B.dita respectively. @chunk attribute is handled in Chunk module. If test.dita has many @conref attributes, they will be bottleneck, because @conref is handled many times, while this is not necessary.
  +
</p>
  +
<p>
  +
So when &lt;topicref&gt; both have @chunk and @copy-to, previous behavior can cause time bottleneck. @copy-to can be delayed to handle in the chunk module, because the file specified in @copy-to will be rewrote by chunk module. So it is reasonable to delay it.
  +
</p>
  +
<p>
  +
The keypoint of this technical design is the change of the dita.list file, many modules after genlist are depending on this list file. @copy-to value are not handled by the genlist, so the file specified in @href should be copied to temp directory to subsitute the  file specified in @copy-to, and @href value should be added to fulltopiclist to facilate the following modules. 
  +
</p>
  +
<p>
  +
In the chunk module, for list should be updated, they are fullditatopiclist, copytosourcelist, fullditamapandtopiclist and copytotarget2sourcemaplist. All these lists are generated in genlist module in previous behavior, while they are updated in chunk moudle after this change. 
  +
</p>
  +
<h3>What sections of the toolkit will be impacted by the change?</h3>
  +
<p>
  +
GenListModule.java and MapMetaReader.java ChunkTopicParser.java will be impacted by this change.
  +
</p>
  +
<h3>See also</h3>
  +
<p>
  +
none 
  +
</p>
  +
<p>
  +
&nbsp;
 
</p>
 
</p>
-
<h3>See also</h3>none
  
 
 
Revision of Thu, 2008-07-10 09:34:

Design for fixing chunking bugs

Design for Bug # 1897542

What user need will be met by this feature?

Users need not worry about their topic IDs conflicting with those in other files. The Chunk module automatically handles duplicate topic IDs across different DITA files.

What is the technical design for the change?

The @chunk attribute on <topicref> has seven different values, but according to the specification only to-content needs to be considered here. The to-content value chunks several DITA files or topics into a single DITA file; if they contain duplicate topic IDs, those IDs must be changed.

<topicref> elements can be nested, and each nested <topicref> can have its own @chunk. In that case the nested <topicref> is chunked into its own file and does not conflict with its ancestors. This makes the feature harder to implement. Fortunately, the approach used for nested <topicref> elements, recursion, also solves the problem of duplicate topic IDs.

Use a hash set to store the topic IDs that are chunked into a given DITA file; if a duplicate topic ID is found, substitute a randomly generated ID. For a nested <topicref> with a @chunk attribute, construct a new hash set, preserve the hash set of the parent <topicref>, and restore it after processing the current <topicref>.
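The save-and-restore idea above can be sketched in Java roughly as follows. This is only an illustration, not the actual ChunkTopicParser code: the TopicIdDeduper and Ref classes, the "unique_" prefix and the output list are all hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Hypothetical sketch of per-chunk topic ID de-duplication; not DITA-OT code. */
class TopicIdDeduper {

    /** Minimal stand-in for a topicref: the topic IDs it pulls in, plus nested topicrefs. */
    static final class Ref {
        final boolean startsNewChunk; // true if this topicref is chunked into its own file
        final List<String> topicIds = new ArrayList<>();
        final List<Ref> children = new ArrayList<>();

        Ref(boolean startsNewChunk, String... ids) {
            this.startsNewChunk = startsNewChunk;
            Collections.addAll(topicIds, ids);
        }
    }

    private final Random random = new Random();
    private Set<String> seen = new HashSet<>();   // IDs already written to the current chunk
    final List<String> output = new ArrayList<>(); // IDs in the order they are written

    void process(Ref ref) {
        Set<String> saved = seen;                  // preserve the parent's set
        if (ref.startsNewChunk) {
            seen = new HashSet<>();                // a nested chunk gets a fresh set
        }
        for (String id : ref.topicIds) {
            String unique = id;
            while (!seen.add(unique)) {            // add() returns false on a duplicate
                unique = "unique_" + random.nextInt(Integer.MAX_VALUE);
            }
            output.add(unique);
        }
        for (Ref child : ref.children) {
            process(child);                        // recursion mirrors topicref nesting
        }
        seen = saved;                              // restore the parent's set
    }
}
```

With this sketch, a nested Ref that starts its own chunk may reuse an ID that its parent already used, while a nested Ref that stays in the parent's chunk gets a generated ID on collision.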

What sections of the toolkit will be impacted by the change?

ChunkTopicParser.java is impacted by this change.

See also

none

Design for bug # 1897497

What user need will be met by this feature?

Chunking to content will no longer reprocess maps and duplicate content.

What is the technical design for the change?

The chunking module runs after mapref, so before chunking there is already one complete map containing all the map files referenced by the input ditamap. There is no need to read all the map files again; doing so would duplicate content. There is one exception: if the transtype is eclipsehelp and the DTD of the ditamap is eclipsemap, mapref does not pull referenced map content into the input ditamap file, so it is necessary to read all the map files.

The condition for the exception is like this:

element.getAttribute(Constants.ATTRIBUTE_NAME_CLASS).contains(" eclipsemap/plugin ") && transtype.equals(Constants.INDEX_TYPE_ECLIPSEHELP)
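As a self-contained illustration, the same check could be wrapped in a small helper like the one below. The class name, method name and constant values are assumptions made for the example: in particular, "eclipsehelp" is assumed to be the value behind Constants.INDEX_TYPE_ECLIPSEHELP, and the null handling is an addition that the original expression does not have.

```java
/** Hypothetical helper; the constants are stand-ins for the toolkit's Constants class. */
class EclipseMapCheck {
    // Assumed values, not copied from Constants.java
    static final String ECLIPSEMAP_PLUGIN_CLASS = " eclipsemap/plugin ";
    static final String TRANSTYPE_ECLIPSEHELP = "eclipsehelp";

    /**
     * Returns true when referenced maps still have to be read by the chunk step,
     * i.e. the root element is an eclipsemap plugin and the transtype is eclipsehelp.
     */
    static boolean mustReadReferencedMaps(String classAttribute, String transtype) {
        return classAttribute != null
                && classAttribute.contains(ECLIPSEMAP_PLUGIN_CLASS)
                && TRANSTYPE_ECLIPSEHELP.equals(transtype);
    }
}
```

The surrounding spaces in " eclipsemap/plugin " matter: they match a whole token of the class attribute rather than a substring of a longer class name.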

What sections of the toolkit will be impacted by the change?

ChunkModule.java and Constants.java will be impacted by this change.

See also

none

Design for bug # 2008317

What user need will be met by this fix?

When a map references each individual topic in a DITA file separately, the changed DITA-OT will work more efficiently.

What is the technical design for the change?

Consider a map like this:

<map>
<topicref chunk="to-content select-topic" copy-to="A.dita"
href="test.dita#A">
</topicref>
<topicref chunk="to-content select-topic" copy-to="B.dita"
href="test.dita#B"></topicref>
</map>

test.dita is as follows:

<dita>
    <topic id="A">
        <title>a</title>
        <body>
            <p>abc</p>
        </body>
    </topic>
    <topic id="B">
        <title>b</title>
        <body>
            <p>abc</p>
        </body>
    </topic>
</dita>

The GenList module copies the entire test.dita to A.dita and to B.dita. The @chunk attribute is handled later, in the Chunk module. If test.dita has many @conref attributes, they become a bottleneck, because each @conref is resolved multiple times, which is unnecessary.

So when a <topicref> has both @chunk and @copy-to, the previous behavior can cause a performance bottleneck. Handling of @copy-to can be deferred to the chunk module, because the file specified in @copy-to will be rewritten by the chunk module anyway.
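To make the deferral concrete, the sketch below scans a map for <topicref> elements that carry both @chunk and @copy-to and records a copy-to target to href source mapping for them. It is only an illustration built on the standard JAXP DOM API: the class and method names are hypothetical, treating any non-empty @chunk as a reason to defer is an assumption, and the real genlist and chunk modules may decide this differently.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: find the copy-to entries whose handling could be deferred. */
class DeferredCopyToSketch {

    /** Returns copy-to target -> href source for topicrefs that have both @chunk and @copy-to. */
    static Map<String, String> collectDeferred(String mapXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(mapXml)));
            NodeList refs = doc.getElementsByTagName("topicref");
            Map<String, String> deferred = new LinkedHashMap<>();
            for (int i = 0; i < refs.getLength(); i++) {
                Element ref = (Element) refs.item(i);
                String chunk = ref.getAttribute("chunk");     // "" when the attribute is absent
                String copyTo = ref.getAttribute("copy-to");
                if (!chunk.isEmpty() && !copyTo.isEmpty()) {
                    // Keep the href as written, including any #topic fragment
                    deferred.put(copyTo, ref.getAttribute("href"));
                }
            }
            return deferred;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse map", e);
        }
    }
}
```

For the example map above, this would yield A.dita mapped to test.dita#A and B.dita mapped to test.dita#B, so test.dita itself would not need to be copied twice before chunking.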

The key point of this technical design is the change to the dita.list file, since many modules after genlist depend on this list file. Because the @copy-to value is no longer handled by genlist, the file specified in @href should be copied to the temp directory as a substitute for the file specified in @copy-to, and the @href value should be added to fulltopiclist for the benefit of the following modules.

In the chunk module, four lists should be updated: fullditatopiclist, copytosourcelist, fullditamapandtopiclist and copytotarget2sourcemaplist. Previously all of these lists were generated in the genlist module; after this change they are updated in the chunk module.

What sections of the toolkit will be impacted by the change?

GenListModule.java, MapMetaReader.java and ChunkTopicParser.java will be impacted by this change.

See also

none 

 
