Why care about content quality?

By and large, content is seen as some type of art project where creativity and writers’ preferences often guide its creation. I expect this to be less and less the case in the future.

As Scott Abel, publisher of The Content Wrangler, states in the interview “Understanding the need for content quality management”, surprisingly few companies take content quality seriously when it comes to technical communication. Often, technical writers are expected to “wing it” and take responsibility for the quality of their writing as far as humanly possible. This stands in sharp contrast with the translation industry, which has used sophisticated quality assurance tools for decades, including:

  • glossaries and terminology banks
  • style guides
  • spell checkers
  • readability scores

However, he expects the situation to change in the near future, especially now that intelligent content stands out as a powerful asset for Industry 4.0. Intelligent content, as explained in this article by the Content Marketing Institute, is content that is “structurally rich and semantically categorized and therefore automatically discoverable, reusable, reconfigurable, and adaptable.”

Improving and managing content quality for technical communication provides substantial benefits:

  • Good source text results in good translation quality, and intelligent content reduces translation costs by increasing text reuse.
  • Consistent terminology is an essential factor for a positive user experience.
  • Systematic content quality checks relieve technical writers from worrying about small mistakes.
  • Well-structured modular content can be output to multiple channels, with little to no human intervention, resulting in time and cost gains.
  • Semantically tagged content is SEO-friendly, resulting in better exposure for the company website, and boosting profits in the long term.

DITA and style guides

DITA (Darwin Information Typing Architecture) is an open XML standard maintained by OASIS for writing technical documentation. It is especially suited for intelligent content, since it is based on a modular architecture in which each piece of information (or “topic”) can be reused across multiple documents (or “maps”).
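
As a minimal sketch of this reuse, two maps can reference the same topic file. The file names and titles below are hypothetical and only illustrate the mechanism:

```xml
<!-- user-guide.ditamap (hypothetical file name) -->
<map>
  <title>User Guide</title>
  <!-- The installation topic is referenced here... -->
  <topicref href="topics/installing.dita"/>
  <topicref href="topics/configuring.dita"/>
</map>

<!-- quick-start.ditamap (hypothetical file name) -->
<map>
  <title>Quick Start</title>
  <!-- ...and reused here without being copied. -->
  <topicref href="topics/installing.dita"/>
</map>
```

Each map is a separate file; a change made in topics/installing.dita shows up in every document published from either map.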

The content is separated from the presentation, which is handled automatically during publication, so the author doesn’t have to worry about layout guidelines. On the other hand, writing in XML requires its own set of editing rules, for example to select the right tag or to impose a certain set of attributes. Multiple DITA style guide projects have emerged across the web, the most authoritative being The DITA Style Guide: Best Practices for Authors by Tony Self.

Enter the Dynamic Information Model

The Dynamic Information Model (DIM) (1) is an open-source project published on GitHub, created by George Bina (Syncro Soft SRL) with contributions from ComTech Services. It provides a toolkit and templates for creating an integrated style guide in DITA, in which every rule can be both described and implemented within the authoring tool. The toolkit is designed to:

  • Publish an HTML version of the style guide.
  • Trigger warnings or suggestions when a rule is not respected, optionally with automatic corrective actions.
  • Point back to the rule so that the author can understand the error.

Behind the scenes, the implementation is handled by a library of Schematron rules and Schematron Quick Fix (SQF) actions, together with XSLT templates that compile the rule set. The project is designed for full integration into oXygen XML Editor, but it can also be adapted for other tools that support Schematron and SQF.

The goal of this project is to be accessible to all technical writers without requiring any knowledge of Schematron, SQF, or XSLT. Each rule is described in a separate topic using DITA markup, in the form of a definition list (dl) placed within a section that is marked with a specific audience attribute. The embedded rules use patterns from generic rules defined in a Schematron library. To enforce a rule, the user simply refers to the rule name and specifies its parameters.

Some of the predefined rules are:

  • restrictWords: Check that the number of words is within specified limits.
  • avoidWordInElement: Issue a warning if a word or a phrase appears inside a specified element.
  • avoidEndFragment: Issue a warning if an element ends with a specified fragment or character.
  • avoidAttributeInElement: Issue a warning if an attribute appears inside a specified element.

What does it look like?

Let’s have a look at one of the sample rules provided in the project:

 

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="AuthoringGuidelines">
  <title>Beginning a Topic</title>
  <conbody>
    <p>With the exception of glossary topics, you must include a title and prolog section before you begin the body of the topic. In addition, you can optionally include a short description of the topic. The following sections provide guidelines for these common elements.</p>
    <note>
      <p>When creating a new topic, always start with the <keyword keyref="companyname"/> template for the corresponding information type, if one is available. If you copy another topic, you could inadvertently duplicate element IDs and you risk overlooking elements that you might need for the new topic that were removed from the topic you copied.</p>
    </note>
    <section audience="rules">
      <title>Business Rules</title>
      <p>We will recommend adding a prolog to different topic types, except for the glossary topics.</p>
      <dl>
        <dlhead>
          <dthd>Rule</dthd>
          <ddhd>recommendElementInParent</ddhd>
        </dlhead>
        <dlentry>
          <dt>parent</dt>
          <dd>task</dd>
        </dlentry>
        <dlentry>
          <dt>element</dt>
          <dd>prolog</dd>
        </dlentry>
        <dlentry>
          <dt>message</dt>
          <dd>A prolog is required for each task. Add this just before the task body.</dd>
        </dlentry>
      </dl>
      <dl>
        <dlhead>
          <dthd>Rule</dthd>
          <ddhd>recommendElementInParent</ddhd>
        </dlhead>
        <dlentry>
          <dt>parent</dt>
          <dd>concept</dd>
        </dlentry>
        <dlentry>
          <dt>element</dt>
          <dd>prolog</dd>
        </dlentry>
        <dlentry>
          <dt>message</dt>
          <dd>A prolog is required for each concept. Add this just before the concept body.</dd>
        </dlentry>
      </dl>
      <dl>
        <dlhead>
          <dthd>Rule</dthd>
          <ddhd>recommendElementInParent</ddhd>
        </dlhead>
        <dlentry>
          <dt>parent</dt>
          <dd>reference</dd>
        </dlentry>
        <dlentry>
          <dt>element</dt>
          <dd>prolog</dd>
        </dlentry>
        <dlentry>
          <dt>message</dt>
          <dd>A prolog is required for each reference. Add this just before the reference body.</dd>
        </dlentry>
      </dl>
      <dl>
        <dlhead>
          <dthd>Rule</dthd>
          <ddhd>recommendElementInParent</ddhd>
        </dlhead>
        <dlentry>
          <dt>parent</dt>
          <dd>troubleshooting</dd>
        </dlentry>
        <dlentry>
          <dt>element</dt>
          <dd>prolog</dd>
        </dlentry>
        <dlentry>
          <dt>message</dt>
          <dd>A prolog is required for each troubleshooting topic. Add this just before the troubleshooting body.</dd>
        </dlentry>
      </dl>
    </section>
  </conbody>
</concept>

This editing rule stipulates that each topic, except for glossaries, must start with a title and a prolog. Since titles are already mandatory in DITA, only the prolog rule needs to be enforced. Each definition list references the predefined pattern “recommendElementInParent” and sets three parameters: “parent” is the topic type, “element” is prolog, and “message” is the text displayed when the rule is not followed.
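
To make the mechanism more concrete, here is a sketch of how a generic pattern such as recommendElementInParent, and one instantiation of it, could be expressed in Schematron using an abstract pattern with parameters. This is an assumption made for illustration, not the project's actual source. The real library may match elements differently, and the instantiation would be generated from the definition list by the toolkit's XSLT rather than written by hand. The id recommendPrologInTask is a made-up name.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <!-- Generic rule (assumed shape): warn when $parent has no $element child. -->
  <sch:pattern id="recommendElementInParent" abstract="true">
    <sch:rule context="$parent">
      <!-- Expansion of $message inside element text is assumed here;
           support may depend on the Schematron processor. -->
      <sch:assert test="$element" role="warn">$message</sch:assert>
    </sch:rule>
  </sch:pattern>

  <!-- Instantiation corresponding to the first definition list of the sample topic. -->
  <sch:pattern id="recommendPrologInTask" is-a="recommendElementInParent">
    <sch:param name="parent" value="task"/>
    <sch:param name="element" value="prolog"/>
    <sch:param name="message"
               value="A prolog is required for each task. Add this just before the task body."/>
  </sch:pattern>

</sch:schema>
```

The three sch:param values map one-to-one onto the dlentry items of the definition list, which is what lets a writer configure a rule without touching Schematron.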

Making it your own

To extend the library of rule patterns, users need to get to grips with the basics of Schematron. Schematron is a simple XML-based language for finding patterns in XML files. When a pattern is found (or not found), a defined action is performed, which can be an automatic correction, a suggestion or a warning.
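
As a small illustration of such a rule, the sketch below warns when a concept has no prolog and offers a quick fix that inserts an empty one before the body. It was written by hand for this article rather than taken from the project, and it assumes a tool with Schematron Quick Fix support. The fix id addProlog is a made-up name.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:sqf="http://www.schematron-quickfix.com/validator/process"
            queryBinding="xslt2">
  <sch:pattern>
    <!-- The rule is evaluated for every concept element. -->
    <sch:rule context="concept">
      <!-- The assertion fails, and the message is shown, when no prolog child exists. -->
      <sch:assert test="prolog" role="warn" sqf:fix="addProlog">
        A prolog is required for each concept. Add this just before the concept body.
      </sch:assert>

      <!-- Quick fix offered to the author alongside the warning. -->
      <sqf:fix id="addProlog">
        <sqf:description>
          <sqf:title>Add an empty prolog before the concept body</sqf:title>
        </sqf:description>
        <!-- Insert a new prolog element immediately before conbody. -->
        <sqf:add match="conbody" position="before">
          <prolog/>
        </sqf:add>
      </sqf:fix>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The assertion covers the "warning" case and the sqf:fix element covers the "automatic correction" case. Leaving out the fix would reduce the rule to a plain warning.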

For further information, see: Schematron: a handy XML tool that’s not just for villains!

(1) Apache License 2.0, copyright Syncro Soft SRL, 2015