How to use Sling Transformers in AEM

19 April 2018
Joanna Jasnowska

If you're an AEM developer, you'll know that, fairly often, you need to do a little extra manipulation on a page's markup.


A common example occurs when a site is about to go live and needs stakeholder approval. To review a page, the customer opens it and switches wcmmode to disabled. However, if they click a hyperlink during the review, the target page opens back in editor mode. This is misleading, and it means they have to switch to disabled wcmmode again. Out of the box, AEM does not keep the chosen wcmmode while you navigate between pages. But we can fix this!


Another example. These days, responsive design is no longer a choice - it is a hygiene factor for modern websites. This means displaying images of different sizes at different browser resolutions. Ideally, the mapping of image size to browser resolution should be configurable. It would also be good to avoid loading every image size up front, and to avoid creating different component variations for specific devices. How do you do this?

  

Solution: Transformers

 

Whilst these are different examples, they both have something in common. Implementing a solution at a component level is not a great approach as both <a> and <img> HTML tags can be rendered within a variety of components. A better approach is to parse the page markup to find the necessary tags and transform them according to business needs.

 

This is where Sling Transformers come into play. They are a powerful mechanism that rewrites the output (typically HTML markup) generated by the Sling rendering process. The mechanism is part of the Apache Sling Rewriter module, which uses SAX event-based pipelines as shown here.

 

Every pipeline consists of three components, and each component has a corresponding Java interface and factory:

[Figure: Sling Rewriter schema]

The Generator - generates SAX events from the output stream and puts them in the pipeline.

The Transformer(s) - this is where the actual markup transformations happen. Transformers are put in a chain and each one takes events from a previous item, performs modifications and forwards them to the next item in the sequence.

The Serializer - gathers transformed events from the pipeline, builds an html response and writes it to the output stream.

AEM developers typically implement just the middle part of the pipeline - custom transformers.
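
To make the three roles more concrete, here is a minimal sketch that uses only the JDK's own SAX classes, not the Sling Rewriter API. The XMLReader stands in for the generator, two XMLFilterImpl subclasses stand in for chained transformers, and a small handler stands in for the serializer. All class names (PipelineSketch, UpperCaseFilter, MarkFilter, Serializer) are made up for illustration, and the serializer does no escaping, so treat it as a picture of the event flow rather than production code.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class PipelineSketch {

    /** Transformer 1: upper-cases text, forwards every other event unchanged. */
    static class UpperCaseFilter extends XMLFilterImpl {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    /** Transformer 2: adds class="marked" to every <p> element. */
    static class MarkFilter extends XMLFilterImpl {
        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            if ("p".equals(qName)) {
                AttributesImpl copy = new AttributesImpl(attributes);
                copy.addAttribute("", "class", "class", "CDATA", "marked");
                super.startElement(uri, localName, qName, copy);
            } else {
                super.startElement(uri, localName, qName, attributes);
            }
        }
    }

    /** Serializer: turns the events it receives back into markup (no escaping). */
    static class Serializer extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            out.append('<').append(qName);
            for (int i = 0; i < attributes.getLength(); i++) {
                out.append(' ').append(attributes.getQName(i))
                   .append("=\"").append(attributes.getValue(i)).append('"');
            }
            out.append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Wires generator -> UpperCaseFilter -> MarkFilter -> Serializer and runs it. */
    static String run(String xml) {
        try {
            XMLReader generator = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLFilterImpl first = new UpperCaseFilter();
            first.setParent(generator);
            XMLFilterImpl second = new MarkFilter();
            second.setParent(first);
            Serializer serializer = new Serializer();
            second.setContentHandler(serializer);
            second.parse(new InputSource(new StringReader(xml)));
            return serializer.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling PipelineSketch.run("<div><p>hi</p></div>") pushes the events through both filters in order, so the text is upper-cased first and the class attribute is added second. In Sling, the same wiring is done for you from the pipeline configuration.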

In Zen Garden we have several custom transformer implementations. They are responsible for wrapping html <img> tags, providing additional information for hyperlinks, or applying styles and data attributes to a component’s markup. I’d like to share some lessons learned while implementing these over the years.

The wcmmode use case

First, we need to implement org.apache.sling.rewriter.TransformerFactory:

@Component(
   immediate = true,
   service = TransformerFactory.class,
   property = {
      "pipeline.type=preserve-wcmmode-transformer"
   }
)
public class PreserveWCMModeFactory implements TransformerFactory {
   @Override
   public Transformer createTransformer() {
      return new PreserveWCMModeTransformer();
   }
}

pipeline.type is a unique transformer identifier, referenced by a pipeline configuration.

The implementation of org.apache.sling.rewriter.Transformer is as follows:

public class PreserveWCMModeTransformer extends DefaultTransformer {

   @Override
   public void init(ProcessingContext context, 
                   ProcessingComponentConfiguration config) throws IOException {
      // get necessary data from context
   }

   @Override
   public void startElement(String uri, String localName, String qName, 
                           Attributes attributes) throws SAXException {
      if ("a".equals(localName) && shouldBeTransformed(attributes)) {
         String href = attributes.getValue("href");
         String modifiedHref = modifyHref(href);
         AttributesImpl attributesImpl = new AttributesImpl(attributes);
         // look the attribute up by its name ("href"), not by its current value
         attributesImpl.setValue(attributes.getIndex("href"), modifiedHref);
         super.startElement(uri, localName, qName, attributesImpl);
      } else {
         super.startElement(uri, localName, qName, attributes);
      }
   }

   private boolean shouldBeTransformed(Attributes attributes) {
      // check wcmmode additional conditions
   }

   private String modifyHref(String href) {
      // append ?wcmmode=disabled to url 
      // return modified href
   }
}
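
The two helper methods above are intentionally left as stubs. As a rough illustration of what the URL part could look like, here is a small self-contained sketch. The class name WcmModeLinks and the isInternalLink helper are hypothetical, and the rule "only rewrite links that start with a single slash" is an assumption made for the example. The check of the current wcmmode, which needs the request from the processing context, is not covered here.

```java
class WcmModeLinks {

    private static final String PARAM = "wcmmode=disabled";

    /** Assumption for this sketch: only site-relative links such as /content/... are rewritten. */
    static boolean isInternalLink(String href) {
        return href != null && href.startsWith("/") && !href.startsWith("//");
    }

    /** Appends wcmmode=disabled, keeping an existing query string and fragment intact. */
    static String modifyHref(String href) {
        if (href == null || href.isEmpty() || href.contains(PARAM)) {
            return href; // nothing to do, or already rewritten
        }
        int hash = href.indexOf('#');
        String base = hash >= 0 ? href.substring(0, hash) : href;
        String fragment = hash >= 0 ? href.substring(hash) : "";
        String separator = base.contains("?") ? "&" : "?";
        return base + separator + PARAM + fragment;
    }
}
```

The fragment is split off first because a query string placed after the # would be treated as part of the anchor by the browser and never reach the server.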

Now we need to configure an html rewriter pipeline. Configuration is stored in a repository under the path /apps/APPNAME/config/rewriter:

<custom-pipeline
   jcr:primaryType="nt:unstructured"
   contentTypes="text/html"
   generatorType="htmlparser"
   order="1"
   serializerType="htmlwriter"
   transformerTypes="[linkchecker,preserve-wcmmode-transformer]">
   <generator-htmlparser
      jcr:primaryType="nt:unstructured"
      includeTags="[A,/A,IMG,AREA,FORM,BASE,LINK,SCRIPT,BODY,/BODY,DIV,/DIV]" />
</custom-pipeline>

Our transformer is chained in the middle of the pipeline together with the OOTB LinkCheckerTransformer. Note that both are referenced by their pipeline.type property. The pipeline order is 1, which means that this configuration will be used as long as no other matching configuration has a higher order.
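
The "highest matching order wins" rule can be pictured with a toy model. This is not Sling's code: the Config class, the select method and the "fallback" configuration are all invented for the sketch, and only the content type is matched, whereas a real configuration can also match on other properties.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class PipelineSelectionSketch {

    /** Hypothetical, simplified stand-in for one rewriter configuration node. */
    static final class Config {
        final String name;
        final int order;
        final List<String> contentTypes;

        Config(String name, int order, List<String> contentTypes) {
            this.name = name;
            this.order = order;
            this.contentTypes = contentTypes;
        }
    }

    /** Convenience factory so callers can list content types inline. */
    static Config config(String name, int order, String... contentTypes) {
        return new Config(name, order, Arrays.asList(contentTypes));
    }

    /** Among the configurations that match the content type, pick the highest order. */
    static Optional<String> select(List<Config> configs, String contentType) {
        return configs.stream()
                .filter(c -> c.contentTypes.contains(contentType))
                .max(Comparator.comparingInt((Config c) -> c.order))
                .map(c -> c.name);
    }
}
```

With a "fallback" configuration of order -1 and our custom-pipeline of order 1, both matching text/html, the model selects custom-pipeline. A request for a content type that neither configuration lists selects nothing, so the response is not rewritten.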

You can check if your configuration is correctly registered on the AEM Sling Rewriter system console. And that’s it! Now the hyperlinks on our pages preserve wcmmode.

But, watch out for...

Based on our Zen Garden team's experience with transformers, and according to this presentation by Justin Edelson from Adobe, you'll need to pay special attention in a couple of cases.

1. The first stumbling block is so-called “global” transformers:

@Component(
    immediate = true,
    service = TransformerFactory.class,
    property = {
        "pipeline.mode=global"
    }
)

This kind of transformer will be chained in every pipeline configured on an AEM instance, so watch out for these two consequences.

Firstly, it is impossible to provide a custom configuration for a global transformer. If you have a “named” transformer (like PreserveWCMModeTransformer) used in a custom pipeline, you can add a paths property to the pipeline configuration. Et voilà - your transformer will be fired only when specific paths are processed. You cannot do that with “global” transformers because they are not tied to any specific pipeline configuration.

Secondly, the order in which “global” transformers are processed is quite unpredictable. Theoretically, you can use the service.ranking property to set an order. But this can still lead to unexpected results, because managing the relative order of every transformer in AEM, including OOTB ones like LinkCheckerTransformer, can be tricky.

2. You should also be careful when using the characters() method in your transformer. The SAX parser documentation says that “parsers are not required to return any particular number of characters at one time.” In other words, it is not guaranteed that the characters() method will run only once inside an element.


For simple cases, such as converting all text to uppercase, this does not really matter. For more complex manipulations, however, it can lead to unexpected results, because a piece of text you want to match may be split across two calls. So, if you need to implement the characters() method, it is a good idea to buffer the whole input in a StringBuilder. The actual modification, and the hand-off to the next transformer, can then be done on the buffer in the endElement() method.
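
Here is a minimal sketch of that buffering idea, written against the JDK's XMLFilterImpl so that it runs without Sling. In a real transformer the same three methods would be overridden on DefaultTransformer. The class name BufferingFilter, the transform method with its "colour" to "color" replacement, and the feed demo helper are all made up for illustration. As a small variation on the description above, the sketch flushes the buffer in startElement() as well as endElement(), so that text preceding a nested child element is not emitted after it.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class BufferingFilter extends XMLFilterImpl {

    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // only collect, do not forward yet
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) throws SAXException {
        flush(); // emit text that came before this child element
        super.startElement(uri, localName, qName, attributes);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        flush(); // the text of this element is now complete
        super.endElement(uri, localName, qName);
    }

    /** Transforms the buffered text as a whole and forwards it in a single call. */
    private void flush() throws SAXException {
        if (buffer.length() == 0) {
            return;
        }
        char[] out = transform(buffer.toString()).toCharArray();
        buffer.setLength(0);
        super.characters(out, 0, out.length);
    }

    /** Example manipulation that fails if the word is split across chunks. */
    String transform(String text) {
        return text.replace("colour", "color");
    }

    /** Demo helper: feeds text in chunks and records what the next stage receives. */
    static List<String> feed(String... chunks) {
        List<String> received = new ArrayList<>();
        BufferingFilter filter = new BufferingFilter();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                received.add(new String(ch, start, length));
            }
        });
        try {
            filter.startElement("", "p", "p", new AttributesImpl());
            for (String chunk : chunks) {
                filter.characters(chunk.toCharArray(), 0, chunk.length());
            }
            filter.endElement("", "p", "p");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

Feeding the word in two chunks, as in BufferingFilter.feed("col", "our"), still delivers a single "color" to the next stage. A transformer that ran the replacement on each chunk separately would never see the full word.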

 

I hope this has been a useful guide on using Sling Transformers. For further information, take a look at the links below.

Links

https://sling.apache.org/documentation/bundles/output-rewriting-pipelines-org-apache-sling-rewriter.html

https://www.slideshare.net/justinedelson/mastering-the-sling-rewriter