Closed User Groups and Access Control Lists: principles, transformation and usage.

09 May 2016
Marek Krokosinski

Security is a very important part of every website, and Adobe Experience Manager places great emphasis on it. Closed User Groups (CUGs) are one of the key elements of AEM authentication. They can be set on the author instance and are later transformed into access control lists on the publish instance. This post describes the process of that transformation.

In this post you can find information about:

  • What happens on the author instance regarding authentication and authorization
  • What happens on the author instance regarding Closed User Groups
  • How Closed User Groups are transformed into access control lists on a publish instance
  • How requests are authenticated and authorized when they hit a publish instance (basics)

Authentication and Authorization object overview

I would like to define four Oak and JCR objects which will be referenced several times in this post. I hope this will make it easier to understand parts of the AEM authentication and authorization process. Those four objects are: ACL, ACE, Privilege and Restriction.

  • ACL – an implementation of AbstractAccessControlList, which implements the Jackrabbit API’s JackrabbitAccessControlList. It contains a list of ACEs. It declares abstract methods for validating a principal and creating ACEs. The default implementations are NodeACL and PrincipalACL.
  • ACE – AccessControlEntry. An implementation of the Jackrabbit API’s JackrabbitAccessControlEntry. It is an entry in the ACL object. It contains a set of Restrictions and references to PrivilegeBits and Principal objects.
  • Privilege – This object has been described as follows: A privilege represents the capability of performing a particular set of operations on items in the JCR repository. Each privilege is identified by a JCR name. JCR defines a set of standard privileges in the jcr namespace. Implementations may add additional privileges in namespaces other than jcr.
  • Restriction – restrictions allow more fine-grained control over permissions. The default implementation provides four restrictions: rep:glob, rep:ntNames, rep:prefixes and rep:itemNames.
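
To make the relationship between these objects more concrete, here is a minimal sketch in plain Java. It does not use the real Oak or Jackrabbit classes: AclModelSketch, Acl and Ace are hypothetical stand-ins, privileges are modelled as plain names, and the path and principal names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for the objects described above; these are NOT the real Oak classes.
final class AclModelSketch {

    /** One entry: a principal, an allow/deny flag, privilege names and restrictions. */
    static final class Ace {
        final String principalName;
        final boolean allow;
        final List<String> privileges;           // e.g. "jcr:read"
        final Map<String, String> restrictions;  // e.g. "rep:glob" -> "*/jcr:content*"

        Ace(String principalName, boolean allow, List<String> privileges,
            Map<String, String> restrictions) {
            this.principalName = principalName;
            this.allow = allow;
            this.privileges = Collections.unmodifiableList(new ArrayList<>(privileges));
            this.restrictions = Collections.unmodifiableMap(new LinkedHashMap<>(restrictions));
        }
    }

    /** An ordered list of entries bound to one repository path. */
    static final class Acl {
        final String path;
        private final List<Ace> entries = new ArrayList<>();

        Acl(String path) {
            this.path = path;
        }

        // Validates the principal first, then creates and appends the entry.
        void addEntry(String principalName, boolean allow, List<String> privileges,
                      Map<String, String> restrictions) {
            if (principalName == null || principalName.isEmpty()) {
                throw new IllegalArgumentException("principal name must not be empty");
            }
            entries.add(new Ace(principalName, allow, privileges, restrictions));
        }

        List<Ace> getEntries() {
            return Collections.unmodifiableList(entries);
        }
    }

    // Hypothetical example data: one allow entry and one restricted deny entry.
    static Acl example() {
        Acl acl = new Acl("/content/site/en");
        acl.addEntry("everyone", true, Collections.singletonList("jcr:read"),
                Collections.<String, String>emptyMap());
        acl.addEntry("contributors", false, Collections.singletonList("rep:write"),
                Collections.singletonMap("rep:glob", "*/jcr:content*"));
        return acl;
    }

    public static void main(String[] args) {
        for (Ace ace : example().getEntries()) {
            System.out.println((ace.allow ? "allow " : "deny ") + ace.principalName
                    + " " + ace.privileges + " " + ace.restrictions);
        }
    }
}
```

The real ACE keeps its privileges as a PrivilegeBits reference rather than a list of names; the list is used here only to keep the sketch readable.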

[Figure: class diagram]

How are AccessControlEntries saved into the JCR repository?

Before I begin on Closed User Groups and their transformation, I will explain how ACEs (AccessControlEntries) are saved into the JCR repository, for example when a user changes access rights to a specific node on the User Admin page.

Why do I want to talk about that first? Because Closed User Groups are transformed into ACLs and ACEs on a publish instance, and in the end, they are saved exactly in the same way as on the author instance.

On the author instance, when ACLs are set on the User Admin page, the following things happen: 

  • The AccessControlManager finds an AccessControlPolicy object for the path the changes have been made for.
  • The AccessControlManager checks whether the retrieved AccessControlPolicy object is an instance of Oak’s ACL (Access Control List).
    • If it is not, an AccessControlException is thrown.
  • The AccessControlManager gets Oak's Tree object and validates it. It checks the following things:
    • if the requested path exists
    • if permissions are granted
    • if the given tree has already defined access controlled content.
  • The ACL for each Tree is saved under a rep:policy node.
  • The AccessControlTree is retrieved.
    • If the node with path “/our/path/rep:policy” exists, then AccessControlManager will remove all its child Trees. 
    • If it doesn’t exist, a rep:policy node will be created.
  • When the AccessControlTree is empty, new ACEs (AccessControlEntries) are added. This is done by taking all ACEs from the ACL object.
  • For each ACE, a check is made whether it is an allow or a deny entry. Based on that, a node named “allow” or “deny” is added, with a number at the end if there is more than one entry of that kind, and the following properties are set on the new node:
    • The jcr:primaryType property is set to rep:GrantACE or rep:DenyACE.
    • The principal name is retrieved from the ACE object and saved under the rep:principalName property.
    • Restrictions are saved into the repository under the rep:restrictions property.
    • Names of the Privileges which are present in the ACE object are also retrieved, but those are saved under /jcr:system/rep:privileges.

The last two main steps are performed for each user or group which has any permission set for the given path. For each user or group there might be two nodes: allow and deny. Only read entries are not persisted under rep:policy, because by default each user or group has read access to the whole repository (excluding a few paths such as /bin, /oak:index, /system and /content/projects).
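
The naming and property rules above can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the Oak implementation: AcePersistenceSketch and its Ace class are hypothetical, the rep:policy node is modelled as an ordered map from child name to properties, the exact numbering of the suffix is an assumption, and the node type is written to jcr:primaryType, which is where JCR stores a node's primary type. Privileges are left out of the model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of writing ACEs below a rep:policy node; not the real Oak code.
final class AcePersistenceSketch {

    /** Minimal stand-in for an ACE: principal, allow/deny flag and restrictions. */
    static final class Ace {
        final String principal;
        final boolean allow;
        final Map<String, String> restrictions;

        Ace(String principal, boolean allow, Map<String, String> restrictions) {
            this.principal = principal;
            this.allow = allow;
            this.restrictions = restrictions;
        }
    }

    /**
     * Rebuilds the children of rep:policy. The map starts empty, which models
     * the removal of all previously stored entries, then one child is added per ACE.
     */
    static Map<String, Map<String, Object>> writePolicy(List<Ace> aces) {
        Map<String, Map<String, Object>> policy = new LinkedHashMap<>();
        for (Ace ace : aces) {
            String hint = ace.allow ? "allow" : "deny";
            String name = hint;
            int i = 0;
            // Assumed numbering scheme: append a counter until the name is unused.
            while (policy.containsKey(name)) {
                name = hint + i++;
            }
            Map<String, Object> props = new LinkedHashMap<>();
            props.put("jcr:primaryType", ace.allow ? "rep:GrantACE" : "rep:DenyACE");
            props.put("rep:principalName", ace.principal);
            props.put("rep:restrictions", ace.restrictions);
            policy.put(name, props);
        }
        return policy;
    }

    // Hypothetical example: two allow entries and one deny entry.
    static Map<String, Map<String, Object>> example() {
        List<Ace> aces = new ArrayList<>();
        aces.add(new Ace("authors", true, new LinkedHashMap<String, String>()));
        aces.add(new Ace("authors", false, new LinkedHashMap<String, String>()));
        aces.add(new Ace("reviewers", true, new LinkedHashMap<String, String>()));
        return writePolicy(aces);
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Map<String, Object>> child : example().entrySet()) {
            System.out.println(child.getKey() + " " + child.getValue());
        }
    }
}
```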

Transformation of Closed User Groups into ACLs on the publish instance

Adobe defines CUGs succinctly: 

“Closed User Groups (CUGs) are used to limit access to specific pages that reside within a published internet site. Such pages require the assigned members to login and provide security credentials.”

So CUGs are an additional way to restrict access to your page. If a page is restricted to members of given CUGs, its cq:cugEnabled property is set to true. Of course it should also have a group assigned, because without that, the page won’t be accessible even if the user is logged in. Groups which are allowed to view the given page or pages are saved under a page property called cq:cugPrincipals. You can also configure the login page path or Realm (a name for the groups of pages).

When a page with a CUG configured is activated, a few interesting things happen on the publish instance.

  • When a page with a property whose name starts with “cq:cug” is activated, the page path is added to the paths which may require authentication.
  • Next, a check is made to verify that CUGs are enabled for the given page. This is done by checking the cq:cugEnabled property and by checking that the path is not user-generated content.
    • If all those conditions are true, a CugRoot object is created. It contains information about the CUG configuration: principal names, login page path, registration page path and realms.
  • CugRoot is added to existing CugRoots and the AccessControlList is installed.
  • To install the ACLs which come from a CugRoot object, the applicable policies are retrieved from the AccessControlManager.
  • The first instance of JackrabbitAccessControlPolicy is taken to process the ACL. Usually, for a content page, the returned object is an instance of NodeACL (already mentioned in this post).
  • Next, a check is made whether ACEs need to be installed at all. There are several conditions which must be met to not install ACEs again, but I won't go into such detail here, because in most cases ACEs need to be installed again.
  • If ACEs need to be installed, all existing ACEs are removed from the ACL object.
  • A new ACE object is added for the Principal "Everyone", with Privilege ALL set to Deny.
  • For each AccessControlPolicy (which were retrieved before from AccessControlManager), where the path is different from a CugRoot path, ACEs are added for exempted principals. 
    • This is done because exempted Principals are a special kind of Principal which can access a restricted page even if they weren’t configured for the given page. An example of an exempted Principal is the one called “administrators”, which makes sense, because administrators should have full access to the website. Administrators are added to the exempted Principal list by default.
  • The last entries to be added are for the Principals which were configured for the CUG. The Principal list is taken from the CugRoot object and, if a Principal exists, its ACEs are added to the ACL object.
  • Finally the AccessControlManager#setPolicy method is called and the policy is set. 
  • At the end, the session is saved. You can read how policies are saved at the beginning of this post.
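
The order in which entries end up in the ACL can be sketched in plain Java as follows. This is a simplified model and not the AEM implementation: CugAclSketch and Entry are hypothetical, the privilege granted by the allow entries is assumed to be jcr:read (the steps above do not name it), and the check that a principal exists is modelled as a lookup in a set of known principals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the ACL that results from a CUG configuration.
final class CugAclSketch {

    static final class Entry {
        final String principal;
        final String privilege;
        final boolean allow;

        Entry(String principal, String privilege, boolean allow) {
            this.principal = principal;
            this.privilege = privilege;
            this.allow = allow;
        }

        @Override
        public String toString() {
            return (allow ? "allow " : "deny ") + principal + " " + privilege;
        }
    }

    static List<Entry> buildCugAcl(List<String> exemptedPrincipals,
                                   List<String> cugPrincipals,
                                   Set<String> knownPrincipals) {
        // Starting from an empty list models the removal of all existing ACEs.
        List<Entry> acl = new ArrayList<>();
        // 1. Deny everything to everyone.
        acl.add(new Entry("everyone", "jcr:all", false));
        // 2. Let exempted principals (for example administrators) back in.
        for (String principal : exemptedPrincipals) {
            acl.add(new Entry(principal, "jcr:read", true)); // jcr:read is an assumption
        }
        // 3. Let the principals configured on the CUG in, if they exist.
        for (String principal : cugPrincipals) {
            if (knownPrincipals.contains(principal)) {
                acl.add(new Entry(principal, "jcr:read", true));
            }
        }
        return acl;
    }

    // Hypothetical example: "ghost" is configured on the page but does not exist.
    static List<Entry> example() {
        Set<String> known = new HashSet<>(Arrays.asList("administrators", "members"));
        return buildCugAcl(Arrays.asList("administrators"),
                Arrays.asList("members", "ghost"), known);
    }

    public static void main(String[] args) {
        for (Entry entry : example()) {
            System.out.println(entry);
        }
    }
}
```

In the real flow the resulting list would then be handed to AccessControlManager#setPolicy and persisted when the session is saved.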

Now it’s time to update the configuration of the Day CQ Closed User Group (CUG) Support with the new CugRoot properties. It is important that the sling.auth.requirements property is updated with the paths for which CUGs were configured. SlingAuthenticator listens for changes on that property.

This property is used to dynamically extend the authentication requirements from the Authentication Requirement Configuration.

Why is this important? Because SlingAuthenticator reads authentication requirements, and performs authentication for paths which are configured in sling.auth.requirements.

The Authentication Requirement Configuration can be found under /system/console/slingauth.

For example, if CUGs are configured for two paths, both paths appear in the Authentication Requirement Configuration, as the screenshot below shows:

[Screenshot: Authentication Requirement Configuration]

SlingAuthenticator does many things. One of its roles is to handle modifications of the sling.auth.requirements property properly. It defines a listener called SlingAuthenticatorServiceListener, which listens for services that set that property.

When a service changes that property, SlingAuthenticatorServiceListener creates an object called AuthenticationRequirementHolder and adds it to, or removes it from, the PathBasedHolderCache. An AuthenticationRequirementHolder is added or removed for each path in the sling.auth.requirements property.
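
The bookkeeping described above can be sketched in plain Java as follows. This is a simplified model, not Sling's code: AuthRequirementCacheSketch is hypothetical, the cache is modelled as a sorted map from path to a "requires authentication" flag, and the service reference held by the real holder is left out. The sketch assumes the documented Sling convention that an entry may be prefixed with "+" (authentication required, also the default) or "-" (authentication not required).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of how per-path requirements are added and removed.
final class AuthRequirementCacheSketch {

    // path -> true if authentication is required for that path
    private final Map<String, Boolean> holders = new TreeMap<>();

    /** Strips an optional "+" or "-" prefix from a sling.auth.requirements entry. */
    static String pathOf(String entry) {
        return (entry.startsWith("+") || entry.startsWith("-")) ? entry.substring(1) : entry;
    }

    /** Only a "-" prefix means that authentication is not required. */
    static boolean requiresAuth(String entry) {
        return !entry.startsWith("-");
    }

    // One holder is added per path listed by the registering service.
    void serviceRegistered(String... requirements) {
        for (String entry : requirements) {
            holders.put(pathOf(entry), requiresAuth(entry));
        }
    }

    // One holder is removed per path when the service goes away.
    void serviceUnregistered(String... requirements) {
        for (String entry : requirements) {
            holders.remove(pathOf(entry));
        }
    }

    // A changed property is handled as "remove the old paths, add the new ones".
    void serviceModified(String[] oldRequirements, String[] newRequirements) {
        serviceUnregistered(oldRequirements);
        serviceRegistered(newRequirements);
    }

    Map<String, Boolean> snapshot() {
        return new TreeMap<>(holders);
    }

    // Hypothetical example: register two paths, optionally replace them afterwards.
    static Map<String, Boolean> example(boolean modify) {
        AuthRequirementCacheSketch cache = new AuthRequirementCacheSketch();
        String[] first = {"/content/a", "-/content/a/public"};
        cache.serviceRegistered(first);
        if (modify) {
            cache.serviceModified(first, new String[] {"+/content/b"});
        }
        return cache.snapshot();
    }

    public static void main(String[] args) {
        System.out.println(example(false));
        System.out.println(example(true));
    }
}
```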

The whole transformation process of Closed User Groups is shown below. 

[Diagram: CUG transformation sequence]

Authentication on a publish instance

SlingAuthenticator is the main entry point for authenticating a request to SlingMainServlet. It holds a mapping between request paths and the corresponding authentication handlers, which are used to authenticate requests, but this is only the tip of the iceberg.

When SlingAuthenticator authenticates a request, it checks whether the request already contains a resource resolver. If a resource resolver is not present, security handling begins. This is a very complicated process which deserves its own blog post, but in short: if a user is not logged in, SlingAuthenticator checks whether anonymous access is allowed. This is when the list of AuthenticationRequirementHolders is retrieved from the PathBasedHolderCache object. An AuthenticationRequirementHolder holds the path to which it applies and a flag saying whether authentication is required. It also holds a reference to the service which registered the authentication requirement.
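
The anonymous-access check can be pictured as a longest-prefix lookup over the registered requirements. The plain Java sketch below is a simplified model under stated assumptions, not Sling's code: AnonymousAccessSketch is hypothetical, the requirements are a map from path to a "requires authentication" flag, the most specific matching path is assumed to win, and the fallback used when no path matches is passed in as a parameter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of deciding whether an anonymous request may proceed.
final class AnonymousAccessSketch {

    /** True if holderPath equals requestPath or is one of its ancestors. */
    static boolean isPrefix(String holderPath, String requestPath) {
        String prefix = holderPath.endsWith("/") ? holderPath : holderPath + "/";
        return requestPath.equals(holderPath) || requestPath.startsWith(prefix);
    }

    static boolean isAnonymousAllowed(Map<String, Boolean> requirements,
                                      String requestPath,
                                      boolean allowedByDefault) {
        String best = null;
        for (String path : requirements.keySet()) {
            // Keep the longest (most specific) path that covers the request.
            if (isPrefix(path, requestPath) && (best == null || path.length() > best.length())) {
                best = path;
            }
        }
        if (best == null) {
            return allowedByDefault; // no requirement covers this path
        }
        return !requirements.get(best); // anonymous is allowed only if auth is not required
    }

    // Hypothetical requirements: a protected section with one open sub-tree.
    static Map<String, Boolean> sample() {
        Map<String, Boolean> requirements = new LinkedHashMap<>();
        requirements.put("/content/secure", true);
        requirements.put("/content/secure/open", false);
        return requirements;
    }

    public static void main(String[] args) {
        System.out.println(isAnonymousAllowed(sample(), "/content/secure/page", true));
        System.out.println(isAnonymousAllowed(sample(), "/content/secure/open/page", true));
    }
}
```

Note that "/content/securex" is not covered by "/content/secure" in this model, because the prefix check works on whole path segments.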

If a matching Authentication Requirement entry is found and authentication is required, the user is redirected to the configured login page. When the user tries to log in, SlingAuthenticator tries to obtain an AuthenticationInfo object, which may be retrieved from AuthenticationHandler objects.

SlingAuthenticator takes the AuthenticationHandlers which are applicable for the requested path. The first AuthenticationHandler whose path is a prefix of the requested path will extract the user credentials from the request.

If the extractCredentials method returns a non-null AuthenticationInfo object, then the AuthenticationFeedbackHandler is set on that object and it is returned for further processing. SlingAuthenticator won’t invoke the rest of the AuthenticationHandlers, even if the first one wasn’t able to authenticate the user because, for example, the credentials were invalid.
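
The "first non-null result wins" behaviour can be sketched in plain Java as follows. This is a simplified model, not the Sling API: HandlerChainSketch and Handler are hypothetical, the AuthenticationInfo is reduced to a string, request parameters are a plain map, and the handler list is assumed to already be in the order in which SlingAuthenticator would try the handlers. A null result is assumed to let the next applicable handler try.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of how applicable handlers are consulted in turn.
final class HandlerChainSketch {

    static final class Handler {
        final String name;
        final String path;
        // Returns a stand-in for AuthenticationInfo, or null if no credentials were found.
        final Function<Map<String, String>, String> extractor;

        Handler(String name, String path, Function<Map<String, String>, String> extractor) {
            this.name = name;
            this.path = path;
            this.extractor = extractor;
        }
    }

    /** True if the handler path equals the request path or is one of its ancestors. */
    static boolean applies(String handlerPath, String requestPath) {
        String prefix = handlerPath.endsWith("/") ? handlerPath : handlerPath + "/";
        return requestPath.equals(handlerPath) || requestPath.startsWith(prefix);
    }

    static String extract(List<Handler> handlers, String requestPath,
                          Map<String, String> params, List<String> invoked) {
        for (Handler handler : handlers) {
            if (!applies(handler.path, requestPath)) {
                continue; // handler is registered for a different part of the tree
            }
            invoked.add(handler.name);
            String info = handler.extractor.apply(params);
            if (info != null) {
                return info; // stop here, even if the result signals a failed login
            }
        }
        return null; // nobody could extract credentials
    }

    // Hypothetical setup: a form handler for /content and a catch-all handler for "/".
    static String example(boolean withCredentials) {
        List<Handler> handlers = Arrays.asList(
                new Handler("form", "/content",
                        p -> p.containsKey("j_username") ? "FAIL_AUTH" : null),
                new Handler("other", "/apps", p -> "never-used"),
                new Handler("basic", "/", p -> "basic-user"));
        Map<String, String> params = new HashMap<>();
        if (withCredentials) {
            params.put("j_username", "alice");
        }
        List<String> invoked = new ArrayList<>();
        String info = extract(handlers, "/content/page", params, invoked);
        return info + " via " + invoked;
    }

    public static void main(String[] args) {
        System.out.println(example(true));
        System.out.println(example(false));
    }
}
```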

After that, all registered AuthenticationInfoPostProcessors are triggered, which can, for example, set additional properties on the AuthenticationInfo object.

The last step of authentication is to check the AuthenticationInfo type. There may be four cases:

  • If the type is set to AuthenticationInfo.DOING_AUTH – it means that the request is already authenticated or authentication is still in progress, and no further processing should be done.
  • If the type is set to AuthenticationInfo.FAIL_AUTH – it means that credentials were present in the request, but authentication has failed because, for example, the password was invalid. The user will stay on the login page.
  • If the type is missing – it means that the SlingAuthenticator wasn’t able to extract credentials and the resource resolver wasn’t present in the request. The request comes from an anonymous user, and the SlingAuthenticator needs to check if the requested path is available for anonymous users. If the path is available for anonymous users, then an anonymous resource resolver object will be set on the request.
  • If the type is present, but it’s not AuthenticationInfo.DOING_AUTH or AuthenticationInfo.FAIL_AUTH – it means that the user has authenticated successfully and, in further processing, the resource resolver object should be set on the request. Also, if the user was impersonated, the relevant information will be added to the AuthenticationInfo object.
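
The four cases above amount to a small decision function, sketched here in plain Java. It is a simplified model rather than Sling's code: AuthOutcomeSketch and the Outcome names are hypothetical, the type is reduced to a string (null when missing), and the login prompt for a path that is not open to anonymous users is inferred from the redirect described earlier.

```java
// Hypothetical model of the final decision made on the AuthenticationInfo type.
final class AuthOutcomeSketch {

    enum Outcome {
        STOP_PROCESSING,         // DOING_AUTH: already handled or still in progress
        STAY_ON_LOGIN_PAGE,      // FAIL_AUTH: credentials were present but invalid
        ANONYMOUS_RESOLVER,      // no credentials, path is open to anonymous users
        LOGIN_REQUIRED,          // no credentials, path requires authentication
        AUTHENTICATED_RESOLVER   // any other type: the user logged in successfully
    }

    static Outcome decide(String authType, boolean anonymousAllowedForPath) {
        if (authType == null) {
            return anonymousAllowedForPath ? Outcome.ANONYMOUS_RESOLVER : Outcome.LOGIN_REQUIRED;
        }
        switch (authType) {
            case "DOING_AUTH":
                return Outcome.STOP_PROCESSING;
            case "FAIL_AUTH":
                return Outcome.STAY_ON_LOGIN_PAGE;
            default:
                return Outcome.AUTHENTICATED_RESOLVER;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide("DOING_AUTH", false));
        System.out.println(decide("FAIL_AUTH", false));
        System.out.println(decide(null, true));
        System.out.println(decide("FORM", false));
    }
}
```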

Summary

As you can see, a lot happens during authentication processing in AEM, and this post does not cover all of it. I haven't described everything in detail, but I have discussed the basic objects used in authentication processing and shown how Closed User Groups, which are simple in theory, are transformed into the form used by SlingAuthenticator. If you are interested in further details, or you want to check the code, please refer to the source code of Apache Sling and Jackrabbit Oak: