The IT audit and compliance agendas of banks and insurance companies and of critical IT infrastructures definitely include “privilege xyz monitoring” topics such as “privileged user monitoring,” “privileged activity monitoring,” “privilege escalation prevention,” etc. In the stressful context of requiring a “real” audit success story, by passing a mainframe audit with honors, you need to be aware of the “philosophical dimensions” of these “privilege(d) xyz” aspects.
It’s of great benefit during any interviews with auditors to know exactly which of your implemented monitoring and detection capabilities focus on “privileged user monitoring” and which on “privilege escalation detection” or “privilege misuse detection.” SF-Sherlock, for example, has always been firmly committed to all kinds of “privileged xyz” monitoring aspects. The primary basis is its powerful real-time event monitoring. This also includes the detection of tricky dynamic manipulations, such as changes to sensitive control blocks residing in memory, in order to reduce an attacker’s chance of a successful privilege escalation.
Regardless of how you implement your “privilege xyz” monitoring, a success story requires the clearly understandable and complete tagging of users, resources, and actions, namely as privileged or non-privileged (regular). This tagging is done either before sending such information to a SIEM, or afterwards within the SIEM. This brings up another important decision: where, within a SIEM or SOC integration of the mainframe, is the tagging-related “intelligence” located? On the mainframe or inside the SIEM? Which way would be better? Who is most familiar with all details required to tag something as “privileged”? We strongly believe that in most IT environments, it’s the mainframe itself; it knows best about all details and therefore should send pre-classified events to the SIEM in use. Of course, as a vendor it might seem that we are not neutral, but here are some strong arguments that might convince you:
- It’s hard for the SIEM to “dig” on the mainframe for all the knowledge required to properly tag users, resources, and activities. Therefore, the mainframe has to collect and forward that knowledge to the SIEM anyway – at regular intervals.
- The mainframe always knows about its current status – it’s a kind of real-time knowledge. The SIEM’s knowledge depends on a continuous and thus very frequent update and refresh – which means potential overhead.
- Deciding on the “privilege” factor given for an event’s user, resource or operation is a complex task. Performing it directly within the event monitoring context is much more precise and efficient.
- By receiving “out of the box” classifications within each event, your SIEM solution may work with even easier and simpler selections for feeding the established dashboards, reports, and alerts.
Therefore, in case of a mainframe-located event monitoring, an important task is to implement an efficient and detailed enrichment of raw events with all the additionally required “privilege” information – that is not part of the native SMF, syslog, or other event record. SF-Sherlock and SF-NoEvasion users should know: In total, you get the benefit of two areas where automatic tagging is applied before an event is sent to your SIEM: a) the standard-policy-rule classification, combined with an alert level, and b) the “PRIV-USER=xxx”, “PRIV-RESOURCE=xxx” and “PRIV-OPERATION=xxx” tags.
While deciding on how and where to establish smart “privilege” tagging capabilities, questions like the following quickly arise: Do I need to apply all three tags, or can I start with just the privileged user tag, for example? And how redundant is the privileged resource tag when I tag privileged users or operations anyway? Some answers are easy, others are not. Therefore, let’s first share some “philosophical” thoughts about the privilege tagging’s principles:
- The privileged user tagging will definitely be part of your kick-off. Privileged user tagging is more or less quickly doable. It’s the “easiest” tag since you probably already know all relevant user IDs or at least how to determine them via technical criteria. Without a doubt, your security administrators, such as the RACF system administrators, are privileged users. But there are more, such as users with access permissions to critical resources. Compared to the process of privileged operation and privileged resource tagging, there is only some small redundancy between privileged user tagging and all the regular “critical event” filters.
- In case of privileged resources, you follow some best-practice audit agendas; some good proposals are identified easily. For example, your APF-authorized libraries definitely belong to the set of privileged resources. But maybe you need to also include some specific DB2 tables – ones that only you know about, and no tool is able to tag them properly “out of the box.”
- Compared to privileged user tagging, implementing privileged resource tagging may easily result in some high level of redundancy with all your regular filter traps. Monitoring APF library updates via specific filters is a good example. If you execute such a filter, one that focuses on “almost the same thing,” namely detecting any unauthorized system file update, you may use this existing filter’s evaluation (logic) also to efficiently assign the “privileged resource” or “privileged operation” tag. Integrating the classification into existing filters helps you avoid the raw event’s additional and redundant evaluation concerning these two privilege tags. SF-Sherlock pursues this highly efficient approach; therefore, it is able to efficiently evaluate and tag millions of events per minute.
- Privileged operations, as explained in more detail below, express something slightly more “complex” as compared to “just” privileged resources. The reason is that privileged operations actually represent the combination of both a privileged resource with a specific level of access. Typically, both are required and evaluated. This means that any “the access level was update or higher” filter criterion does not by itself already express any privileged operation – since not all updates are automatically critical or privileged or sensitive. It’s the combination of the way a resource is accessed (read, update, …) with a resource’s principal criticality. Updating an APF-authorized library, compared to a user’s CNTL file, makes this distinction clear quite easily.
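The idea from the bullets above, namely reusing an existing filter’s evaluation to assign the privilege tags in the same pass, can be illustrated with a minimal Python sketch. It is not SF-Sherlock’s filter language or implementation; the event fields (`resource`, `access`), the library list, and the function name are assumptions made purely for illustration. The access level names follow RACF’s usual levels.

```python
from typing import Optional

# RACF access levels that modify a data set (UPDATE or higher).
UPDATE_OR_HIGHER = {"UPDATE", "CONTROL", "ALTER"}

# Hypothetical, installation-specific list of APF-authorized libraries.
APF_LIBRARIES = {"SYS1.LINKLIB", "SYS1.SVCLIB"}


def apf_update_filter(event: dict) -> Optional[dict]:
    """Existing 'critical event' filter that also assigns privilege tags.

    The filter matches only when a critical resource is combined with a
    critical access level, so one evaluation yields both alert and tags.
    """
    is_apf = event["resource"] in APF_LIBRARIES
    is_update = event["access"] in UPDATE_OR_HIGHER
    if not (is_apf and is_update):
        return None  # ordinary file updated, or APF library only read
    return {
        "alert": "APF library update",
        "tags": {"PRIV-RESOURCE": "YES", "PRIV-OPERATION": "YES"},
    }
```

In this sketch, an update of a user’s CNTL file and a plain read of SYS1.LINKLIB both return `None`; only the update of an APF-authorized library produces the alert together with its tags.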
Here is some information for SF-Sherlock users: As an option, you are allowed to keep a lot of your privilege tagging controls in RACF instead of the init-deck. Inside the init-deck, we strongly recommend using external resource lists. You may implement an automated maintenance of your user and resource lists in both ways.
Best-practice experience has shown that another important decision concerns the format and content of the tags. First of all, the decision about how to “treat” the three privilege tags, i.e., on their values (content), depends on your goals, namely on what you want to express with such tags. Whether you can adjust the tags at all depends on the monitoring solution. Within SF-Sherlock, for example, you may adapt all standard tags. Before instantly acting, and deciding on the tags by yourself, you should ask your SIEM, SOC, or security monitoring teams about their detailed expectations. Just so that you better understand what we are talking about: The clearest, and at the same time easiest tag usage is to assign the “YES” string in case of a privileged candidate. Another “question of taste” comes up with your decision about how to tag non-privileged candidates. Several options are given: a) you assign “NO,” b) you simply don’t assign any tag value and leave it blank, or c) you even omit the complete tag-related information, i.e., the corresponding “PRIV-USER=xxx”, “PRIV-RESOURCE=xxx” and/or “PRIV-OPERATION=xxx” specification will be completely missing in this event. By default, SF-Sherlock applies the last method, option c), in its SHKI5877 event messages; this makes any “privilege aspect” immediately obvious to the reader.
Aside from simply differentiating between “yes” and “no,” you could define a set of more detailed values, such as “low,” “medium,” “high,” or similar. Again, before deciding and changing anything, you definitely need to discuss this first with all parties involved. This coordination is crucial, since the privilege tagging’s intention is to assist your SIEM and SOC teams in better understanding mainframe events and to simplify any required selections feeding their dashboards, reports, and alerts. Privilege tagging is one of several measures that are taken into account for making mainframe events easier to understand for non-mainframe specialists.
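To make the three options for non-privileged candidates more tangible, here is a small Python sketch that renders the same classification result in each style. The function name, the `style` parameter, and the space-separated layout are assumptions for illustration only; they do not describe the actual layout of SF-Sherlock’s event messages.

```python
from typing import Dict

TAG_NAMES = ("PRIV-USER", "PRIV-RESOURCE", "PRIV-OPERATION")


def render_priv_tags(flags: Dict[str, bool], style: str = "omit") -> str:
    """Render the three privilege tags as a space-separated string.

    style "no"    -> non-privileged candidates get TAG=NO   (option a)
    style "blank" -> non-privileged candidates get TAG=     (option b)
    style "omit"  -> non-privileged candidates are left out (option c)
    """
    if style not in ("no", "blank", "omit"):
        raise ValueError(f"unknown style: {style}")
    parts = []
    for name in TAG_NAMES:
        if flags.get(name, False):
            parts.append(f"{name}=YES")
        elif style == "no":
            parts.append(f"{name}=NO")
        elif style == "blank":
            parts.append(f"{name}=")
        # style "omit": nothing is emitted for a non-privileged candidate
    return " ".join(parts)
```

For an event with only a privileged user, the “omit” style yields just `PRIV-USER=YES`, while the “no” style always yields all three tags. A graded scheme (“low,” “medium,” “high”) would replace the boolean flags with string values, which is exactly the kind of change to agree on with your SIEM team first.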
How do the given three privilege tags coexist with one another? Some additional conceptual thoughts can help you better understand how to apply these three privilege tags effectively for achieving a proper “privilege xyz” monitoring, reporting, and alerting:
- Privileged users: What makes a user privileged? Who are my privileged users? Several, maybe even a lot of reasons are possible that make a user “privileged:” a) high-level privileges in RACF (or CA-TSS or CA-ACF2), such as SPECIAL, OPERATIONS, or similar, b) critical access authority for any sensitive resources, such as update authority for any APF load library, c) the permission to use critical utilities or functions, such as issuing start and stop system commands, d) having access to critical objects of your business, and e) other reasons only your installation knows or worries about.
It’s important to understand that a user becomes privileged due to their permissions and authorities, not due to their actions, i.e., regardless of any effectively performed accesses. It’s enough that the user is allowed to do something.
As long as the reason for classifying a user as privileged is of a technical nature, you may use technical selection capabilities to determine such users automatically. The fallback for all other scenarios is given with creating corresponding “user lists.” Such a list is the vehicle to register relevant candidates manually or automatically; you keep such lists in a safe place within the mainframe monitoring solution, such as in SF-Sherlock’s configuration file, the “init-deck,” or in RACF.
- Privileged resources: Maybe “critical” would be another good term, but it’s more convenient to keep all tags and their names in the common “privileged” terminology context. So what makes a resource privileged? It’s simply a question of how critical it is, how many restrictions you have to apply concerning its access, and other questions. APF-authorized libraries are typically privileged resources on a mainframe. As long as the classification context is of a technical nature, you can use technical query and selection capabilities to automatically identify such privileged (system) resources; for example, in case of SF-Sherlock, you may refer to automatically maintained system load library lists. The fallback for identifying all other relevant resources is given with corresponding “resource lists.” Such a set of lists is the vehicle to register relevant candidates manually or automatically – also kept in a safe place and included in an audit.
While effectively reviewing your resources, you quickly recognize that considering the isolated resource is too simple and not enough. The privileged resource tag has “conceptual limitations.” It only works for highly critical resources. Therefore, the privileged operation tag is required to close that “gap”; it defines a kind of context by combining resource and the level of access.
- Privileged operations: Maybe “critical operation” would be another good term, but it was our intention to keep all tags and their names in the common “privileged” terminology context. Compared to the privileged resource, the privileged operation tag also takes the kind of access to that resource into account, i.e., the access level or type.
Combining both is important and reasonable since in the end it’s the combination of operation and resource, and not just the resource or just the access level, that moves any activity into the arena of “privileged events.” For example, just reading the SYS1.LINKLIB load library, which is APF-authorized, or executing any of its modules, is not necessarily a privileged operation, even though the resource, namely the APF-authorized library SYS1.LINKLIB, is sensitive and critical. It’s the combination of both critical resource and critical access type/level that makes an operation privileged. In case of an APF-authorized library it’s an update operation, for example, that makes this event a privileged operation. In some cases, the resource is already so critical that any kind of access immediately lets the operation become privileged; the RACF database is a good example. Accordingly, it’s enough that you only select such a critical resource and ignore the level of access. But this is rarely the case.
In short, a “privileged” operation can be a) any read access to highly sensitive data sets, such as to the RACF database, b) an update access to any sensitive load library, such as updating an APF-authorized library, c) executing sensitive functions or features, such as issuing a critical system command (instead of just a display command), or d) other critical operations that only your installation knows or worries about. If the operational context is of a technical nature, you may apply technical selection capabilities to identify a privileged resource or operation; for example, via internally and automatically maintained system load library lists. The access level is either implicitly given by the type of event or you work in detail with the relevant access levels differentiated by RACF’s resource access audit events. The fallback for identifying all relevant resources is a “resource list;” it’s the vehicle to register relevant candidates manually or automatically as lists – normally they are kept in a safe place, such as in the init-deck in case of SF-Sherlock.
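The relationship between the three classifications described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resource list with its per-resource access thresholds, the data set name used for the RACF database, the manually maintained user list, and all function and class names are hypothetical. Only the RACF access level names and the SPECIAL and OPERATIONS attributes are taken from RACF itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# RACF access levels in ascending order.
ACCESS_LEVELS = ["NONE", "EXECUTE", "READ", "UPDATE", "CONTROL", "ALTER"]

# Hypothetical resource list: the lowest access level at which an access
# to the resource counts as a privileged operation.
PRIVILEGED_RESOURCES: Dict[str, str] = {
    "SYS1.LINKLIB": "UPDATE",  # APF library: reading is not privileged
    "SYS1.RACF": "READ",       # assumed RACF database name: any read counts
}

HIGH_LEVEL_ATTRIBUTES = {"SPECIAL", "OPERATIONS"}
MANUAL_USER_LIST = {"PAYBATCH"}  # hypothetical manually registered user


@dataclass
class UserProfile:
    user_id: str
    attributes: Set[str] = field(default_factory=set)
    permissions: Dict[str, str] = field(default_factory=dict)  # resource -> level


def is_privileged_operation(resource: str, access: str) -> bool:
    """Resource criticality combined with the access level."""
    threshold = PRIVILEGED_RESOURCES.get(resource)
    if threshold is None:
        return False
    return ACCESS_LEVELS.index(access) >= ACCESS_LEVELS.index(threshold)


def is_privileged_user(profile: UserProfile) -> bool:
    """A user is privileged by permission, not by what was actually done."""
    if profile.user_id in MANUAL_USER_LIST:
        return True
    if profile.attributes & HIGH_LEVEL_ATTRIBUTES:
        return True
    return any(
        is_privileged_operation(resource, level)
        for resource, level in profile.permissions.items()
    )
```

A user with ALTER authority for SYS1.LINKLIB is classified as privileged here even if no update ever takes place, whereas a user with only READ authority for the same library is not.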
Why is it smart to focus, from the beginning onwards, with separate tags on both “privileged operations” and “privileged resources?” Because this approach is more general and lets you address all scenarios in a sharpened and significant way. Even though all three privilege tags are conceptually linked, you have to keep them strictly separated. This is especially true for privileged users and privileged resources/operations. For example, it’s a given user’s access permission to any privileged resource that lets this user become privileged. Be aware that we are talking here about the principal permission to do something – that thing does not necessarily have to happen. If he or she has the principal and theoretical right to do something, a user becomes a privileged one. Reviewing and auditing users’ privileges and roles will prevent your IT from having unreasonable privileged users. Therefore, to pass an audit of your privileged user monitoring successfully, you need to have a well-documented catalog of privileges in your IT and business that lets a user become a privileged one. Even better than having these catalogs just on paper or in Excel sheets are lists that can be a) referred to directly by your monitoring process or b) even be created and maintained automatically.
Even though pure resource-related authorities already let a user become privileged, and privileged operations focus on such critical resources as well, the two tags are not the same, but rather express different “things.” Therefore, to achieve very powerful privileged user monitoring, appreciated by auditors with “praise,” it will be expected that you have a separate focus on both users as well as operations (and thus resources). And this is reasonable: a privileged user, for example, is not doing privileged or critical operations, such as accessing critical resources in sensitive ways, all day long. He or she, or the technical user, is also performing trivial operations such as issuing a “display time” command or similar, or just browsing a private data set. Of course, if a privileged user is performing a trivial action, this action does not become a privileged operation just because the user was privileged. To conclude, if an auditor asks for your “privileged user” monitoring, it also needs to address such users’ trivial actions. Therefore, keeping users and operations separate from one another is necessary.
Now let’s summarize and apply this knowledge to a given event technology, such as SF-Sherlock’s logical trap concept: Classifying users and classifying resources and operations as “privileged” are two different things – even though they are closely connected. As a consequence, a given event may have just a privileged user tag but no privileged operation or resource tags, or vice-versa. Maybe most of your events will even include none of these three tags. And the combination of an “UNprivileged user” with a “privileged operation” can even be used to derive an alert or review-relevant incident directly. Why? Because it points to a kind of mismatch: either the event identifies an attacker, i.e., a regular user (suddenly) performing privileged operations, or the user’s privilege classification is improper. In case of a violation, this is easily clarified: the operation was privileged, but the user was not. Only in case of a successful operation does the given user, not tagged as “privileged,” become a candidate for a review.
Another important aspect is the visual appearance of the privilege tags within the event (message). To keep events as clear as possible, and to make any “privilege factor” obvious right away, we strongly recommend leaving a privilege tag blank if it does not deal with a “privileged candidate”; do this instead of using a tag such as “NO.” As mentioned above, that’s how we provide SF-Sherlock’s standard event: The tag remains blank and we go even one step further; the event message will not even include the corresponding “PRIV-xxx=” part with just nothing behind the “=” character. The great benefit of this concept is the significantly higher recognition value: if you see any “PRIV-” string, it’s immediately clear to you that this event is “special”; this “instant significance” is lost if all events, including the simplest ones, always include all three “PRIV-xxx” details with just “NO” or empty values. The only party that might potentially prefer “PRIV-xxx=NO” tags instead of a blank value or even completely omitting the tag is your SIEM team. Why? Because they have to set up queries and filters to feed dashboards, reports, and alerts. Depending on the solution’s query function, it may seem “complicated” at first glance to work with tags that do not always occur in an event. But it’s not so bad. In such a case, the SIEM solution simply enriches events, coming from the mainframe, with corresponding default values.
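The mismatch rule just described, an unprivileged user combined with a privileged operation, can be expressed as a short Python sketch. The `outcome` values and the returned labels are hypothetical names chosen for this illustration; how a real monitoring solution or SIEM represents success and violation will differ.

```python
from typing import Optional


def classify_mismatch(tags: dict, outcome: str) -> Optional[str]:
    """Derive a follow-up from the privilege tags of a single event.

    tags    -- privilege tags found in the event; a missing tag means
               the candidate is not privileged
    outcome -- "SUCCESS" or "VIOLATION" (assumed values)
    """
    privileged_operation = tags.get("PRIV-OPERATION") == "YES"
    privileged_user = tags.get("PRIV-USER") == "YES"
    if not privileged_operation or privileged_user:
        return None  # no mismatch between user and operation
    if outcome == "SUCCESS":
        # The operation succeeded although the user is not tagged as
        # privileged: review the user's classification or investigate.
        return "REVIEW_USER"
    # The attempt was rejected: operation privileged, user was not.
    return "VIOLATION"
```

An event carrying only `PRIV-OPERATION=YES` with a successful outcome yields `REVIEW_USER`; the same event with both tags set yields no finding.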
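On the SIEM side, filling in defaults for omitted or blank tags is a small step. The following Python sketch shows one possible way; the sample message text is invented, and the assumption that tags appear as `PRIV-xxx=VALUE` tokens inside the message is ours, not a description of the real SHKI5877 layout.

```python
import re
from typing import Dict

TAG_NAMES = ("PRIV-USER", "PRIV-RESOURCE", "PRIV-OPERATION")

# Matches e.g. "PRIV-USER=YES"; the value part may be empty ("PRIV-USER=").
TAG_PATTERN = re.compile(r"\b(PRIV-(?:USER|RESOURCE|OPERATION))=([A-Za-z]*)")


def normalize_priv_tags(message: str, default: str = "NO") -> Dict[str, str]:
    """Return all three privilege tags, using a default for missing ones."""
    tags = {name: default for name in TAG_NAMES}
    for name, value in TAG_PATTERN.findall(message):
        if value:  # blank values keep the default
            tags[name] = value
    return tags
```

With this enrichment in place, a query such as “PRIV-USER equals NO” works on every event, regardless of whether the mainframe omitted the tag, left it blank, or sent an explicit value.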
Let’s conclude: Privileged user monitoring and privilege escalation detection definitely make your IT secure and strong, and if they work properly, your auditors are happy. When implementing both, it’s crucial to understand the individual purpose and intention of each “privilege xyz” tag. Cooperate with your audit and SIEM teams when deciding on the details. These are “your customers” and they need to be able to define the required queries in order to properly feed their dashboards, reports, and alerts. When your auditor arrives, be prepared for detailed questions about how these tags are defined, how the classification “machine” behind the curtain works, and how the related processes and lists are secured, regularly reviewed, and audited. If this is the case, your chances for a success story are pretty high. Good luck!