IT Security Forum
IT Security and Compliance. We take it to the max.
If you look back along the evolutionary steps of mainframe security, APF libraries play a leading role – due to their “superpower.” Until the 1980s, “almost anyone” working on a mainframe was able and allowed to define one themselves. In most cases, there was no APF library protection at all. Then there was a phase where APF-related auditing received more and more attention, and correspondingly became an important audit issue.
The attention that “APF” has received as a security risk has continued to increase over time. Today, it has almost reached the highest level of awareness: only a very few members of a company’s mainframe team are allowed to define a new APF library or update existing ones. Any such action requires prior permission, not just some documentation after it has happened. On our customer visits, we have seen companies where a new APF library requires not only an official change request, but up to “5 signatures.” Otherwise, you lose your job. Correspondingly, relentless monitoring and compliance reporting have become standard for the “APF” risk, resulting in real-time security alerts by a SIEM if corresponding rules are bypassed.
So far, so good. Now that there is great awareness of “APF,” the question is whether the entire mainframe security mission has been accomplished. Or what’s the next superpower, following “APF,” that mainframe users need to focus on?
Based on our worldwide penetration testing experience, we have determined that “System REXX” and “BCPii” are two further members of the superpower league; both are good candidates for becoming the next “big focus.” In recent years, both z/OS features have been improved so that they are now “easy-to-use” functions. But there is no free lunch. As a consequence, highly critical operations became minimally complex, and you have to “pay” the price for setting up gap-free security measures. User-friendly and easy-to-use superpower features are an invitation to attackers. Complexity is a kind of protection. Compared to assembler programming or disassembling machine code, REXX programming is pretty trivial!
This is why SF-Sherlock focuses intensively on both of these areas. Please feel free to contact us to discuss additional details of what is necessary to properly protect System REXX and BCPii.
Successfully implementing “privileged user monitoring,” “privileged activity monitoring,” and “privilege escalation prevention” on your mainframe
The IT audit and compliance agendas of banks and insurance companies and of critical IT infrastructures definitely include “privilege xyz monitoring” topics such as “privileged user monitoring,” “privileged activity monitoring,” “privilege escalation prevention,” etc. In the stressful context of requiring a “real” audit success story, by passing a mainframe audit with honors, you need to be aware of the “philosophical dimensions” of these “privilege(d) xyz” aspects.
It’s of great benefit during any interviews with auditors to know exactly which of your implemented powerful and streamlined monitoring and detection capabilities focus on “privileged user monitoring” and which on “privilege escalation detection” or “privilege misuse detection.” SF-Sherlock, for example, has always been firmly committed to all kinds of “privileged xyz” monitoring aspects. The primary basis is its powerful real-time event monitoring. This also includes detecting tricky dynamic manipulations, such as changes to sensitive control blocks residing in memory, in order to reduce an attacker’s chance of a successful privilege escalation.
Regardless of how you implement your “privilege xyz” monitoring, a success story requires the clearly understandable and complete tagging of users, resources, and actions, namely as privileged or non-privileged (regular). This tagging is done either before sending such information to a SIEM, or afterwards within the SIEM. This brings up another important decision: where, within a SIEM or SOC integration of the mainframe, is the tagging-related “intelligence” located? On the mainframe or inside the SIEM? Which way would be better? Who is most familiar with all details required to tag something as “privileged”? We strongly believe that in most IT environments, it’s the mainframe itself; it knows best about all details and therefore should send already classified events to the SIEM in use. Of course, as a vendor it might seem that we are not neutral, but here are some strong arguments that might convince you:
- It’s hard for the SIEM to “dig” on the mainframe for all the knowledge required to properly tag users, resources, and activities. Therefore, the mainframe has to collect and forward that knowledge to the SIEM anyway – at regular intervals.
- The mainframe always knows about its current status – it’s a kind of real-time knowledge. The SIEM’s knowledge depends on a continuous and thus very frequent update and refresh – which means potential overhead.
- Deciding on the “privilege” factor given for an event’s user, resource or operation is a complex task. Performing it directly within the event monitoring context is much more precise and efficient.
- By receiving “out of the box” classifications within each event, your SIEM solution may work with even easier and simpler selections for feeding the established dashboards, reports, and alerts.
Therefore, with mainframe-located event monitoring, an important task is to enrich raw events efficiently and in detail with all the additional “privilege” information that is not part of the native SMF, syslog, or other event record. SF-Sherlock and SF-NoEvasion users should know: In total, you get the benefit of two areas where automatic tagging is applied before an event is sent to your SIEM: a) the standard-policy-rule classification, combined with an alert level, and b) the “PRIV-USER=xxx”, “PRIV-RESOURCE=xxx” and “PRIV-OPERATION=xxx” tags.
While deciding on how and where to establish smart “privilege” tagging capabilities, questions like the following quickly arise: Do I need to apply all three tags, or can I start with just the privileged user tag, for example? And how redundant is the privileged resource tag when I tag privileged users or operations anyway? Some answers are easy, others are not. Therefore, let’s first share some “philosophical” thoughts about the principles of privilege tagging:
- The privileged user tagging will definitely be part of your kick-off. Privileged user tagging is more or less quickly doable. It’s the “easiest” tag since you probably already know all relevant user IDs or at least how to determine them via technical criteria. Without a doubt, your security administrators, such as the RACF system administrators, are privileged users. But there are more, such as users with access permissions to critical resources. Compared to the process of privileged operation and privileged resource tagging, there is only some small redundancy among the privileged user’s tagging and all the regular “critical event” filters.
- In case of privileged resources, you follow some best-practice audit agendas; some good proposals are identified easily. For example, your APF-authorized libraries definitely belong to the set of privileged resources. But maybe you need to also include some specific DB2 tables – ones that only you know about, and no tool is able to tag them properly “out of the box.”
- Compared to privileged user tagging, implementing privileged resource tagging may easily result in some high level of redundancy with all your regular filter traps. Monitoring APF library updates via specific filters is a good example. If you execute such a filter, one that focuses on “almost the same thing,” namely detecting any unauthorized system file update, you may use this existing filter’s evaluation (logic) also to efficiently assign the “privileged resource” or “privileged operation” tag. Integrating the classification into existing filters helps you avoid the raw event’s additional and redundant evaluation concerning these two privilege tags. SF-Sherlock pursues this highly efficient approach; therefore, it is able to efficiently evaluate and tag millions of events per minute.
- Privileged operations, as explained in more detail below, express something slightly more “complex” as compared to “just” privileged resources. The reason is that privileged operations actually represent the combination of a privileged resource with a specific level of access. Typically, both are required and evaluated. This means that any “the access level was update or higher” filter criterion does not by itself already express a privileged operation, since not all updates are automatically critical or privileged or sensitive. It’s the combination of the way a resource is accessed (read, update, …) with a resource’s principal criticality. Updating an APF-authorized library, compared to a user’s CNTL file, makes this distinction clear quite easily.
Here is some information for SF-Sherlock users: As an option, you are allowed to keep a lot of your privilege tagging controls in RACF instead of the init-deck. Inside the init-deck, we strongly recommend using external resource lists. You may implement an automated maintenance of your user and resource lists in both ways.
Best-practice experience has shown that another important decision concerns the format and content of the tags. First of all, how you “treat” the three privilege tags, i.e., which values (content) they carry, depends on your goals, namely on what you want to express with such tags. Whether you can adjust the tags at all depends on the monitoring solution. Within SF-Sherlock, for example, you may adapt all standard tags. Before acting and deciding on the tags by yourself, you should ask your SIEM, SOC, or security monitoring teams about their detailed expectations. Just so that you better understand what we are talking about: The clearest, and at the same time easiest, tag usage is to assign the “YES” string in case of a privileged candidate. Another “question of taste” comes up with your decision about how to tag non-privileged candidates. Several options are given: a) you assign “NO,” b) you simply don’t assign any tag value and leave it blank, or c) you even omit the complete tag-related information, i.e., the corresponding “PRIV-USER=xxx”, “PRIV-RESOURCE=xxx” and/or “PRIV-OPERATION=xxx” specification will be completely missing in this event. By default, SF-Sherlock applies the last method in its SHKI5877 event messages; this makes any “privilege aspect” immediately obvious to the reader.
Aside from simply differentiating between “yes” and “no,” you could define a set of more detailed values, such as “low,” “medium,” “high,” or similar. Again, before deciding and changing anything, you definitely need to discuss this first with all parties involved. This coordination is crucial, since the privilege tagging’s intention is to assist your SIEM and SOC teams in better understanding mainframe events and to simplify any required selections feeding their dashboards, reports, and alerts. Privilege tagging is one of several measures that are taken into account for making mainframe events easier to understand for non-mainframe specialists.
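The three ways of writing a non-privileged candidate can be compared side by side in a minimal Python sketch. It is an illustration only and not SF-Sherlock code: the function name, the enum, and the space-separated layout are assumptions; only the tag names and the “YES”/“NO” values come from the text above.

```python
from enum import Enum

class NonPrivStyle(Enum):
    NO = "no"        # option a) write PRIV-xxx=NO
    BLANK = "blank"  # option b) write PRIV-xxx= with an empty value
    OMIT = "omit"    # option c) leave the tag out of the event entirely

TAG_NAMES = ("PRIV-USER", "PRIV-RESOURCE", "PRIV-OPERATION")

def render_priv_tags(flags, style=NonPrivStyle.OMIT):
    """Render the privilege tags of one event as a space-separated string.

    flags maps a tag name to True if the candidate is privileged.
    """
    parts = []
    for name in TAG_NAMES:
        if flags.get(name, False):
            parts.append(f"{name}=YES")
        elif style is NonPrivStyle.NO:
            parts.append(f"{name}=NO")
        elif style is NonPrivStyle.BLANK:
            parts.append(f"{name}=")
        # NonPrivStyle.OMIT: nothing is written for this tag
    return " ".join(parts)

flags = {"PRIV-USER": True}
print(render_priv_tags(flags))                   # PRIV-USER=YES
print(render_priv_tags(flags, NonPrivStyle.NO))  # PRIV-USER=YES PRIV-RESOURCE=NO PRIV-OPERATION=NO
```

With the omit style, an event without any privileged candidate carries no “PRIV-” string at all, which is what makes the tagged events stand out.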
How do the given three privilege tags coexist with one another? Some additional conceptual thoughts can help you better understand how to apply these three privilege tags effectively for achieving a proper “privilege xyz” monitoring, reporting, and alerting:
- Privileged users: What makes a user privileged? Who are my privileged users? Several, maybe even a lot of reasons are possible that make a user “privileged:” a) high-level privileges in RACF (or CA-TSS or CA-ACF2), such as SPECIAL, OPERATIONS, or similar, b) critical access authority for any sensitive resources, such as update authority for any APF load library, c) the permission to use critical utilities or functions, such as issuing start and stop system commands, d) having access to critical objects of your business, and e) other reasons only your installation knows or worries about.
It’s important to figure out that a user becomes privileged due to their permissions and authorities, not due to their actions, i.e., regardless of any effectively performed accesses. It’s enough that the user is allowed to do something.
As long as the reason for classifying a user as privileged is of a technical nature, you may use technical selection capabilities to determine such users automatically. The fallback for all other scenarios is given with creating corresponding “user lists.” Such a list is the vehicle to register relevant candidates manually or automatically; you keep such lists in a safe place within the mainframe monitoring solution, such as in SF-Sherlock’s configuration file, the “init-deck,” or in RACF.
- Privileged resources: Maybe “critical” would be another good term, but it’s more convenient to keep all tags and their names in the common “privileged” terminology context. So what makes a resource privileged? It’s simply a question of how critical it is, how many restrictions you have to apply concerning its access, and other questions. APF-authorized libraries are typically privileged resources on a mainframe. As long as the classification context is of a technical nature, you can use technical query and selection capabilities to automatically identify such privileged (system) resources; for example, in case of SF-Sherlock, you may refer to automatically maintained system load library lists. The fallback for identifying all other relevant resources is given with corresponding “resource lists.” Such a set of lists is the vehicle to register relevant candidates manually or automatically – also kept in a safe place and included in an audit.
While effectively reviewing your resources, you quickly recognize that considering the isolated resource is too simple and not enough. The privileged resource tag has “conceptual limitations.” It only works for highly critical resources. Therefore, the privileged operation tag is required to close that “gap”; it defines a kind of context by combining resource and the level of access.
- Privileged operations: Maybe “critical operation” would be another good term, but it was our intention to keep all tags and their names in the common “privileged” terminology context. Compared to the privileged resource, the privileged operation tag also takes the kind of access to that resource into account, i.e., the access level or type.
Combining both is important and reasonable since in the end it’s the combination of operation and resource, and not just the resource or just the access level, that moves any activity into the arena of “privileged events.” For example, just reading the SYS1.LINKLIB load library, which is APF-authorized, or executing any of its modules, is not necessarily a privileged operation, even though the resource, namely the APF-authorized library SYS1.LINKLIB, is sensitive and critical. It’s the combination of both critical resource and critical access type/level that makes an operation privileged. In case of an APF-authorized library it’s an update operation, for example, that makes this event a privileged operation. In some cases, the resource is already so critical that any kind of access immediately lets the operation become privileged; the RACF database is a good example. Accordingly, it’s enough that you only select such a critical resource and ignore the level of access. But this is rarely the case.
In short, a “privileged” operation can be a) any read access to highly sensitive data sets, such as to the RACF database, b) an update access to any sensitive load library, such as updating an APF-authorized library, c) executing sensitive functions or features, such as issuing a critical system command (instead of just a display command), or d) other critical operations that only your installation knows or worries about. If the operational context is of a technical nature, you may apply technical selection capabilities to identify a privileged resource or operation; for example, via internally and automatically maintained system load library lists. The access level is either implicitly given by the type of event or you work in detail with the relevant access levels differentiated by RACF’s resource access audit events. The fallback for identifying all relevant resources is a “resource list;” it’s the vehicle to register relevant candidates manually or automatically as lists – normally they are kept in a safe place, such as in the init-deck in case of SF-Sherlock.
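The combination of resource and access level can be sketched as a small resource list that stores, per privileged resource, the lowest access level that makes an operation privileged. This is a hedged Python illustration, not the SF-Sherlock implementation: the function and table names are assumptions, and “SYS1.RACF” merely stands in for whatever your RACF database is called. The ordering of the access levels follows RACF’s usual hierarchy.

```python
from enum import IntEnum

class Access(IntEnum):
    # RACF access levels in ascending order of authority
    EXECUTE = 1
    READ = 2
    UPDATE = 3
    CONTROL = 4
    ALTER = 5

# Hypothetical resource list: resource name -> lowest access level
# that turns an access into a privileged operation.
PRIVILEGED_RESOURCES = {
    "SYS1.LINKLIB": Access.UPDATE,  # APF-authorized library: update or higher
    "SYS1.RACF": Access.EXECUTE,    # placeholder RACF database name: any access
}

def classify(resource, access):
    """Return (is_privileged_resource, is_privileged_operation)."""
    threshold = PRIVILEGED_RESOURCES.get(resource)
    is_priv_resource = threshold is not None
    is_priv_operation = is_priv_resource and access >= threshold
    return is_priv_resource, is_priv_operation

print(classify("SYS1.LINKLIB", Access.READ))    # (True, False)
print(classify("SYS1.LINKLIB", Access.UPDATE))  # (True, True)
print(classify("USER1.CNTL", Access.UPDATE))    # (False, False)
```

Reading SYS1.LINKLIB touches a privileged resource without being a privileged operation, while updating a user’s CNTL file is neither, which mirrors the examples given above.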
Why is it smart to focus, from the beginning onwards, with separate tags on both “privileged operations” and “privileged resources?” Because this approach is more general and lets you address all scenarios in a sharpened and significant way. Even though all three privilege tags are conceptually linked, you have to keep them strictly separated. This is especially true for privileged users and privileged resources/operations. For example, it’s a given user’s access permission to any privileged resource that lets this user become privileged. Be aware that we are talking here about the principal permission to do something – that thing does not necessarily have to happen. If he or she has the principal and theoretical right to do something, a user becomes a privileged one. Reviewing and auditing users’ privileges and roles will prevent your IT from having users who are privileged without good reason. Therefore, to pass an audit of your privileged user monitoring successfully, you need to have a well-documented catalog of privileges in your IT and business that lets a user become a privileged one. Even better than having these catalogs just on paper or in Excel sheets are lists that can be a) referred to directly by your monitoring process or b) even be created and maintained automatically.
Even though pure resource-related authorities already let a user become privileged, and privileged operations also focus on such critical resources, the two tags are not the same, but rather express different “things.” Therefore, to achieve very powerful privileged user monitoring, appreciated by auditors with “praise,” it will be expected that you have a separate focus on both users and operations (and thus resources). And this is reasonable: a privileged user, for example, is not doing privileged or critical operations, such as accessing critical resources in sensitive ways, all day long. He or she, or the technical user, is also performing trivial operations such as issuing a “display time” command or similar, or just browsing a private data set. Of course, if a privileged user is performing a trivial action, this action does not become a privileged operation just because the user was privileged. To conclude, if an auditor asks for your “privileged user” monitoring, it also needs to address such users’ trivial actions. Therefore, keeping users and operations separate from one another is necessary.
Now let’s summarize and apply this knowledge to a given event technology, such as SF-Sherlock’s logical trap concept: Classifying users and classifying resources and operations as “privileged” are two different things, even though they are closely connected. As a consequence, a given event may have just a privileged user tag but no privileged operation or resource tags, or vice-versa. Maybe most of your events will even include none of these three tags. And the combination of an “UNprivileged user” with a “privileged operation” can even be used to derive an alert or review-relevant incident directly. Why? Because it points to a kind of mismatch: either the event identifies an attacker, i.e., a regular user (suddenly) performing privileged operations, or the user’s privilege classification is improper. In case of a violation, this is easily clarified: the operation was privileged, but the user was not. Only in case of a successful operation does the given user, not tagged as “privileged,” become a candidate for a review.
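The mismatch rule can be written down in a few lines. The following Python sketch is an assumption-laden illustration, not SF-Sherlock’s trap logic: the event fields and the outcome names (“REVIEW”, “ALERT”, “NONE”) are invented for this example, and treating a failed attempt as an alert is one possible reading of the text.

```python
from dataclasses import dataclass

@dataclass
class Event:
    user: str
    priv_user: bool       # user is tagged as privileged
    priv_operation: bool  # operation is tagged as privileged
    successful: bool      # False means the access was rejected (a violation)

def assess(event):
    """Derive a follow-up from the combination of the privilege tags."""
    if event.priv_operation and not event.priv_user:
        if event.successful:
            # An untagged user really performed a privileged operation:
            # either an attacker or an improper user classification.
            return "REVIEW"
        # The attempt was rejected: privileged operation, unprivileged user.
        return "ALERT"
    return "NONE"

print(assess(Event("USER1", False, True, True)))    # REVIEW
print(assess(Event("USER1", False, True, False)))   # ALERT
print(assess(Event("SYSADM1", True, True, True)))   # NONE
```

A privileged user performing a trivial action falls through to “NONE” here; whether such events are still reported is a separate decision of your privileged user monitoring.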
Another important aspect is the visual appearance of the privilege tags within the event (message). To keep events as clear as possible, and to make any “privilege factor” obvious right away, we strongly recommend leaving a privilege tag blank if it does not deal with a “privileged candidate”; do this instead of using a tag such as “NO.” As mentioned above, that’s how we provide SF-Sherlock’s standard event: The tag remains blank and we go even one step further; the event message will not even include the corresponding “PRIV-xxx=” part with just nothing behind the “=” character. The great benefit of this concept is the significantly higher recognition value: if you see any “PRIV-” string, it’s immediately clear to you that this event is “special”; this “instant significance” is lost if all events, including the simplest ones, always include all three “PRIV-xxx” details with just “NO” or empty values. The only party that might potentially prefer “PRIV-xxx=NO” tags instead of a blank value or even completely omitting the tag is your SIEM team. Why? Because they have to set up queries and filters to feed dashboards, reports, and alerts. Depending on the solution’s query function, it may seem “complicated” at first glance to work with tags that do not always occur in an event. But it’s not so bad. In such a case, the SIEM solution simply enriches events, coming from the mainframe, with corresponding default values.
Let’s conclude: Privileged user monitoring and privilege escalation detection definitely make your IT secure and strong, and if they work properly, your auditors are happy. When implementing both, it’s crucial to understand the individual purpose and intention of each “privilege xyz” tag. Cooperate with your audit and SIEM teams when deciding on the details. These are “your customers” and they need to be able to define the required queries in order to properly feed their dashboards, reports, and alerts. When your auditor arrives, be prepared for detailed questions about how these tags are defined, how the classification “machine” behind the curtain works, and how the related processes and lists are secured, regularly reviewed, and audited. If this is the case, your chances for a success story are pretty high. Good luck!
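On the SIEM side, filling in defaults for missing tags takes very little logic. The Python sketch below shows the idea under stated assumptions: the message body after SHKI5877 is a made-up layout, the function name is hypothetical, and a real SIEM would do this in its own parsing or enrichment language rather than in Python.

```python
import re

PRIV_TAGS = ("PRIV-USER", "PRIV-RESOURCE", "PRIV-OPERATION")
_TAG_RE = re.compile(r"\b(PRIV-(?:USER|RESOURCE|OPERATION))=(\S*)")

def priv_fields(message, default="NO"):
    """Extract the three privilege tags, using a default for missing or blank ones."""
    fields = {name: default for name in PRIV_TAGS}
    for name, value in _TAG_RE.findall(message):
        fields[name] = value or default
    return fields

# Hypothetical event text; only the message ID and tag names come from the article.
print(priv_fields("SHKI5877 USER=ABC123 PRIV-USER=YES"))
# {'PRIV-USER': 'YES', 'PRIV-RESOURCE': 'NO', 'PRIV-OPERATION': 'NO'}
```

After this step, every event carries all three fields, so dashboard and alert queries can filter on a fixed set of values regardless of which tagging style the mainframe uses.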
The BCPii hypervisor control function is a pretty new z/OS feature. It allows you to control the mainframe on its HMC console level, meaning that you access the hypervisor directly from z/OS. Therefore, BCPii authority stands for owning “real power.” Without doubt, BCPii enables great functionality in the field of smart system automation, such as dynamic capacity management, and lets you avoid personal visits to the cold and windy server rooms. On the other hand, professionally acting hackers also receive new levels of power by now potentially controlling your entire mainframe platform. How is this possible, since BCPii is a protected feature? The answer is easily given. Professional attacks always also imply the deactivation of your security system during use, meaning that of RACF, CA-TSS or CA-ACF2. This means that any perfectly configured BCPii protection simply becomes void for the hacker during such a professional attack. In case BCPii is enabled, hackers are much more easily able to perform cross-LPAR attacks, and to reach and touch your production systems, for example if the actual hack happens on a test LPAR sharing the same physical machine with production. One detail has to be mentioned here, namely that such attacks will not focus on direct data access but on the LPARs’ overall operational availability or behavior.
In case of a critical mainframe infrastructure, the golden rule is “no BCPii without strong intrusion detection (IDS),” otherwise you – irresponsibly – accept a significant operational risk. The requirement “strong” means that it’s an intrusion detection system that is not just evaluating SMF and syslog records or similar regular sources, but is also able to detect dynamic manipulations, the bypassing of the security system, and other fancy tricks in real-time. We apologize for this clear and frank statement and opinion, and also for pointing to SF-Sherlock within this context. But its unique IDS component focuses precisely on these most dangerous and tricky attack methods in order to secure your mainframe platform against professional attacks. If you are curious about what such an attack looks like, the best way is to invite us to carry out a penetration test.
Provide complete and comprehensive audit and log data in real-time to your security monitoring solution in use (ArcSight, Splunk, QRadar, SF-RiskSaver, etc.)
Successful cross-platform monitoring requires comprehensive and complete data in real-time. The connection kits available for SF-Sherlock and SF-NoEvasion allow an easy integration of the z platform in your overall security monitoring, regardless of the solution in use. Both are based on the “plug & play” principles, and provide the ultimate and required strength to your detection and correlation procedures in order to successfully combat any attack or fraud.
Using FTP for file transfers between clients and the z/OS server offers a wide range of capabilities. Each FTP user who requests services from the z/OS FTP server requires prior authentication with a valid z/OS user ID and password – without enabling “anonymous FTP”. This user also needs a (UNIX) UID from the user’s OMVS segment if it is not provided by a default UID definition. After authentication, a wide spectrum of operations is then possible depending on the FTP server’s configuration and the authority granted to the user in the security system. You should know that z/OS FTP is not limited to simple file transfers. For example, you may also submit jobs and receive spool output within your FTP session by changing the “file type” accordingly. If enabled, you may even send SQL statements to your DB2 subsystem. Of course, the permission defined by your security system still represents the baseline for all available resources, functions, and operations.
It becomes immediately obvious that any weaknesses inside the security system’s definitions may immediately expose your z/OS system to high risk. Carefully consider a given user ID. It makes a difference whether it is used in a batch job, started in a task-related context, or used inside an FTP session coming from the outside. In a scenario based on the misuse of a highly privileged user ID via FTP-based remote access from “far away”, you probably do not want to grant all authority to this user ID, but restrict possible operations to “task-based” performance only. For example, you will not allow a JCL submission to all users, but only when it is specifically required. How can you solve this problem without disabling FTP completely?
SHER-BLOCK, the blocking component of SF-Sherlock, lets you establish smart command verification for FTP. Aside from supporting FTP, RACF, and console commands, it also monitors critical interfaces that allow hidden operations. Its highly flexible concept, which also supports whitelists referring to RACF definitions, lets you precisely control and log critical activities by setting up corresponding rules. Here are some sample blocking definitions provided in the standard plug & play rule set: a) prevent JCL submission in FTP, b) prevent disablement of RACF command logging via the “SETROPTS NOSAUDIT” command, c) prevent the “Z EOD” system command from being issued in any TSO SDSF session, and much more. If your company’s IT requires SOX compliance, then SHER-BLOCK will let you finally set up fully-automated controls around FTP.
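To make the idea of command verification with a whitelist more concrete, here is a minimal Python sketch of the three sample rules. It is explicitly not SHER-BLOCK syntax or behavior: the rule table, the interface labels, the whitelist, and the user ID “BATCHFTP” are all hypothetical, and the patterns only cover the most obvious spelling of each command (for FTP, switching the file type with “SITE FILETYPE=JES”).

```python
import re

# Hypothetical rule table: (interface, pattern, description)
BLOCK_RULES = [
    ("ftp", re.compile(r"^SITE\s+(?:.*\s)?FILETYPE=JES\b", re.IGNORECASE),
     "JCL submission via FTP"),
    ("racf", re.compile(r"^SETR(?:OPTS)?\s+(?:.*\s)?NOSAUDIT\b", re.IGNORECASE),
     "disabling RACF command logging"),
    ("sdsf", re.compile(r"^/?Z\s+EOD\b", re.IGNORECASE),
     "Z EOD from an SDSF session"),
]

# Hypothetical whitelist: rule description -> user IDs that may still use it
WHITELIST = {"JCL submission via FTP": {"BATCHFTP"}}

def check_command(interface, user, command):
    """Return (allowed, reason) for a command issued on the given interface."""
    text = command.strip()
    for rule_interface, pattern, description in BLOCK_RULES:
        if rule_interface == interface and pattern.search(text):
            if user in WHITELIST.get(description, set()):
                return True, f"allowed by whitelist: {description}"
            return False, f"blocked: {description}"
    return True, "no rule matched"

print(check_command("ftp", "USER1", "SITE FILETYPE=JES"))
# (False, 'blocked: JCL submission via FTP')
print(check_command("ftp", "BATCHFTP", "site filetype=jes"))
# (True, 'allowed by whitelist: JCL submission via FTP')
```

The whitelist expresses the “only when it is specifically required” idea: JCL submission stays blocked for everyone except the user IDs that demonstrably need it.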
You ask yourself: What do IT security & compliance have in common with the subprime lending dilemma? Triple A ratings from your solution provider may rely on a false trust in vendor size, brand name and bona fides. Read the news. The top subprime lenders are world-class banks. It does not make a difference whether you compare the integrity between rating agencies and rated financial institutions or between operating system vendors and their security & compliance solutions. If there is no integrity in the separation of powers, and conflicts are ignored in the name of easy access, then any guarantee against a subprime-like debacle becomes worthless. In this respect, however, security & compliance software differs completely from other software categories. The separation of powers, or simply, “competition”, is essential in achieving true AAAA+ levels in IT security & compliance! This is what we call the “Max Approach”.
This statement may seem somewhat abstract, but integrity, trust, quality, IT security & compliance, auditing, etc., all imply a sound ethical foundation. All related production is based on abstract ideals and goals. So, where else should motivation come from than Socrates & Co.?
The Payment Card Industry (PCI) Data Security Standard and ISO 27001/2 represent important security standards for financial service institutions. This article keeps you updated on current topics surrounding PCI and ISO 27001/2 compliance of the mainframe platform. Latest best practice experience has shown potential mainframe-related difficulties in important areas that not only your auditor should know about:
|Standard|Section|Requirement|
|---|---|---|
|PCI DSS|10.5|Secure audit trails so they cannot be altered|
|PCI DSS|2.2.3|Configure system security parameters to prevent misuse|
|PCI DSS|6.2|Establish a process to identify newly discovered security vulnerabilities …|
|PCI DSS|6.5.2|Broken access control, etc.|
|ISO 27001/2|A10.10.3|Protection of log information|
|ISO 27001/2|A11.5.4|Use of system utilities|
|ISO 27001/2|A12.6.1|Control of technical vulnerabilities, etc.|
Furthermore, ISO sections A13.2.1, A13.1.1, and PCI sections 11.4, 11.5, 12.9, all stipulate the mandatory implementation of real-time monitoring for compliance.
The mainframe’s failure to fully comply with these requirements also stems from its inability to both detect and combat dynamic manipulation and malicious code processing regarding
- authentication (user ID switching, authorization theft, etc.),
- logging (manipulation or suppression of audit information, which means breaking the audit trail),
- resource access (manipulation of resource access procedures).
Best practice experience has proven that full compliance becomes impossible by solely relying on the capabilities of the standard operating and security systems, even when implementing some real-time SMF-based auditing (for technical details see below).
We are partners with the world’s largest companies and institutions in successfully achieving and maintaining extremely secure environments. Our unique real-time security monitoring technology, which we call SF-Sherlock, goes far beyond standard monitoring. It will “plug into” your mainframe environment and successfully detect and combat any attack. You become compliant easily.
Is single-vendor sourcing of all security software really the best strategy? Read why this is only ostensibly so
At first glance, it might sound odd, but key functions of security software performance target the “invisible workings of your IT” that require transparency. It is the job of security monitoring to locate hidden security gaps and weaknesses, even those unknown to their developers, as well as non-compliant operations and fraud in an apparently flawless and safe environment.
This key feature makes security software unique. Procurement-related decisions likewise require slightly different evaluations and strategies. Aside from quality-related concerns and technological depth, the vendor’s independence is paramount. Only independent vendors of security monitoring solutions are free to publish, discuss, target and ruthlessly remove all weaknesses of your IT platform, solution or product. Independence is both the requirement and the guarantee that no weaknesses are ignored or bypassed because of taboos internal to the vendor’s company, such as not tampering with the image or sales projections of any product.
Is this single argument really strong enough to supersede company policies like minimizing the number of software vendors? Isn’t it critical for independent security software vendors to remain compatible and up-to-date concerning the monitored platform, products, or solutions? Doesn’t this aspect demand a single-source strategy? Well, this argument is often used, and it sounds plausible that only the developer of the operating system, solution or platform is aware of the latest features and improvements for future implementation. But the reality of the professional software market proves otherwise, especially regarding operating systems – not just open source. Today, professional software vendors receive all the required information “just in time” to be capable of adapting their software products. General and specific legal regulations and agreements guarantee the expeditious release of the required current software to the customer.
We can conclude that missing competition in the area of security monitoring automatically results in a lower level of security at the user’s site. As with all monopolies, the result is low performance. The stimulating factor of competition is the precondition and guarantor for your success and sovereign control in IT.
Will the auditor really accept my company’s technical solutions to achieve compliance? Or does the auditor come with concrete requirements such as product and vendor lists?
“IT Baseline Protection for the Z Platform (Mainframe)” of the German Federal Office for Information Security (BSI)
Set your company’s rating, such as for Basel II, on a secure foundation by going far beyond the requirements of the U.S. Department of Defense (DOD)
The “OS/390 and z/OS Security Technical Implementation Guide” of the US Department of Defense (DOD) provides only a basic approach for a secure implementation of z/OS. The German Federal Office for Information Security (BSI) is far more comprehensive.
Since 2004 the German Federal Office for Information Security (BSI) has been focusing on necessary security measures for the mainframe platform in its central security guide, the “IT Baseline Protection Manual” (www.bsi.bund.de). Section 6.10 focuses on the z platform and describes the risks and related basic protection measures for a secure z platform.
The German security guide describes today’s demand for using real-time monitoring technology for securing the systems against manipulation and the exploitation of possible z/OS-specific weaknesses. According to the IT Baseline Protection Manual, “…such detection measures are practically indispensable if the greater damage is to be expected. …” and the necessary “… use of a real-time security monitor for z/OS systems in determining security infringements faster.…” (both passages are taken from the “Basic IT Security Protection Manual”, 2004 Edition, Section “M6.67 Use of detection measures for security incidents”). Real-time monitoring of only a single isolated security aspect, such as SMF records, is insufficient. Monitoring the entire z/OS with all its components and complex relations and details is crucial.
The strict requirements of the BSI demonstrate the high relevance of mainframe security and emphasize the need for additional protection against today’s risks. The z platform has thus become a “conventional” server platform with “conventional” risks, i.e. “less SNA, more TCP/IP” or “not only MVS, but also UNIX System Services”. This means that companies with an increasing demand for security and quality require further technical measures for their z platform not supplied by the standard security system. At this point, it is important to note that a real-time security monitor is not a standard component of the z/OS. BSI’s conclusion? Standard z security is not sufficient for companies with an increased need for security.
What types of companies have an increased need for z security and quality?
When you take into account the high investment and operational costs of mainframes, you realize that all mainframe users are affected, especially banks and insurance companies, commercial IT providers and health insurance companies, among others.
What motivation, or rather what pressure, is there to act now and implement such increased security measures? Enormous pressure, arising from new legislation and regulations such as Basel II, IT Baseline Protection Manual (German Federal Office for Information Security), KonTraG, SOX, U.S. DOD Regulations, etc. Such pressure can even result in considerable “trouble” for companies and their management if damage has already been incurred and exonerating evidence is missing.
Nevertheless, even before something happens, insufficient security can prove expensive in the long run. This point has been reinforced by legislation, such as Basel II and SOX, among others. The key factor is the so-called “rating”, which acts as a consolidated measure for legislative compliance, professionalism, and stability for safeguarding companies and their business processes. As a result, IT becomes burdened with this added responsibility. On the one hand, business processes are based and dependent on IT. Given this relationship, IT becomes a source of risk. On the other hand, IT lets you minimize risks, e.g. through concepts that control, monitor, identify and eventually combat risks. IT security is therefore of central significance for rating agencies and governing bodies. One thing is clear: these organizations possess highly specialized knowledge of all platforms, including those belonging to the “good old” mainframes.
Rating agencies express their assessment of a company in a so-called “rating”. A poor rating can, for example, increase the cost of credit. After all, a few per mille of interest can add up to a large sum of money. The idea is that higher risk must correspond to an “extra charge”. In short, a company with a bad rating will have higher credit costs, much like a higher insurance premium. Conclusion: Insufficient security costs more money in the long run.
Don’t you, as someone working in or for the financial IT service sector, feel affected by this? You may ask yourself: how does a bank get rated? Isn’t it only banks that “rate” their customers, and not vice versa? That impression is common, but remember that all companies are subject to a rating. After all, banks are debtors too. Besides the legislation, there are several governing authorities that continuously rate banks. Some are state-controlled, such as the German Federal Financial Supervisory Authority (BaFin).
What can and should be done to protect the z platform against malicious code and exploits? It is necessary to set up pro-active and comprehensive measures to eliminate and control risks.
Isolated technical measures alone, even when operating in real time, are not enough. Commercially offered systems for evaluating SMF records in real time, such as RACF SMF records, are a good example. The objective is good, but this is simply not enough, since the whole system must be comprehensively and systematically protected against manipulation. A simple scenario supports this argument. If a professional attacker manipulates his or her authorization and breaks the audit trail, he or she can easily suppress any SMF records. This takes a mere “bit” of effort. The old law of physics also applies here: “Nothing comes from nothing”, that is, no cause, no effect. Missing SMF records cannot trigger detection or notification, even in a real-time live evaluation, and cannot be used for an automatic reaction.
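The blind spot described above can be illustrated with a minimal Python sketch. It is not how SF-Sherlock or any SMF product works; the record fields, the detection rule and the `logging_suppressed` switch are hypothetical and only model the idea that a purely record-driven detector sees nothing once the records themselves are suppressed.

```python
# Toy model: a detector that reacts only to audit records it receives.
# Field names and values are invented for illustration; they do not
# reflect the real SMF record layout.

def detect(records):
    """Event-driven rule: alert on every denied access record that arrives."""
    return [r for r in records
            if r["type"] == "ACCESS" and r["result"] == "DENIED"]


def record_stream(activity, logging_suppressed):
    """Return the records that actually reach the monitor."""
    # If an attacker has switched off record creation, nothing arrives.
    return [] if logging_suppressed else list(activity)


activity = [
    {"type": "LOGON", "user": "USER1", "result": "OK"},
    {"type": "ACCESS", "user": "USER1", "resource": "HR.PAYROLL",
     "result": "DENIED"},
]

alerts_normal = detect(record_stream(activity, logging_suppressed=False))
alerts_suppressed = detect(record_stream(activity, logging_suppressed=True))
```

The same suspicious activity produces one alert in the first case and none in the second, although nothing about the attack itself has changed.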
In this context, some hard questions regarding “malicious code” arise:
• What exactly are all the APF-authorized modules, which come from so many different software vendors, doing?
• To what extent can these programs be misused for other purposes? Can they be misused for such actions as suppressing and deactivating SMF logging by dynamic (memory) manipulation?
• Which undocumented, security-critical functions can specialists uncover in the program code of modules when using corresponding analysis tools?
• To what extent can the development departments create and use authorized modules?
The questions are endless. Nevertheless, one thing is certain: no z/OS user can clearly answer them, even when the software is installed properly with SMP/E. Conclusion: All processes running in the system must be monitored in real time for “improper behavior”.
Another important legal aspect concerns the requirements and conditions for proper operations, transparency and completeness, especially in the areas of bookkeeping and financial data processing. In short, you require an invulnerable audit trail for the purpose of audit data completeness and authenticity. This is your primary concern. In addition, new concerns involving risk precautions and prevention require that this complete and correct audit data is not merely archived, but analysed immediately and properly as well.
What solution do we suggest?
As developers of SF-Sherlock, whose comprehensive z/OS real-time monitoring technology is unique to the market, we are glad to propose a proven way of implementing z/OS security and quality automation. The SF-Sherlock solution not only entirely complies with basic legal requirements and recommendations, but also accomplishes much more to keep your company on the right track. Apart from security concerns, SF-Sherlock also supports the goal of constant availability, for example, with its IPL simulation, parmlib syntax and semantic checking and many more quality checks. We supply connection kits that further enable SF-Sherlock’s integration into comprehensive cross-platform solutions, such as those of Symantec, Tivoli, CA, etc.
By using SF-Sherlock, you can eliminate the technical risks and achieve the automatic control and monitoring required by legislation. Both improve your rating significantly and thus lower costs. In this way, SF-Sherlock adds value to your company.
The concepts of automatic and complete monitoring, as well as the plug&play implementation, let you reach this goal with minimal time and cost. Our effective installation and implementation concept will convince you.
Leading banks, insurance companies, industrial companies and IT service providers have been successfully securing their mainframe platform for years with SF-Sherlock. Our company, our technology, our services, and our “value added”-based pricing have the references to prove that improving security and reducing costs are no longer contradictory.
The topic of internal attacks is an extremely sensitive one. Both determining the risks from bad colleagues and employees and communicating this to them is a rather undesirable task and also legally difficult. No wonder the term “intrusion detection” has developed such a biased connotation in recent years. Intrusion detection systems (IDS) have been “reduced” to focusing on network-based and external attacks.
However, practice has shown that the main danger really does lie in internal attacks. Insider knowledge considerably reduces the effort and hurdles required for a successful “attack”. Without a doubt, it is considerably more difficult for an outsider to penetrate the system and then find and reach the data sources than it is for an insider, who can easily transport the data to the outside from its known location.
The relatively new term “extrusion detection” expands on this reduced “IDS” definition. The purpose of an “extrusion detection system” (EDS) is to keep track of procedures and events within the company in order to combat internal attacks. The close relationship to auditing and internal audit is obvious.
Automatically executed technical reactions are especially associated with “prevention”. No doubt actual prevention takes place in the awareness of the staff. In general, there is considerable wariness of technical prevention and automatic reactions to incidents. This stems from the “false positive” problem: erroneous decisions could endanger production processes and their availability.
From the very beginning, SF-Sherlock’s “logical trap” concept was linked to the detection aspects of both intrusion and extrusion, combined with an optional reaction. Altogether, system attacks are detected as well as the “escape” of important data to the “outside”, e.g. by way of FTP.
The SF-Sherlock technology for security and quality automation is
• both an intrusion and extrusion detection and prevention system,
• host-based, but network activities are also monitored, such as Firewall and TCP/IP,
• z/OS-specific, while also monitoring the operating system, applications and subsystems (e.g. DB2, LDAP, etc.),
• able to evaluate SMF or any log data,
• equipped with necessary pre-defined attack patterns (“logical traps”) and
• extensible with installation-defined attack patterns and installation-specific monitoring characteristics.
This comprehensive approach to our security technology guarantees a maximum return on investment and maximum security.
Does your mainframe installation use z/Linux, z/VSE and/or z/VM in addition to z/OS, and do you need to integrate these platforms into cross-platform real-time monitoring? SF-Sherlock provides the following powerful options:
• z/Linux without z/VM+RACF-based authentication: The syslog of all z/Linux systems will be automatically forwarded to SF-Sherlock, where it passes through an intensive security scan.
• z/Linux combined with authentication via z/VM+RACF: Aside from the automated syslog scan, all SMF records resulting from z/VM’s RACF will also become part of the SF-Sherlock-based security monitoring, for example to immediately detect any suspicious behavior in the mass of z/Linux logins.
• z/VSE: The Basic Security Manager (BSM) is a standard component of z/VSE’s kernel, and provides basic security capabilities. With version 4.1 this includes the creation of RACF-compatible SMF records. Simply forward these records to SF-Sherlock for analysis, and your z/VSE is covered by its strong monitoring.
• z/VM: RACF on z/VM creates regular RACF SMF records that will be forwarded to SF-Sherlock for setting up a correspondingly high surveillance level on z/VM.
Employing the SF technologies provides the easiest way to fully secure the “entire z”.
Buffer Overflow and Format String Attacks on the Mainframe
Technical principles and possible countermeasures by “encapsulating an application”
One generally associates the terms “buffer overflow (attack)” and “format string attack” with a kind of primary risk. Both represent the threat of an outside attack on applications with an (IP) interface. Applications can be vulnerable and exploitable when “sophisticated” input is cleverly passed on to them, such as strings that are too long, strings with embedded control characters, program code, or the like. Both types of attack aim to overwrite memory areas through cleverly crafted parameter values. Usually, the attacker’s goal is to overwrite the memory with executable machine code to reach high authorization levels within the targeted application. A possible scenario is the misuse of the web server and its high authorization through attacks on a web application.
The technical basis for these attacks is the use of a so-called “stack” as both temporary and transfer memory by the runtime environment of an application. This program stack is not to be mistaken for the TCP/IP stack. In general, the stack is used to store parameters and save the so-called return address in the context of a subprogram call. Before program X calls subroutine Y and branches into it, the arguments required by routine Y will be “pushed” onto the stack, from which routine Y then “pops” them. Correspondingly, by calling Y from X, the return address – from Y back to X – will also be pushed onto the stack. The nesting of routine calls then lets the stack grow and shrink during runtime – it “swings”. There is one thing to remember. While the Intel stack grows in the direction of smaller addresses (from top to bottom), the z stack, as explained below, grows in the direction of bigger addresses.
The stack can be implemented either at the processor level, through its design and corresponding instructions, or through software emulation. For instance, the stack concept of Intel’s processors is an elementary part of the design and is supported on the level of machine code through corresponding instructions (PUSH, POP, etc.). A specific register, known as the stack register, identifies the stack and allows its addressing.
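The push/pop mechanics described above can be sketched as a toy model in Python. It assumes a stack that grows toward higher addresses, as the text states for the z platform; the class name, the slot-based memory and the symbolic return address "X+0x40" are purely illustrative and do not correspond to any real runtime environment.

```python
# Toy model of a software-emulated call stack growing toward higher
# addresses. All names and sizes are invented for illustration.

class EmulatedStack:
    def __init__(self, size):
        self.memory = [None] * size  # one linear memory area
        self.sp = 0                  # stack pointer: index of next free slot

    def push(self, value):
        if self.sp >= len(self.memory):
            raise OverflowError("emulated stack exhausted")
        self.memory[self.sp] = value
        self.sp += 1                 # stack grows toward bigger addresses

    def pop(self):
        if self.sp == 0:
            raise IndexError("emulated stack is empty")
        self.sp -= 1
        value = self.memory[self.sp]
        self.memory[self.sp] = None
        return value


def call_y_from_x(stack, return_address, args):
    """X pushes Y's arguments and the return address; Y pops them again."""
    for arg in args:
        stack.push(arg)
    stack.push(return_address)
    depth_inside_y = stack.sp        # the stack has grown while Y is active

    # Routine Y: the return address lies on top, the arguments below it.
    ret = stack.pop()
    popped = [stack.pop() for _ in args]
    popped.reverse()                 # restore the original argument order
    return ret, popped, depth_inside_y


stack = EmulatedStack(16)
ret, popped, depth = call_y_from_x(stack, "X+0x40", [7, 35])
```

After the call returns, the stack pointer is back at its starting position, which is the “swinging” behavior the text refers to.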
On the Z platform, the stack is emulated purely in software. The memory of the z architecture is a large linear memory with 31 or 64 bit addressing, whereas the Intel memory is segmented. A C language program, for instance, running on the z platform has a corresponding memory area within this 31 or 64 bit address space, where it places the emulated stack during runtime. A stack pointer, which identifies the current top address of the stack, moves up and down during program execution and is completely analogous to the special stack register of the Intel processor. From the perspective of the actual effect and function for a running program, there is no real difference between both processor worlds. Altogether the “z stack” is emulated through the runtime environment of the compiler. This reveals the potential for additional protection, in which the runtime environment carries out extensive plausibility checks. It is important to note that program code residing in the z stack can be executed, since there is only “the one” memory. This is a marked difference from recent developments in the Intel world: to protect programs against stack-based attacks, the new Intel processors can prevent the execution of program code residing in the stack (memory). This measure renders an entire category of attacks impossible, namely those transferring executable program code as a command argument.
Let us turn now to the principles of both attacks. Both are based on the fact that programs expect parameter data of fixed length or that parameter data do not exceed a specific number of bytes. These programs do not check whether the incoming arguments are too long or too numerous, or whether they contain special format characters (e.g. “%n”) that the runtime system (e.g. the print function) interprets in a special way, causing a reference beyond the intended memory area. This is the point of attack: deliberately passing “unsuitable” or “special” arguments causes data to be stored at specific memory locations. This can lead either to an unnoticed stack (memory) overflow of the running program, which can cause the application to crash, or to an attempt to take over program control. In the latter case, the goal is to overwrite the memory with program code (machine instructions, such as an SVC instruction) and/or manipulate the return address so that execution continues at code chosen by the attacker.
Important conclusion: In general, the primary deficiencies behind buffer overflow and format string attacks lie in the application or routine itself, which does not consistently check received input for plausibility and conformity. The culprit is less likely to be the processor and/or the operating system. Strong attention should be paid to code inspections, which can even be partially automated, particularly during software development. With commercial or open source software, the user has minimal influence and the software must be applied “as is”.
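The buffer overflow half of this principle can be made concrete with a small Python simulation. It is a sketch under stated assumptions: the frame layout (an 8-byte input buffer immediately followed by a 4-byte return address in one flat memory area), the function names and the addresses are invented for illustration, and no real machine code or real runtime is involved. Format string attacks are not modeled here.

```python
# Simulated stack frame: 8-byte input buffer followed directly by a
# 4-byte return address. Layout and sizes are assumptions for this sketch.
BUF_SIZE = 8
RET_SIZE = 4


def new_frame(return_address):
    frame = bytearray(BUF_SIZE + RET_SIZE)
    frame[BUF_SIZE:] = return_address.to_bytes(RET_SIZE, "big")
    return frame


def read_return_address(frame):
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + RET_SIZE], "big")


def unchecked_copy(frame, data):
    """Vulnerable routine: copies the input without checking its length."""
    for offset, byte in enumerate(data):
        frame[offset] = byte         # bytes past BUF_SIZE hit the return address


def checked_copy(frame, data):
    """Hardened routine: rejects input that does not fit into the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    frame[0:len(data)] = data


# Attack: 8 filler bytes, then 4 bytes that replace the return address.
payload = b"A" * BUF_SIZE + (0xDEADBEEF).to_bytes(RET_SIZE, "big")

frame = new_frame(0x00001000)
unchecked_copy(frame, payload)
hijacked = read_return_address(frame)

safe_frame = new_frame(0x00001000)
try:
    checked_copy(safe_frame, payload)
    rejected = False
except ValueError:
    rejected = True
```

With the unchecked routine the stored return address becomes the attacker's value; with the length check the input is refused and the return address stays intact, which is the plausibility check the conclusion below calls for.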
Which possibilities for further protection of applications, such as web applications, exist on the z/OS platform in addition to the standard measures for securing applications? Aside from the real-time monitoring of applications with SF-Sherlock, there is the possibility of “encapsulating” the application with SF-Sherlock’s “logical trap” concept. Since there is only partial or no influence on an existing application, the application’s external behavior and character must be categorized and shielded. By describing or recording the “normal behavior” of the application, irregularities can be revealed quickly and securely; for instance, when a web server supplying services to customers suddenly accesses the payroll information and no longer only the product information. This is a real indication of a possible attack. The encapsulation of applications has been applied successfully in SF-Sherlock practice. One important point to consider is that the purpose of each attack is to reach higher authorizations for performing subsequent accesses and operations. Buffer overflow and format string attacks are only two of many possible methods of achieving this.
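The idea of recording “normal behavior” and flagging deviations can be sketched in a few lines of Python. This is not SF-Sherlock’s logical trap implementation, whose internals are not described here; the class, the application name WEBSRV and the resource names are hypothetical and serve only to illustrate the principle.

```python
# Minimal behavior profile: learn which resources an application normally
# accesses, then flag every access outside that profile.
# Application and resource names are invented for illustration.
from collections import defaultdict


class BehaviorProfile:
    def __init__(self):
        self.allowed = defaultdict(set)

    def learn(self, application, resource):
        """Record an access observed during normal operation."""
        self.allowed[application].add(resource)

    def check(self, application, resource):
        """Return True if the access matches the recorded behavior."""
        return resource in self.allowed.get(application, set())


def monitor(profile, events):
    """Return all (application, resource) events that deviate from the profile."""
    return [(app, res) for app, res in events if not profile.check(app, res)]


profile = BehaviorProfile()
profile.learn("WEBSRV", "PROD.CATALOG")
profile.learn("WEBSRV", "PROD.PRICES")

alerts = monitor(profile, [
    ("WEBSRV", "PROD.CATALOG"),   # normal: product information
    ("WEBSRV", "HR.PAYROLL"),     # irregular: payroll information
])
```

The check is independent of how the attacker gained control of the application, which matches the point that the attack method matters less than the irregular accesses that follow it.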
SF-Sherlock’s real-time monitoring and logical trap concept lets you achieve a higher and more solid protection of your critical applications.