Security Requirements
Security from the enterprise perspective is essentially an exercise in risk management, whether those risks arise from accidental events or intentionally malevolent ones. Examples of accidents include fire, hardware failure, software bugs, and data entry errors, while malevolence manifests itself in fraud, denial of service, theft, and sabotage. In either case, the end result to the enterprise is lost business, which translates to lost money. However, the legal means of recourse can be quite different depending on whether a realized risk was accidental or intentional. In both cases, data and applications may be deleted or modified, and physical systems may be damaged or destroyed. For example, Web sites with inadequate access controls on the source files displayed as Web pages are often defaced, with their contents modified for fun or profit. Computer systems that are stolen obviously have to be replaced, but the amount of reconfiguration required can be significant, as can the underlying loss of data.
Given that so many "mission critical" systems, such as those in banking and government, are now completely electronic, how would end users cope with the complete loss of their records? Worse still, personal data may be exposed to the world—embarrassing for the subscribers of an "adult" site, very dangerous for participants of a witness protection program.
Managing risk obviously involves a human element that is beyond the scope of this book—social-engineering attacks are widespread, and few technological solutions (biometrics being a notable exception) can ameliorate such problems. However, failures attributable to human factors can be greatly reduced with the kinds of automation and logical evaluation provided by computation. For example, an automated identification system that recognizes individuals entering an office suite, classifying them as employees or visitors, would reduce the requirement for local staff to "know" every face in the organization. Because employees may object to the storage of their photographs on a work computer, you may want to consider using a one-way hash—instead of storing a photograph, a one-way hash can be computed from the image data and stored, while the original image is discarded. When an employee enters the building, a hash is computed from the newly captured image, and if it matches one of the hashes stored in the database, the employee can be positively identified. In this way, organizational security can be greatly enhanced without compromising the privacy of individual workers. Security and privacy need not be at odds: well-designed technology can often provide solutions that ensure both.
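A minimal sketch of this hash-instead-of-image scheme in Python, using SHA-256 from the standard hashlib module. The database, employee ID, and image bytes here are hypothetical; note also that a cryptographic hash matches only byte-identical data, so a production biometric system would compare extracted feature templates rather than raw image hashes.

```python
import hashlib

def store_employee_hash(image_bytes, db, employee_id):
    """Store only a one-way hash of the image; the image itself is discarded."""
    db[hashlib.sha256(image_bytes).hexdigest()] = employee_id

def identify(image_bytes, db):
    """Return the employee ID if the captured image matches a stored hash."""
    return db.get(hashlib.sha256(image_bytes).hexdigest())

db = {}
store_employee_hash(b"...image data...", db, "emp-042")
print(identify(b"...image data...", db))   # matching image -> emp-042
print(identify(b"...other image...", db))  # unknown image -> None
```

The privacy property follows from the one-way nature of the hash: the stored digests cannot be inverted to recover the original photographs.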
Security Architecture
Solaris security includes the need to protect individual files, as well as entire systems, from unauthorized access, especially using remote-access tools. However, these individual actions need to be placed within a context that logically covers all aspects of security, typically known as levels. A level is an extra barrier that must be breached to obtain access to data.
In terms of physical security, a bank provides an excellent analogy. Breaking into a bank's front counter and teller area is as easy as walking through the door, because these doors are publicly accessible. However, providing this level of access sometimes opens doors deeper inside the building. For example, the private banking area, which may normally be accessed only by staff and identified private banking customers, may allow access using a smart card. If a smart card is stolen from a staff member, it could be used to enter the secure area, because the staff member's credentials would be authenticated. Entering this level would not necessarily provide access to the vault: superuser privileges would be required. However, a thorough physical search of the private banking area might yield the key required for entry, or a brute-force attack on the safe's combination might be used to guess the correct combination. Having accessed the vault, if readily negotiated currency or bullion is contained therein, an intruder could easily steal it. However, if the vault contains checks that need to be countersigned, the intruder may not be able to make use of the contents. The lesson here is simple: banks provide public services that open up pathways straight to the cash. Banks know that any or all of the physical security layers may be breached. That's why the storage of negotiable securities is always minimized, because any system designed by humans can be broken by humans, with enough time and patience. The only sensible strategy is to make sure that external layers are as difficult to breach as possible and to ensure that security experts are immediately notified of breaches.
Similarly, public file areas, such as FTP and Web servers, are publicly accessible areas on computer systems that sometimes provide entry to a different level in the system. An easily guessed or stolen password may provide user-level (but unprivileged) access to the system. A brute-force attack against the local password database might even yield the superuser password. A local database might then contain the target records of interest. However, instead of storing the data as plaintext within tables, the data may have been written using a stream cipher, making it potentially very difficult to recover. Even so, because 40-bit ciphers have been broken in the past, obtaining the encrypted data might eventually lead to its disclosure. Again, the key strategy is to ensure that data is secured by as many external layers as possible, and that the data itself is difficult to use even if obtained.
Increasing the number of levels of security typically leads to a decrease in system ease-of-use. For example, setting a password for accessing a printer requires users to remember and enter a password when challenged. Whether printer access needs this level of security will depend on organizational requirements. For a printer that prints on plain paper, no password may be needed. However, for a printer that prints on bonded paper with an official company letterhead, a password should be used to protect the printer and, optionally, a copy of the file being sent to the printer may need to be stored securely, for auditing purposes.
For government and military systems, a number of security specifications and policy documents are available that detail the steps necessary to secure Solaris systems in "top secret" installations. The U.S. Department of Defense, for example, publishes the "Orange Book," formally known as the "Department of Defense Trusted Computer System Evaluation Criteria." This publication describes systems that it has evaluated in terms of different protection levels, from weakest to strongest, including the following:
- Class D Systems that do not pass any tests and are therefore untrusted. No sensitive data should be stored on Class D systems.
- Class C1 Systems that require authentication based on a user model.
- Class C2 Systems that provide auditing and logging on a per-user basis, ensuring that file accesses and related operations can always be traced to the initiating user.
- Class B1 Systems that require security labeling for all files. Labels range from "top secret" to "unclassified."
- Class B3 Systems that require a standalone request monitor to authenticate all requests for file and resource access. In addition, the request monitor must be secured and all of its operations must be logged.
- Class A1 Systems that are formally tested and verified installations of a Class B3 system.
All of the strategies that are discussed in this chapter are focused on increasing the number of layers through which a potential cracker (or disgruntled staff member) must pass to obtain the data that they are illegally trying to access. Reducing the threat of remote-access exploits and protecting data are key components of this strategy.
Trusted Solaris
Trusted Solaris implements much stricter controls over UNIX than the standard releases, and it is capable of meeting B1-level security by default. It is designed for organizations that handle military-grade or commercially sensitive data. In addition to the mandatory use of Role-Based Access Control (as reviewed in Chapter 11), Trusted Solaris actually has no superuser at all: no single user is permitted to have control over every aspect of system service. This decentralization of authority is necessary in situations where consensus and/or authorization is required to carry out specific activities. For example, a system administrator installing a new Web server might inadvertently interfere with the operations of an existing service. For a server that's handling sensitive production data, the results could be catastrophic.
Once a system has been installed in production, it's crucial to define a set of roles that specifies what operations need to be performed by a particular individual. For example, the role of managing a firewall is unrelated to the database administration role, so the two roles should be separated rather than run from a single superuser account. In addition, access to files is restricted by special access control lists, which define file contents as being anything from "unclassified" up to "top secret." Access to data that is labeled as more secret requires a higher level of authentication and authorization than does access to unclassified data.
Four roles are defined by default under Trusted Solaris for system management purposes:
- Security officer Manages all aspects of security on the system, such as auditing, logging, and password management
- System manager Performs all system management tasks that are not related to security, except for installing new software
- Root account Used for installing new software
New roles can be created for other tasks, such as database and Web server administration, where necessary. Some aspects of a Trusted Solaris installation already form part of a standard Solaris installation. For example, Trusted Solaris requires that centralized authentication be performed across an encrypted channel using NIS+. This feature is also available on Solaris, although many sites are now moving to LDAP-based authentication.
Trust
When two or more parties communicate with each other, they must share some level of trust. This trust level determines the extent to which the other requirements discussed in this section apply. For example, if your "trust domain" includes your lawyer's network, then the level of authorization required to access resources inside your network would be lower than if an untrusted entity made a request for access. Similarly, if a principal is authenticated from a trusted domain, it does not need to be separately authenticated for every subsequent resource request.
The extent to which parties trust each other underlies the different types of security architectures that can be implemented. If a dedicated ISDN line connects two business partners, then they may well trust their communications to be more secure than if connections were being made across the public Internet. Again, it's a question of assessing and managing risks systematically before implementing a specific architecture.
Part of the excitement concerning the "Trusted Computing Platform" developed by Microsoft and others is that the trust equation between clients and servers becomes more symmetric. Currently, clients trust servers to process and store their data correctly—so, when you transfer $100 from a savings to a checking account using Internet banking, you trust that the bank will perform the operation. Indeed, banks run sophisticated messaging and reliable-delivery systems to ensure that such transactions are never lost. However, when server data is downloaded to a client, the client is essentially free to do what it wants with that data, and the server is powerless to control what happens next. For example, imagine that a user pays to download an MP3 file from a music retailer. Once the file is stored on the client's hard drive, it can easily be e-mailed to others or shared using a file-swapping program. Even if each MP3 file were digitally watermarked on a per-client basis, so that illegally shared files could be traced back to the original purchaser, watermarking alone would not prevent sharing.
So, the notion of making the trust relationship between client and server symmetric simply means that the client trusts the server to honor certain types of data and operations. Thus, even if an MP3 file is downloaded to a client's hard drive, the client operating system must ensure that this cannot be accessed by any application other than an MP3 player. The only question is whether users will permit this kind of trust level, given that malicious server applications could potentially take control of the client's computer. For example, if an illegal MP3 file were detected on the client's system, would the server have the ability, if not the explicit right, to delete it?
Integrity and Accuracy
Integrity refers to whether data is valid or invalid. Invalid data may result from a number of different sources, both human and computer in origin. For example, many (flawed) business processes require the same data to be entered multiple times, potentially by multiple users, which can lead to integrity breaches if errors are made. More commonly, though, errors occur when data is transmitted over a network or transferred in memory. Many network protocols use parity to help ensure data integrity—an extra bit is added to every 8 bits of data, allowing single-bit errors to be detected; checksums play a similar role over larger blocks. Parity mechanisms can detect errors in data but cannot fix them—typically, a retransmission is required, but this is better than losing data altogether.
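The per-byte parity scheme described above can be sketched as follows (the function names are illustrative): an even-parity bit accompanies each byte, so any single flipped bit is detected and triggers a retransmission.

```python
def parity_bit(byte):
    """Even parity: 1 if the byte has an odd number of 1-bits, else 0."""
    return bin(byte).count("1") % 2

def frame(byte):
    """Transmit the byte together with its parity bit."""
    return byte, parity_bit(byte)

def check(byte, parity):
    """Detect (but not correct) single-bit errors."""
    return parity_bit(byte) == parity

data, p = frame(0b1011_0010)
assert check(data, p)            # intact transmission passes the check
corrupted = data ^ 0b0000_0100   # one bit flipped in transit
assert not check(corrupted, p)   # error detected -> retransmission required
```

Note that flipping two bits would go undetected, which is exactly why parity can only detect, not correct, and why stronger codes are used where correction is needed.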
Memory corruption can occur when a program reads memory allocated to a different application or, more seriously, attempts to overwrite another application's data. Fortunately, most modern memory hardware includes error-correcting code (ECC) support, which ensures that single-bit data errors can be identified and fixed before they cause any problems.
Since network and hardware protocols generally handle invalid data caused by system hardware and software, the Application layer is where data integrity is of greatest concern, especially for data transmitted over a network, because that data can potentially be intercepted, modified, and then relayed by a malicious third party (for pleasure or profit). For example, a share-trading application might require traders to authenticate themselves over a Secure Sockets Layer (SSL) connection, but then revert to plaintext for processing all buy and sell requests, because encrypting all traffic to and from the broker's Web site is considered too slow, particularly when thousands of users are trading concurrently. A malicious third party who took control of a downstream router could install a filter that rewrites every "BUY" order for "1000" shares into an order for "10000" shares. Such an attack would be difficult to thwart unless SSL were used for all transmissions.
One way to verify data integrity is to use a one-way hash function, such as a message digest. These functions can be computed from a string of arbitrary length and return an almost unique identifier, such as b6d624aaf995c9e7c7da2a816b7deb33. Even a small change in the source string changes the computed digest, so message digests can be used to detect data tampering. Several algorithms are available to compute such hashes, including MD5 and SHA-1, each generating a digest of a different bit length—128 bits for MD5 and 160 bits for SHA-1. A collision occurs when the same digest is computed from two different pieces of data; longer digests make collisions less probable, and fortunately the probability of a collision is very low in any case.
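This tamper-detection property is easy to demonstrate with Python's standard hashlib module. Both MD5 and SHA-1 are shown because the text discusses them, although both are now considered broken for collision resistance, and modern code would prefer SHA-256; the messages echo the share-trading example above.

```python
import hashlib

msg = b"BUY 1000 shares"
tampered = b"BUY 10000 shares"   # the downstream-router modification

d1 = hashlib.sha1(msg).hexdigest()
d2 = hashlib.sha1(tampered).hexdigest()

assert len(d1) == 40                              # SHA-1 -> 160 bits (40 hex chars)
assert len(hashlib.md5(msg).hexdigest()) == 32    # MD5 -> 128 bits (32 hex chars)
assert d1 != d2   # a one-character change produces a completely different digest
```

If the sender transmits the digest alongside the message over a channel the attacker cannot modify, the receiver can recompute the digest and detect the altered order.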
Accurate data has all the properties of data integrity, plus an assurance that what a piece of data claims to be is what it actually is. Thus, a username of nine characters cannot be an accurate login if the maximum length is eight characters. Indeed, the many buffer-overflow attacks of recent years have exploited excess data length as a key mechanism for bringing down applications that don't explicitly check for data accuracy.
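A hypothetical accuracy check of this kind might look as follows; the limit, constant, and function name are illustrative, not taken from any real login implementation. The point is simply that input is rejected before it can overrun the space declared for it.

```python
MAX_USERNAME = 8  # the traditional UNIX limit mentioned above

def validate_username(name):
    """Reject input that violates the declared format before it is used."""
    if len(name) > MAX_USERNAME:
        raise ValueError(f"username exceeds {MAX_USERNAME} characters")
    if not name.isalnum():
        raise ValueError("username must be alphanumeric")
    return name

print(validate_username("jsmith"))   # accepted: within the declared limit
try:
    validate_username("ninechars")   # nine characters -> rejected
except ValueError as err:
    print(err)
```

Languages with unchecked buffers (notably C) make the same check even more important, since an over-long string can overwrite adjacent memory rather than merely raising an error.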
Authenticity and Consistency
An issue related to integrity is authenticity—that is, given a piece of data that has demonstrated integrity, how do you know that it is authentic? If multiple copies of data exist, and they all pass integrity checks, how do you know whether they are consistent with each other? If multiple copies of a piece of data exist, and one or more copies are inconsistent, how do you establish which one (or more) copies is authentic? The issues of authenticity and consistency are closely related in distributed systems.
Identification and Authentication
How can you determine whether data is authentic? The most common method is to authenticate the principal who presents the data, and the most common form of authentication is a username and password combination. The username represents the identity of a specific principal and is known publicly, while the password is a secret token that (in theory) is known only by the user. In practice, users create passwords that can be easily guessed (e.g., a birth date, middle name, or vehicle registration) or that are written down somewhere (e.g., on a Post-it Note, a sheet of paper in the top drawer, or a whiteboard). If the password is a random string of sufficient length, so that it is no more likely to be guessed than any other string, and if it remains secret, then the system works well. Defining sufficient length is sometimes difficult—the traditional UNIX standard is eight case-sensitive characters drawn from the roughly 94 printable ASCII characters, while most ATM PINs are four digits. Thus, there are 10^4 (10,000) possible ATM PINs, while there are 94^8 (6,095,689,385,410,816) possible UNIX passwords.
UNIX authentication typically permits three incorrect logins before imposing a delay of 15 seconds, to hinder brute-force attacks. Even at one login attempt per second, without any delays, a search of all 94^8 possible passwords would take around 193 million years. Of course, there are potentially ways around this—if the shadow password file can be obtained directly, the search space can be partitioned and the generation of candidate passwords parallelized. Using a good password-guessing program such as Crack on a fast computer with a shadow password file can usually yield results within minutes or hours if passwords are weak.
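The search-space arithmetic above can be verified directly:

```python
# Search space for a 4-digit PIN versus an 8-character password drawn
# from the ~94 printable ASCII characters.
pin_space = 10 ** 4
password_space = 94 ** 8

assert pin_space == 10_000
assert password_space == 6_095_689_385_410_816

# At one guess per second, exhausting the password space takes:
seconds_per_year = 365.25 * 24 * 60 * 60
years = password_space / seconds_per_year
print(f"{years:,.0f} years")   # roughly 193 million years
```

The offline attack changes the constant, not the exponent: with the shadow file in hand, a cracker testing millions of candidates per second across many machines reduces the effective time by the same factor, which is why weak (non-random) passwords fall in minutes.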
There are more sophisticated mechanisms for authentication that revolve around two different strategies—strong identification and strong authentication. Strong identification typically means using an identifier that cannot be presented by anyone other than the intended user. These identifiers are usually biometric—iris scans, face recognition, and fingerprint recognition are becoming more commonly used. Of course, there are concerns that strong identification can be compromised—an eye could be plucked and presented to the scanner, or its patterns transcribed holographically. To counter this, iris scanners are sensitive enough to detect whether the eye is living, and will reject scans that do not meet this criterion. Face recognition systems are very reliable with a small number of samples, but often have problems "scaling up" to identify individuals within pools of thousands, millions, or even billions of potential users. Fingerprint systems have been shown to be weak because an imprint is left on the glass of the device—a mould can easily be taken and used to produce fake prints that fool most devices.
While these teething problems should be overcome with further research, it is always advisable to combine strong identification with strong authentication. The username and password combination can be greatly enhanced by the use of one-time passwords, implementing two-factor authentication. Here, a user is first authenticated using the fixed user password—when this is accepted, the system expects a second password that is never reused and is (say) time dependent. This second password is generated by a device or physical pad that the user carries. Once the one-time password is transmitted by the user, and she is authenticated a second time, that password becomes invalid. The password need not be time based—a random or chaotic function could be iterated with a per-user seed or initial parameter to generate a fixed sequence of passwords for each user. As long as that function remains secret, strong authentication can be assured.
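A minimal sketch of such a time-dependent second factor, using the HMAC-based construction later standardized as TOTP (RFC 6238). The shared secret and time values here are illustrative; a real token and server would each hold the secret and compute the code independently.

```python
import hashlib
import hmac
import struct
import time

def one_time_password(secret, interval=30, now=None):
    """Derive a 6-digit code from a shared secret and the current
    30-second time step, so each code expires with its step."""
    step = int((time.time() if now is None else now) // interval)
    msg = struct.pack(">Q", step)                 # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                    # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 1_000_000:06d}"

secret = b"shared-device-secret"   # provisioned on both server and token
code = one_time_password(secret, now=1_000_000.0)
# Both sides compute the same 6-digit code within the same 30-second step:
assert code == one_time_password(secret, now=1_000_010.0)
print(code)  # a later time step yields a fresh code, invalidating this one
```

Because the secret itself is never transmitted and each code is bound to a time step, an intercepted code is useless once that step has passed.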
One reason these strong authentication measures are necessary is that usernames and passwords transmitted in the clear can be intercepted by a malicious third party promiscuously reading the contents of packets on the network. Thus, if a sniffer application runs on a router between the client and server, the username and password can be captured. If the link cannot be secured by a Virtual Private Network (VPN) of some description, or even a secure client such as Secure Shell (SSH), then one-time passwords are ideal—a token may be intercepted, but it is valid for a single session only and cannot be used to log in a second time, because the generated password is invalidated once the first login has been accepted. Even if the link can be secured, one-time passwords remain useful because the client itself may not be trusted: all keystrokes could be logged and saved for future malicious use. For example, keyboard listeners installed on Internet café PCs could record username and password combinations and automatically e-mail them to a cracker. These would be unusable if one-time passwords were in use, because the second factor expires immediately.
Credits:
Solaris 10: The Complete Reference