Scalable Architectural Support for Trusted Software
Link: https://ieeexplore.ieee.org/document/5416657
Year: HPCA 2010
Keyword: Security; Enclave; Bastion
Highlight:
- Enclaves can be declared by modules using a list of private and shared pages and the entry points;
- Enclaves can be enforced by adding an extra module ID field to the TLB and then comparing the ID of the module currently being executed with the TLB entry;
- Control flow can be secured by letting the system fault naturally on transfers of control flow into or out of a module’s private code space, and then updating the current module ID;
- Main memory can be protected with both encryption and validation using a key and hash pair;
- External storage can be protected similarly to main memory, except that the key and hash pair must be saved in secure storage. The root of trust for secure storage can be built into the CPU core as a few non-volatile registers storing the metadata storage’s key and hash and the credential.
Comments:
- If the three persistent registers are set by the hypervisor that creates the secure storage, and later boots only check the hash of the current hypervisor against the stored hash of the first-time hypervisor, what if the hypervisor is corrupted from the beginning? Of course, the paper can just assume that the first-time boot of the hypervisor is conducted by a trusted party using an already validated hypervisor.
- The module stack pointer is introduced once in Section 4.1 and never mentioned again. What is its purpose? Why do you need a separate notion for the stack in addition to the module page list?
- Who computes the module hash? Definitely not the OS, as a malicious OS could report any arbitrary number. Does the hypervisor know when a binary is to be loaded? Or are module hashes just known a priori? The last option may make sense, as modules also need to declare shared accesses with the long form IDs of other modules.
- I don’t think it is best practice to give the name “secure launch” to both the boot loader instruction for establishing the root of trust and the hypercall that accepts memory access right declarations.
- When an access violation occurs, does the system raise an exception to the hypervisor? If yes, then OS system calls that need to access memory will be really slow, as each memory access will raise an exception, but the OS has to perform these operations nevertheless (with encrypted data of course). Otherwise, how would you solve the case where two modules share one page, and the TLB can only hold one module_id? I mean this problem is easy to address (when the guest is running in OS mode, do not raise an exception), but the paper does not discuss this issue.
- Is there any scenario where i_bit and c_bit are not both set or both clear? The paper seems to suggest that they are either both set or both clear for modules and non-modules, respectively.
This paper proposes Bastion, a secure computing architecture built into the CPU core that allows secure execution of trusted hypervisors and software modules, and secure storage of data on external devices. The paper is motivated by the fact that existing mechanisms for secure computing are neither comprehensive nor efficient, indicating that a single mechanism is not sufficient to protect all system components without incurring high overhead. Bastion, on the other hand, provides a set of simple hardware interfaces for integrity verification, access control, and encryption/decryption, and relies on a trusted (and dynamically verified) hypervisor to protect the dynamic execution, data access, and external I/O of applications.
The paper begins with a discussion of the limitations of existing secure computing techniques. Virtual Machine Monitors (VMMs) are a popular option, because they offer hardware-assisted isolation between kernels. This provides a great advantage over schemes that rely on the kernel not being compromised, as the protected application can be executed on a separate VMM instance using its own trusted OS. VMMs, however, incur great overhead as well, since switching to the protected application involves a world switch into another VMM instance. The paper also lists limitations of other proposed designs. First, some of them lack scalability, since they need a dedicated hardware resource per protected region, meaning that the number of protected regions is bounded by the number of available hardware resources. Second, these proposals lack flexibility since they do not take software context into consideration (I did not understand what this means in the paper). Lastly, some proposals assume that the hardware will be physically intact, which leaves them prone to hardware attacks, such as using malicious bus agents to tap the memory bus or actively corrupt data, or accessing the external storage by physically mounting it on a non-secure host.
Bastion surpasses previous proposals by having a rather comprehensive threat model. Bastion assumes that all system components, including the OS, the memory bus, the main memory modules, and the external devices, are unreliable in the sense that they are prone to software and/or hardware attacks. The only two trusted components in Bastion are the hypervisor, which comes from a trusted software distributor and is integrity-verified on every boot, and the CPU chip containing the core pipeline, the caches, and the Bastion hardware. The paper argues that although the CPU chip can also be physically tapped or even altered, the technical difficulty of doing so is astronomical, as modern chips are manufactured as multiple layers of logic gates and wires, making it almost impossible to attack a CPU chip. The paper also notes that Bastion does not intend to protect applications from side-channel or covert-channel attacks.
As mentioned above, Bastion relies on a trusted hypervisor, which itself is verified at boot time, to carry out all security-related actions. Protected applications can be co-hosted by the same hypervisor in the same untrusted OS. The hypervisor may perform its address translation from guest virtual to host physical using either shadow page tables or nested page tables (the latter is adopted by later hardware virtualization features). No matter which address translation scheme is used, the paper assumes that the hypervisor has ultimate control of the translation entries installed into the TLB. On an architecture that does not have a page walker and relies on the OS to install the entry, this happens naturally, as control is transferred to the hypervisor for installing the entry on a TLB miss. On architectures with a page walker, the paper proposes that the MMU should raise an interrupt to the hypervisor after the page table entry has been fetched from memory, such that the hypervisor always has a chance to modify the entry to be inserted into the TLB (i.e., the entry to be installed is made available to the hypervisor).
The Bastion root of trust starts with a boot-time validation of the hypervisor. The validation is performed by a short code sequence stored in an on-chip secure memory (which is assumed to be safe from attacks). The code sequence is triggered by an instruction, namely secure_launch, which is invoked by the untrusted boot loader after it has loaded the hypervisor module into main memory. The secure_launch instruction calls into the validation procedure, which computes the hash of the current hypervisor image and stores it in a register, “hypervisor_hash”. This hash value is then compared with a non-volatile register, storage_owner_hash, which holds the hash value of the hypervisor that created the secure storage. If the hashes match, then the other two non-volatile registers, namely secure_storage_key and secure_storage_hash, are exposed to the hypervisor as the cryptographic key for decrypting the contents of the secure storage and the hash for validating its integrity, respectively. The three non-volatile registers are initialized when a hypervisor is booted on the machine for the first time, and their values persist across power cycles (how to implement them on real hardware is not clear, though). Secure storage will be discussed later as one of the security features provided by Bastion. If the hypervisor validation fails, either because the hypervisor is corrupted, resulting in a different hash, or because a malicious boot loader circumvents the secure_launch instruction, the secure storage will not be accessible, and the Bastion hardware will simply destroy the secure storage by erasing the persistent registers.
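To make the boot-time check concrete, here is a minimal Python sketch of my reading of the validation flow. The register names follow the summary above; everything else is an assumption for illustration: the `NonVolatileRegs` class, the use of SHA-256 as the hash, and the choice to model "not accessible" as returning `None`. First-time initialization of the registers is not modeled.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NonVolatileRegs:
    """Hypothetical model of the three persistent registers (None = erased)."""
    storage_owner_hash: Optional[bytes] = None
    secure_storage_key: Optional[bytes] = None
    secure_storage_hash: Optional[bytes] = None


def secure_launch(image: bytes, nv: NonVolatileRegs) -> Optional[Tuple[bytes, bytes]]:
    """Return (secure_storage_key, secure_storage_hash) if the image matches
    the storage owner, otherwise erase the registers and return None."""
    # SHA-256 is an assumption; the summary does not name the hash function.
    hypervisor_hash = hashlib.sha256(image).digest()
    if nv.storage_owner_hash is not None and hypervisor_hash == nv.storage_owner_hash:
        return nv.secure_storage_key, nv.secure_storage_hash
    # Validation failed: destroy the secure storage by erasing the registers.
    nv.storage_owner_hash = None
    nv.secure_storage_key = None
    nv.secure_storage_hash = None
    return None
```

Note that in this model a single failed launch is irreversible: even the original hypervisor cannot recover the key afterwards, which matches the "destroy the secure storage" behavior described above.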
During the secure boot, a secret key is also generated from a reliable source of randomness to serve as the key for encryption and decryption of volatile data. The secret key is a one-time key that is not preserved across reboots. The last-level cache uses this key to encrypt outbound data before it is sent on the memory bus, and to decrypt inbound data before it is stored as clear text in the cache hierarchy.
The first feature of Bastion is secure access to memory. Although the per-process virtual address space already serves as a means of isolation, this abstraction is far too coarse-grained and faces several security challenges (the following is not in the paper; I summarized them myself). First, virtual memory does not enforce any constraint on an application accessing its own data at the same privilege level. This may be generally fine for a normally behaving application, but is not acceptable if it goes rogue (e.g., by being hijacked through a buffer overflow). Second, the OS always has the highest privilege, and is therefore not bound by the access rules of virtual memory. If the OS is malicious, then user applications cannot keep any secrets. Bastion provides a finer-grained and stronger isolation primitive based on pages. It allows applications, or modules within an application, to explicitly declare access rights to only a subset of the pages within the module’s own addressable memory, pages shared with other modules, and allowed entry points. This set of rules is strictly enforced both within the module and between the OS and the module, such that any non-declared access results in either a fault (for application-level accesses) or a read of encrypted data. The latter is necessary for the OS to function correctly, as the OS may sometimes need to access module data in a system call on behalf of the module (e.g., during a swap-in/-out the OS will copy user data around). In this case, the OS can still be granted access, but the memory content it reads is encrypted, and therefore opaque to the OS.
We next describe the details of secure memory access. This feature works at the granularity of modules. A module is just a collection of code and data at page granularity; it could be an entire application, a library, or just certain parts of an application. Each module has a long form identifier, which is the hash value of the module’s initial code and data pages after being loaded into main memory. Each module also has a short form module ID, which is assigned internally by the hypervisor when a new module is seen, and is used at the hardware level to identify a module. The hypervisor maintains a mapping table between the two forms of ID for translation. Modules that do not explicitly declare access rights are assumed to be in module zero, which is just the rest of the system and the applications that do not use Bastion. Modules declare their memory access rights to the hypervisor via a SECURE_LAUNCH hypercall, which accepts a pointer to a security segment structure consisting of the explicit access right declarations, plus the arguments necessary for parsing the table. A module has three types of access rights to declare. The first type is the private pages belonging to the module. They are declared with the virtual page address and an access right mask indicating the read, write, and execute permissions. These private pages must not alias with any pages in other modules. The second type is shared pages, which are declared with the long form module ID of the sharing module, the virtual address of the pages, and the access permission. The third type is entry points, meaning that control can only be transferred into the module at these specific sites. The entry points are simply a list of virtual addresses that a function call could jump to.
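The three kinds of declarations and the two forms of module ID can be sketched as plain data structures. This is only my illustration of the description above, not the paper's layout: the class names, field names, the SHA-256 hash, and the sequential short-ID allocation are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field
from enum import IntFlag
from typing import Dict, List


class Perm(IntFlag):
    """Access right mask: read, write, execute."""
    R = 1
    W = 2
    X = 4


@dataclass(frozen=True)
class PrivatePage:
    vaddr: int            # virtual page address
    perm: Perm


@dataclass(frozen=True)
class SharedPage:
    peer_long_id: bytes   # long form ID of the module shared with
    vaddr: int
    perm: Perm


@dataclass
class SecuritySegment:
    """Hypothetical shape of the structure passed to the SECURE_LAUNCH hypercall."""
    private_pages: List[PrivatePage] = field(default_factory=list)
    shared_pages: List[SharedPage] = field(default_factory=list)
    entry_points: List[int] = field(default_factory=list)


def long_form_id(initial_pages: List[bytes]) -> bytes:
    """Long form ID: hash over the module's initial code and data pages."""
    return hashlib.sha256(b"".join(initial_pages)).digest()


class ModuleIdTable:
    """Hypervisor-side mapping from long form IDs to short form IDs."""

    def __init__(self) -> None:
        self._short_by_long: Dict[bytes, int] = {}

    def short_id(self, long_id: bytes) -> int:
        # Short ID 0 is reserved for module zero (code that does not use Bastion).
        if long_id not in self._short_by_long:
            self._short_by_long[long_id] = len(self._short_by_long) + 1
        return self._short_by_long[long_id]
```

A module with one read/execute private code page and a single entry point would then be described as `SecuritySegment(private_pages=[PrivatePage(0x1000, Perm.R | Perm.X)], entry_points=[0x1000])`.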
After receiving the security segment, the hypervisor first performs a validity check on the access rights based on two rules. First, private pages must be truly private, i.e., they are not shared with any other module. Second, all modules sharing a page must acknowledge each other. To this end, the hypervisor maintains a mapping table from physical page numbers to the short form IDs of the modules that declare access rights on the page. The table is used for both rule validation and context switches, as we will see below. After passing the validation, the access right declarations are entered into a table, called the Module State Table, for later reference.
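The two validation rules can be sketched as follows. This is a simplified model under my own assumptions: the `PageOwnership` class and its method names are hypothetical, modules are identified by plain integers (the paper uses long form IDs in sharing declarations), and declarations are checked one page at a time in arrival order.

```python
from typing import Dict, FrozenSet, Iterable


class PageOwnership:
    """Hypothetical model of the physical-page-to-module mapping table."""

    def __init__(self) -> None:
        # ppn -> module that declared the page private
        self.private_owner: Dict[int, int] = {}
        # ppn -> {module: set of peers that module acknowledged}
        self.sharers: Dict[int, Dict[int, FrozenSet[int]]] = {}

    def declare_private(self, module: int, ppn: int) -> bool:
        """Rule 1: a private page must not be claimed by any other module."""
        if ppn in self.private_owner or ppn in self.sharers:
            return False
        self.private_owner[ppn] = module
        return True

    def declare_shared(self, module: int, ppn: int, peers: Iterable[int]) -> bool:
        """Rule 2: every existing sharer must name the newcomer, and the
        newcomer must name every existing sharer."""
        peer_set = frozenset(peers)
        if ppn in self.private_owner:
            return False
        for other, other_peers in self.sharers.get(ppn, {}).items():
            if other not in peer_set or module not in other_peers:
                return False
        self.sharers.setdefault(ppn, {})[module] = peer_set
        return True
```

For example, if modules 1 and 2 each declare page 20 as shared with the other, both declarations succeed, while a later attempt by module 3 to join is rejected because neither existing sharer acknowledged it.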
The hardware extensions for enforcing the tightened memory access rights are as follows. First, the processor’s execution context is extended with an extra module ID register for storing the ID of the module currently being executed. One process can have multiple modules, and this register is updated whenever a procedure call enters or leaves a module, as we will see later. The TLB entries are also extended with an extra module_id field, which stores the short form ID of the module that may access the page. When a TLB miss occurs, the hypervisor either sets this field by performing the page walk itself and using the page address to look up the Module State Table, or waits for the MMU’s page walk to finish and then updates the module_id of the entry to be inserted. Note that for a shared page, several module IDs may be valid. The hypervisor selects the current context’s module ID if it is listed for that page in the Module State Table, and otherwise signals an access right violation. During execution, the context module ID is checked against the module_id field in the TLB entry on every memory access, and if a mismatch occurs, the hypervisor is signaled by raising an exception. Note that such an exception can sometimes be resolved: the current context may have a different module ID than the TLB entry, while the page is legally shared between the current module and the module whose ID is in the TLB entry. In this case, the hypervisor resolves the exception by updating the TLB entry’s module_id field to the current context value.
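A small simulation may help to see how the module_id tag behaves on misses and mismatches. This is a sketch under my own assumptions: the `Mmu` class is hypothetical, virtual pages are identity-mapped to physical pages, the `allowed` dictionary stands in for the Module State Table, and the case where the OS is given encrypted data instead of a fault is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Set


class AccessViolation(Exception):
    """Raised when the current module has no declared right to the page."""


@dataclass
class TlbEntry:
    ppn: int
    module_id: int  # short form ID of the module allowed to use this entry


class Mmu:
    def __init__(self, allowed: Dict[int, Set[int]]) -> None:
        self.allowed = allowed          # vpn -> module IDs with declared rights
        self.tlb: Dict[int, TlbEntry] = {}
        self.current_module = 0         # context module ID register

    def access(self, vpn: int) -> int:
        entry = self.tlb.get(vpn)
        if entry is None or entry.module_id != self.current_module:
            # TLB miss or module ID mismatch: trap to the hypervisor.
            entry = self._hypervisor_fill(vpn)
        return entry.ppn

    def _hypervisor_fill(self, vpn: int) -> TlbEntry:
        # Undeclared pages are treated as belonging to module zero.
        owners = self.allowed.get(vpn, {0})
        if self.current_module not in owners:
            raise AccessViolation(f"module {self.current_module} on page {vpn}")
        # Legal access: (re)tag the entry with the current module ID.
        entry = TlbEntry(ppn=vpn, module_id=self.current_module)
        self.tlb[vpn] = entry
        return entry
```

With `Mmu({5: {1, 2}, 6: {1}})`, module 1 and module 2 can both touch page 5 (the entry is retagged on each switch), while module 2 touching page 6 raises `AccessViolation`. The retagging also shows why two modules alternating on a shared page pay a trap per switch in this model.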
As discussed above, the context module ID is updated when the control flow enters or leaves a module’s executable code space. This is naturally triggered by the TLB access right violation: when an external module (possibly module zero) calls into the code space of a module, the module_id of the instruction page’s TLB entry will not match the current context ID, so an exception is raised to the hypervisor. The hypervisor first uses the faulting address to find the module’s state table entry, and then checks whether the call target address is a valid entry point. If the check passes, the hypervisor updates the current module ID to the destination module’s ID. The return address is also saved in the hypervisor’s private memory for later validation. Leaving a module via a return instruction changes the module ID in a similar way; additionally, the return address of the instruction is checked against the one saved earlier by the hypervisor in order to prevent return address attacks.
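The hypervisor-side handling of module entry and exit can be sketched like this. The `ControlFlowMonitor` class and its method names are hypothetical, and keeping the saved return addresses on a stack (to allow nested cross-module calls) is my assumption; the summary only says that the return address is saved in hypervisor-private memory.

```python
from typing import Dict, List, Set, Tuple


class ControlFlowViolation(Exception):
    """Raised on an invalid entry point or a tampered return address."""


class ControlFlowMonitor:
    def __init__(self, entry_points: Dict[int, Set[int]]) -> None:
        self.entry_points = entry_points   # module ID -> valid entry addresses
        self.current_module = 0
        # Hypervisor-private stack of (caller module, expected return address).
        self._saved: List[Tuple[int, int]] = []

    def on_call(self, dest_module: int, target: int, return_addr: int) -> None:
        """Handle the fault raised when a call lands in another module's code."""
        if target not in self.entry_points.get(dest_module, set()):
            raise ControlFlowViolation(f"bad entry point {target:#x}")
        self._saved.append((self.current_module, return_addr))
        self.current_module = dest_module

    def on_return(self, return_addr: int) -> None:
        """Handle the fault raised when a return leaves the current module."""
        if not self._saved:
            raise ControlFlowViolation("return without a matching call")
        caller, expected = self._saved[-1]
        if return_addr != expected:
            raise ControlFlowViolation(f"return address {return_addr:#x} was altered")
        self._saved.pop()
        self.current_module = caller
```

In this model a failed check leaves the current module ID unchanged, so a corrupted return address cannot be used to switch identity.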
Bastion also protects physical memory against physical attacks that attempt to read data transferred on the bus and/or to corrupt memory blocks. The former is achieved by encrypting part of the physical address space using the one-time key generated during the boot sequence, and only decrypting blocks when they are fetched into the cache hierarchy. The latter is achieved using a Merkle tree, which has become the de facto standard for memory integrity verification. The paper also suggests that only cache blocks belonging to modules are encrypted and verified, while blocks belonging to module zero are still used in the normal way. Two extra bits, namely the i_bit and the c_bit, are added to the cache to indicate these two states.
Bastion protects external storage in a way that is similar to physical memory, i.e., using both encryption and integrity verification. Bastion has its own secure storage area, which is protected by the key and hash values stored in non-volatile registers. As mentioned earlier, these two values are only exposed to the hypervisor if the hypervisor’s identity (its hash) is identical to that of the hypervisor that created the secure storage. Within the secure storage, Bastion stores the key and hash values for each individual module, as modules may also create their own secure storage. After unlocking the hypervisor’s own secure storage, the hypervisor can read these keys and hashes to encrypt/decrypt and verify each module’s secure storage, just as is done for main memory.