In-fat pointer: hardware-assisted tagged-pointer spatial memory safety defense with subobject granularity protection
- Hide Paper Summary
Link: https://dl.acm.org/doi/10.1145/3445814.3446761
Year: ASPLOS 2021
Keyword: Fat Pointer; Memory Safety
Highlight:
-
Pointer tagging schemes can validate both at the object-level and sub-object level. In the latter case, we need to store the layout of objects in per-type layout tables and check them dynamically by hardware.
-
Object bounds metadata can be stored in various forms. They can be appended to the end of the object and addressed with an offset field, or be stored in per-type arenas whose address can be computed directly from the pointer values, or be stored in a global table whose index is stored in the upper bits of pointers.
Comments:
-
The paper assumes 16 unused bits in virtual address pointers. While this assumption remains true in today’s architecture, it may no longer hold on future architectures that have a bigger virtual address space.
-
In the subheap scheme, we do not need the index control register, because arena-based allocators have built-in mechanism to find the arena base address given an arbitrary pointer.
This paper proposes In-fat pointer, a hardware pointer tagging scheme that protects applications from spatial memory errors caused by pointer-related bugs. In-fat pointer leverages the higher unused bits in virtual address pointers and embeds object bound and layout information to help specialized hardware check the validity of the pointer when the pointers are dereferenced. Compared to prior works, in-fat pointer achieves a stronger protection guarantee which is at sub-object level without incurring high hardware and software overhead.
The paper is motivated by the inefficiencies of prior works. First, some prior works extend the width of a regular virtual address pointer to store the extra bits containing the base and bound information of the pointer. While not limited by the number of usable bits in the upper part of virtual pointers, such a scheme is likely to increase the application’s memory consumption due to wider pointers, and also break backward compatibility at the binary level because applications compiled to use the legacy interface is unable to be executed under the new model. Second, some other proposals, most notably AddressSanitizer, adopts a memory-centric approach where metadata is stored in shadow memory where every address that a pointer can possibly point to is reserved a shadow location to store the metadata. The shadow memory address of a pointer can then be derived using simple arithmetics. This design, while retaining the conventional pointer width, consumes an abnormally high amount of memory especially when the application allocates large arrays. In addition, the amount of memory that needs to be reserved is proportional to the size of the working set, rather than to the number of memory allocations. Lastly, there have also been explorations on tagged-pointer schemes where the metadata bits are embedded in the unused higher bits of virtual address pointers. Due to the relatively limited number of bits that can be freely repurposed, the metadata bits usually encode a metadata object that is stored somewhere else in the memory, and the metadata object must be fetched using some complicated addressing schemes. However, prior works on tagged-pointer designs often only guard memory accesses at the object level without considering sub-object level protection as a viable option, which limits the strength of the protection that they offer.
In-fat pointer addresses the limitations of prior works with two novel features. First, in-fat pointers embrace multiple metadata schemes and hence allow the compiler to select the optimal one during compilation time according to the static property of the object. Second, in-fat pointers also offer sub-object level protection that detects spatial memory errors, i.e., if a pointer is generated to point to a non-array field of an object, the pointer cannot be used to access adjacent fields within the same object. In-fat pointer is designed to be a tool that assists software debugging and prevents application bugs. It does not, however, completely stop malicious attackers from abusing memory bugs as in-fat pointer does not validate the metadata bits.
We next describe the hardware mechanism of in-fat pointer. The proposed design embeds metadata bits that encode both the location of the metadata and the field index that the pointer points to. The paper assumes 16 unused bits in virtual address pointers. The highest two bits are “poison bits” that indicate whether the pointer has failed validation or not. The two poison bits are set by bounds-checking hardware to store the result of the check, which can be in one of the three states: valid, meaning that the check passes; invalid, meaning that the check did not pass; and recoverable, which is implementation-dependent but usually off-by-one pointers (allowed by C standard but must not be dereferenced). The next two bits are scheme selector bits which determine the metadata scheme used for this pointer and affect the interpretation of the rest 12 bits, which we describe as follows.
The first metadata scheme is called the local offset scheme, where the metadata is appended to an object whose offset to the current pointer value is stored in 6 out of the 12 remaining bits. The other 6 bits store the field index that the pointer points to if the field is in a struct. The compiler is responsible for computing the offset when generating the pointer of a field and when pointer arithmetic is applied to the pointer value. Note that the hardware can guarantee that the offset is always valid since the pointer will be validated at the runtime to be within the dynamic bounds. The metadata field appended to the object is also statically generated by the compiler and remains invisible to programmers. This scheme supports objects whose size is less than several hundred bytes, which covers the majority of the cases. It can also be used for objects of all storage classes, including stack objects, heap objects, and global objects.
The second metadata scheme is the subheap scheme, which embeds the metadata object in heap allocation units called arenas. The subheap scheme assumes that the memory allocator will place same-type objects on an address-aligned bigger block called an “arena”. The metadata object is then placed at the beginning of the arena for easy address computation. Since arenas are address-aligned on the virtual address space, the metadata object can be accessed by rounding the pointer value down to the nearest arena boundary. In this scheme, all the lower 12 bits can be used to store the field index. This scheme is only applicable to heap-allocated objects but it consumes less memory since objects in the same arena share the same metadata object. Note: The paper describes a slightly different scheme where the base address of the metadata object is stored in one of the 16 control registers. In this scenario, 4 of the lower 12 bits must store the register index, and the software is responsible for loading the control registers with metadata addresses. This scheme is more expensive and less versatile than what I had above because it adds extra registers which must be properly maintained and context switched and cannot support more than 16 types.
The last metadata scheme is the global table scheme, where the metadata object is stored in a global table. All 12 bits are dedicated to the table index, and in-fat pointer does not offer sub-object level protection in this case. This scheme is used when the object is too large and/or cannot be allocated on the heap such that neither of the prior two works. The global table is generated statically by compilers, and the index to the global table is OR’ed to the pointer value (with code implicitly issued by the compiler) when a pointer is generated from such variables.
In all the above cases, the metadata object contains the base and bound of the object which delimits the memory region that the pointer can point to. The hardware will check the validity of the pointer when it performs load and store operations using the address and returns the status of the check using the poison bits. Compilers may also statically issue “promote” instructions to force a check. The authors also suggest adding a set of shadow registers to extend all general-purpose registers with the base and bound information. When a hardware validation is performed, the base and bound are then fetched from the metadata object and stored in the shadow register that corresponds to the pointer, using the scheme selector bits to determine the addressing scheme. Later validations of the same pointer will simply use the cached values instead of fetching them from the memory. The values of the shadow registers are invalidated when the register is reloaded with a new pointer value.
The in-fat pointer design also includes a customized compiler that is responsible for generating all the metadata objects during compilation, issuing code to perform validations, and assigning tags to pointers. Besides, when a pointer value is changed by pointer arithmetics, the compiler should also adjust the tag if the pointer uses the local offset scheme.
To enable sub-object level address checks, in-fat pointer further extends the metadata object with a pointer to a per-type layout table. The layout table is another compiler-generated data structure that describes the base and bounds of all fields within a struct or union. Each entry of the layout table consists of a base, bound, as the metadata object, and a parent field pointing to the containing struct of the current one (when structs are recursively defined). Recursively defined structs are flattened, i.e., all fields in child structs will appear in the outermost struct’s layout table. For array objects, only one entry is included in the layout table with the size of the array element and the element count. Recall from previous sections that the tagged pointers include the index of the field that the pointer is currently pointing to. To verify whether a pointer is valid at the sub-object level, the hardware first follows the metadata object to locate the layout table. Then the hardware uses the filed index in the pointer to locate the entry. If the entry is not in the outermost struct, the hardware will then recursively traverse the hierarchy until the outermost struct is reached, calculating the base and bound of the field along the way. After the base and bound of the field is computed, the pointer is then validated by checking whether it lies in the region delimited by the base and bound values.