4
[llvm-dev] RFC: Strong GC References in LLVM
source link: http://lists.llvm.org/pipermail/llvm-dev/2016-June/101522.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Strong GC References in LLVM
Fri Jun 24 00:22:39 PDT 2016
More information about the llvm-dev mailing list
[llvm-dev] RFC: Strong GC References in LLVM
Sanjoy Das via llvm-dev llvm-dev at lists.llvm.orgFri Jun 24 00:22:39 PDT 2016
This is a proposal to add strong GC reference types to LLVM. We have some local (downstream) patches that are needed to prevent LLVM's optimizer from making transforms that are problematic in the presence of a precise relocating GC. Adding a notion of a strong GC reference to LLVM will let us upstream these patches in a principled manner, and will act as a measure to avoid new problematic transforms. The word "strong" here is used as opposed to the weaker pointers that LLVM currently has which can be converted to and from integers without repercussions (i.e. inttoptr and ptrtoint are nop-like non-sideeffecting instructions). Let me emphasize first that this email is intended to *start* a conversation; and all of the points made here are open for discussion and debate. This proposal is informed by only _our_ GC and runtime and other runtimes may have different requirements. # Preamble & "Problem Statement" A quick introduction to some GC basics: a precise relocating garbage collector needs co-operation from the code generator / compiler to be able to scan and relocate objects pointed to from frame local state like CPU registers and stack slots. Specifically, it needs a list of the locations of all the GC references present in a stack frame; and at "safepoints" it stops "mutator" threads (via some mechanism not pertinent here), traverses all of the stack frames and uses the aforementioned list to scan and potentially rewrite those references to point to the new locations for the same objects. In addition, some garbage collectors need read and write barriers -- these are small bits of code that run when reading and writing GC references (distinct from memory reordering barriers). For more details on GC specific stuff, I suggest referring to http://www.memorymanagement.org/ and our 2014 LLVM-Dev talk http://llvm.org/devmtg/2014-10/#talk4. Getting LLVM ToT working with a precise relocating GC involves taking the following steps: Note: GCREF is the type representing a pointer that the GC needs to know about and potentially relocate. Currently it is `<Ty> addrspace(1)*`. 1. Generate IR, and represent GC pointer SSA values as values of type GCREF. 2. Run opt -O3. 3. Run RewriteStatepointForGC. It rewrites IR values of type GCREF to "normal" pointers that the backend can lower (note -- we don't physically rewrite the llvm::Types to keep things simple and fast). It does this by making all of the the data flow required to express relocation semantics explicit in the IR. 4. Rewrite loads and stores of GCREF type to execute load and store barriers (logically can be part of (3)). 5. Lower into machine code. For this discussion let's treat the internals of (3), (4) and (5) as black boxes, and center our discussion on what properties we require of the IR after step (2) so that step (3), (4) and (5) can do their job. However, if you think the overall design can be simplified by changing (3), (4) or (5), please feel free to point that out. :) After (2) and before (3), we want to have the following invariant: ''' Given any location "L" in the CFG of a function, let the set of all values of type GCREF whose defs dominate "L" be "G": A. "G" is a superset of the references that need to be reported to the GC if we put a safepoint at "L". References need to be reported to the GC if the GC may want to scan through them or rewrite them to point to the new locations of the objects they used to point to. B. Everything in "G" is a valid GC reference. This directly implies that all values of type GCREF hold a valid GC reference (since "L" could be the location immediately following the def). ''' Few important things to note about "valid GC reference": 1. There is no fixed notion of a valid GC reference. What constitutes a valid GC reference can change at arbitrary points in time. 2. There may not be a runtime bitwise predicate that can reliably distinguish a "valid" GC reference from an invalid one. The validity of a GC reference is a function of its provenance as well as its bitwise content. 3. (Related to the previous point) Arbitrary GEPs off of "valid GC references" are also "valid GC references". This includes out of bounds GEPs (whose bitwise representation can have any value, for all practical intents and purposes). [Aside: We could phrase the invariant in terms of fixed safepoint locations, but I think focusing on this (stronger) invariant actually makes the problem simpler (and no harder).] The invariant is somewhat vague and relies on white-box knowledge of the GC to define what values "need to be reported to the GC" (this will become clearer as we see some examples); and the "problem statement" is to come up with a precise set of rules governing the GCREF type so that the invariants we need fall out. Today we enforce the invariants downstream by `#if 0` ing out code that breaks it. However that's fragile; and more importantly, it does not help other people interested in precise GC and LLVM. # Examples of Bad Transforms Let's look at some some examples of bad transforms that are legal today (in upstream LLVM) with GCREF == "addrspace(1)*". Right now we'll justify their "badness" by invoking ad-hoc GC specific rules on a case by case basis: ## Integer -> GCREF Conversion (%value is unused) %value = load i64, i64* %ptr == Transformed to ==> %value = load GCREF, GCREF* (bitcast %ptr) This is a problem since we've introduced a value of type GCREF that may not be a valid GC reference at runtime. If the compiler can prove that %value is actually a valid GC reference then this transform is valid, but typically it won't be able to except in cases like "%value is provably 0" in which case it is better to just fold the load away. %value = load i64, i64* %ptr == Transformed to ==> %value = load i64, i64* %ptr %value_gc = inttoptr %value to GCREF (%value_gc unused) is invalid for the same reason. ## GCREF -> Integer conversion Converting a GCREF to an integer is fine if the integer is unused. However, cases like these are problematic: %v0 = load GCREF, GCREF* %ptr_0 %v1 = load GCREF, GCREF* %ptr_1 %cmp = icmp eq %v0, %v1 == Transformed to ==> %v0 = load i64, i64* %ptr_0 %v1 = load i64, i64* %ptr_1 %cmp = icmp eq %v0, %v1 Since in the second version if "L" is, say, between %v0 and %v1, "G" would not include "v0" which contains a GC reference that the GC may want to scan or relocate[0]. For similar reasons %v0 = load GCREF, GCREF* %ptr_0 %v1 = load GCREF, GCREF* %ptr_0 %cmp = icmp eq %v0, %v1 == Transformed to ==> %v0 = load GCREF, GCREF* %ptr_0 %v0.i = ptrtoint %v0 %v1 = load GCREF, GCREF* %ptr_0 %v1.i = ptrtoint %v1 %cmp = icmp eq %v0.i, %v1.i is also invalid. ## Round tripping GCREF's through Integers The following transform is invalid: %v0 = load GCREF, GCREF* %loc0 store GCREF %v0, GCREF* %loc1 == Transformed to ==> %v0 = load i64, i64* (bitcast %loc0) store i64 %v0, i64* (bitcast %loc1) because if "L" in the second version is between the load and the store, "G" will not include `%v0` even though we need to report it to the GC (otherwise we could propagate a stale bitwise representation of the GC reference we loaded in `%v0`). ## Bad types due to hoisting loads out of control flow This is bad if (is_reference) { %val = load GCREF, GCREF* %ptr } == Transformed to ==> %val = load GCREF, GCREF* %ptr if (is_reference) { } unless the compiler can prove that %val is a GC reference at its new location. Downstream we model the Java type system in LLVM IR to try to prove that %val is a GC reference in its new location, but for starters we will have to be conservative upstream. # Proposed Solution: We introduce a "new" LLVM type. I will still refer to it as GCREF here, but it may actually still be "<ty> addrspace(k)*" where k is specially noted in the datalayout. Semantics: 1. GCREF represents an equivalence class of values (equivalence relation being "points to a fixed semantic object"). The bitwise representation fluctuates constantly outside the compiler's control (the dual of `undef`), but may have invariants (in particular, we'd like to be able to specify alignment, nonnull etc.). At any given point in time all GCREF instances pointing to the same object have the same bitwise representation (we need this to make `icmp eq` is well-defined). 2. GCREF instances can only contain a valid gc reference (otherwise they can't meaningfully "fluctuate" among the various possible bitwise representations of a reference). 3. Converting GCREF to integers is fine in general, but you'll get an arbitrary "snapshot" of the bitwise value that will generally not be meaningful (unless you are colluding with the GC in implementation defined ways). 4. Converting integers to GCREF is allowed only if source integer is a bitwise representation of a valid GC reference that is not an out of bounds derived reference. However, this is difficult for the compiler to infer since it typically will have no fundamental knowledge of what bitwise representation can be a valid GC reference. 5. Operations that use a GCREF-typed value are "atomic" in using the bitwise representation, i.e., loading from a GCREF typed value does not "internally" convert the GCREF to a normal integer-pointer and then use the integer-pointer, since that would mean there is a window in which the integer-pointer can become stale[1]. 6. A GCREF stored to a location in the heap continues to fluctuate, and keeps itself in sync with the right bitwise representation. In a way, there isn't a large distinction between the GC and the heap -- the heap is part of (or managed by) the GC. I think (6) is the most controversial of the semantics above, but it isn't very different from how `undef` stored to the heap remains `undef` (i.e. a non-deterministic N-bit value) and a later load can recover `undef` instead of getting a normal N-bit value. ## How do we fare with the examples? Let's take a look some of the examples, and see how the semantics specified above help us: ### Demoting loads / stores from GCREF to integers is invalid (case 1): %v0 = load GCREF, GCREF* %ptr_0 %v1 = load GCREF, GCREF* %ptr_1 %cmp = icmp eq %v0, %v1 ==> %v0 = load i64, i64* %ptr_0 %v1 = load i64, i64* %ptr_1 %cmp = icmp eq %v0, %v1 Invalid because the initial IR is comparing two GC fluctuating references, while the the transformed IR will compare their bitwise snapshot taken at an arbitrary point in time. ### Demoting loads / stores from GCREF to integers is invalid (case 2): %v0 = load GCREF, GCREF* %loc0 store GCREF %v0, GCREF* %loc1 ==> %v0 = load i64, i64* (bitcast %loc0) store i64 %v0, i64* (bitcast %loc1) Invalid because there are two semantic differences between the initial and final IR: 1. In the initial IR we store a fluctuating and up-to-date GCREF, whereas in the final IR we store a potentially stale bitwise snapshot of the value. 2. In the initial IR, after the store *%loc1 has a fluctuating GC reference, while in the final IR *%loc1 only has a once-valid snapshot of a GC reference (that will no longer fluctuate). ### Basic store forwarding is valid: store GCREF %v0, GCREF* %loc0 %v1 = load GCREF, GCREF* %loc0 ==> store GCREF %v0, GCREF* %loc0 %v1 = %v0 The store forwarding is valid since we stored a GCREF that semantically points to object O (say), and loaded back a GCREF pointing to the same object O. ### Hoisting inttoptr is (generally) invalid: if (<condition>) { %val_gc = ptrtoint %val to GCREF } ==> %val_gc = ptrtoint %val to GCREF if (<condition>) {} is invalid since we do not know that `%val` is a valid bitwise representation of a GCREF at the point in time we do the `ptrtoint` (i.e. <<condition>> could be controlling whether `%val` is a valid bitwise representation of a GCREF). In theory this isn't different from loads, divisions or any other potentially trapping operation, but I think a lot of LLVM directly encodes the fact that `CastInst` s are obviously speculatable, so maybe Integer -> GCREF casts should be outlawed at the type system level, and be done via an intrinsic? ### Store forwarding between integers and GCREFs is valid store i64 %val, i64* %loc %val = load GCREF, GCREF* (bitcast %loc) ==> store i64 %val, i64* %loc %val = inttoptr %val to GCREF is valid. Both the initial and the final program assume that %val is a valid bitwise representation of a GC reference. (Note: it may not be valid to speculate the inttoptr instruction above control flow). ### Load forwarding between integers and GCREFs is invalid %val_i = load i64, i64* %loc %val_p = load GCREF, GCREF* (bitcast %loc) ==> %val_i = load i64, i64* %loc %val_p = inttoptr %val_i to GCREF is invalid if *%loc contains a GC reference. Given the model that we have, the first program loads some bitwise representation of fluctuating a GC reference, and then loads the same GC reference as a GCREF; whereas the second program tries to convert a potentially stale bitwise representation of a GC reference into a fluctuating GCREF (which is not allowed). ## Implementation Issues & Impact to LLVM This is a proposed path forward to implementing GCREF in LLVM. The caveat mentioned at the start of the email (everything is open for discussion) especially applies here. ### Changes to IR to represent GCREF We change `datalayout` to denote pointers of a specific address space as GCREFs. We add verifier rules forbidding `inttoptr` conversion of integers into GCREFs and add an intrinsic (that can be control dependent) that converts integers to GCREFs. ### Changes to the optimizer We change the optimizer to - Not speculate loads of GCREF type - Not change uses of GCREF types to uses of non-GCREF types Locally we already have changes fixing most of the places that'd need to be fixed for the above to hold. ### Lowering We do not teach LLC about GCREF's at all (in fact, we can change it to assert that the datalayout does not contain a `gc` directive). We make the RewriteStatepointForGC pass rewrite the IR to make relocations explicit (so that no more GCREF-like semantics are necessary) and remove the GC directive from the Module's datalayout. Thoughts? -- Sanjoy Footnotes: [0]: To see why this is bad assume to begin with %ptr_0 and %ptr_1 both contain a GC reference pointing to object X, currently at address 0x500. Now the following happens %v0 = load i64, i64* %ptr_0 %v0 is 0x500 << safepoint >> GC moves X to 0x600, and update *%ptr_1 %v1 = load i64, i64* %ptr_1 %v1 = 0x600 %cmp = icmp eq %v0, %v1 %cmp = false, when it should be true If %v0 was "reported to the GC" in <<safepoint>> then it would have been updated in place to 0x600, and %cmp would evaluate to true, as expected. [1]: In reality, this is handled by _not_ putting safepoints in problematic locations. For instance, if lowering a CompareAndSwap in some weird architecture would involve doing: // CAS(From, To, Location) Location ^= << Constant >> // The GC cannot parse the state at this location "L", since we've // done some arbitrary arithmetic on Location. ... Do some magic with Location, To, From then we _cannot_ put a safepoint at location "L".
More information about the llvm-dev mailing list
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK