

LLVM internals, part 4: attributes and attribute groups
source link: https://blog.yossarian.net/2021/11/29/LLVM-internals-part-4-attributes-and-attribute-groups
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

attributes and attribute groups
ENOSUCHBLOG
Programming, philosophy, pedaling.
LLVM internals, part 4: attributes and attribute groups
Nov 29, 2021
Series: llvm-internals
Preword
This is the last post I plan to do on parsing LLVM’s bitcode, at least for a while. I’ll keep working on the various libraries that I’ve started under the mollusc umbrella, but they won’t be accompanied by regular posts anymore — I’d like to refocus on writing smaller, less dense (read: lower-pressure) posts for a while.
Also, as a sort of update: all four of the patchsets1 that I mentioned in the previous post have been fully merged into LLVM, including a couple of documentation and bitcode parsing bugfixes. Many thanks to the LLVM maintainers for reviewing and approving my changes!
Intrinsics, attributes, metadata?
LLVM has no less2 than three different ways to represent some of the metadata that gets stored in its intermediate representation of a program: metadata, attributes, and intrinsics. All three are represented differently in the bitcode format and this post will focus only on attributes, but it’s important to understand the semantic difference between the three.
Using this excellent 2016 LLVM developers’ meeting presentation as a reference:
-
Intrinsics are internal functions whose semantics are well-defined by and well-known to LLVM. Common examples of these include the
@llvm.memcpy.*
family for semantics similar to thememcpy(3)
libc
routine, as well as various mathematics intrinsics corresponding roughly tolibm
routines (such as@llvm.ceil
forceil(3)
).Generation of these intrinsics by language frontends (such as
clang
) is vital to LLVM’s ability to perform a wide variety of memory and arithmetic optimizations3, andclang
will not hesitate to produce them even for unoptimized runs:int main(void) { char x[1024]; char y[1024]; if (rand()) { memcpy(x, y, 1024); } else { memcpy(y, x, 1024); } return 0; }
yields:
define dso_local i32 @main() #0 !dbg !9 { %1 = alloca i32, align 4 %2 = alloca [1024 x i8], align 16 %3 = alloca [1024 x i8], align 16 store i32 0, i32* %1, align 4 call void @llvm.dbg.declare(metadata [1024 x i8]* %2, metadata !14, metadata !DIExpression()), !dbg !19 call void @llvm.dbg.declare(metadata [1024 x i8]* %3, metadata !20, metadata !DIExpression()), !dbg !21 %4 = call i32 @rand() #4, !dbg !22 %5 = icmp ne i32 %4, 0, !dbg !22 br i1 %5, label %6, label %9, !dbg !24 6: ; preds = %0 %7 = getelementptr inbounds [1024 x i8], [1024 x i8]* %2, i64 0, i64 0, !dbg !25 %8 = getelementptr inbounds [1024 x i8], [1024 x i8]* %3, i64 0, i64 0, !dbg !25 call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 %7, i8* align 16 %8, i64 1024, i1 false), !dbg !25 br label %12, !dbg !27 9: ; preds = %0 %10 = getelementptr inbounds [1024 x i8], [1024 x i8]* %3, i64 0, i64 0, !dbg !28 %11 = getelementptr inbounds [1024 x i8], [1024 x i8]* %2, i64 0, i64 0, !dbg !28 call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 %10, i8* align 16 %11, i64 1024, i1 false), !dbg !28 br label %12 12: ; preds = %9, %6 %13 = load i32, i32* %1, align 4, !dbg !30 ret i32 %13, !dbg !30 }
(View it on Godbolt.)
Consequently, LLVM’s intrinsics are correctness-bearing in IR programs: it is not safe, in the general case, to remove calls to intrinsic functions without providing an adequate substitute function4.
-
Attributes are markers that define special properties on a small subset of program features: functions (and their callsites), individual parameters, and return values.
In the textual representation of LLVM’s IR, attributes are associated with program features with one of two syntaxes. For parameters and return values, attributes are shown inline at their definition sites. Using a
memcpy
intrinsic variant as an example:declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #3
Here, we have the following parameter attributes:
- Our two
i8*
parameters arenoalias
andnocapture
, meaning that (1) memory accesses through each pointer are exclusive, and (2) that neither pointer is “captured” within the function, e.g. by storing its value to a global or other memory location that outlives the function call. - Our first
i8*
iswriteonly
and our second isreadonly
, which enforce precisely what they say on the tin: that the function may only write or read through these respective pointers. - Finally, our
i1
parameter is marked asimmarg
, meaning that its parameter must be an immediate value (along with some other constraints) — anything that would have to be further loaded or executed to produce a value is invalid.
Similarly for a definition of
malloc
:declare dso_local noalias align 16 i8* @malloc(i64) #2
Here our single parameter has no attributes of its own, but our return value has two:
noalias
andalign 16
.Our second syntax is also visible in the snippets above: the
#N
anchor ties each function (or function callsite) to a list of attributes that apply to the entire function or callsite.For example, the
memcpy
intrinsic is associated with#3
, which is:attributes #3 = { argmemonly nofree nounwind willreturn }
(We’re getting into the weeds now, but a whirlwind tour:
argmemonly
asserts that the function only modifies memory through any pointer arguments it’s given;nofree
asserts that the function does not call afree
-type function on its pointer argument(s);nounwind
asserts that the function never raises an exception;willreturn
asserts that the function either contains UB or returns back to its callsite in some call stack.)Each of these, again, has important effects on the program’s correctness, so naively removing them5 is not safe. They also appear on many callsites, e.g.:
%5 = call i32 @rand() #4, !dbg !24 ; ... snip ... attributes #4 = { nounwind }
- Our two
-
Metadata is a lot of things. Among them:
-
Source-level debugging information that LLVM is responsible for lowering into a format like DWARF or CodeView.
Virtually everything that ends up in one of these binary debugging formats corresponds to one or more metadata entries (indicated textually as
!N
) on LLVM instructions, functions, &c. For example, here’s how LLVM relates an IR-level%struct
back to its original C-level layout with metadata entries:typedef struct { int x; char y; struct bar { long z0; } z; } foo; int main(void) { foo x; foo y; /* ... snip ... */ }
Here, the
x
andy
declarations get@llvm.dbg.declare
intrinsics with associated metadata entries (comments added):%1 = alloca i32, align 4 %2 = alloca %struct.foo, align 8 ; 'foo x' %3 = alloca %struct.foo, align 8 ; 'foo y' store i32 0, i32* %1, align 4 call void @llvm.dbg.declare(metadata %struct.foo* %2, metadata !12, metadata !DIExpression()), !dbg !24 call void @llvm.dbg.declare(metadata %struct.foo* %3, metadata !25, metadata !DIExpression()), !dbg !26
Following
metadata !12
, we can see that it accurately describes bothx
and its type, by way of subsequent metadata entries:!12 = !DILocalVariable(name: "x", scope: !7, file: !8, line: 13, type: !13) !13 = !DIDerivedType(tag: DW_TAG_typedef, name: "foo", file: !8, line: 10, baseType: !14) !14 = distinct !DICompositeType(tag: DW_TAG_structure_type, file: !8, line: 4, size: 128, elements: !15) !15 = !{!16, !17, !19} !16 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !14, file: !8, line: 5, baseType: !11, size: 32) !17 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !14, file: !8, line: 6, baseType: !18, size: 8, offset: 32) !18 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char) !19 = !DIDerivedType(tag: DW_TAG_member, name: "z", scope: !14, file: !8, line: 9, baseType: !20, size: 64, offset: 64) !20 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "bar", file: !8, line: 7, size: 64, elements: !21) !21 = !{!22} !22 = !DIDerivedType(tag: DW_TAG_member, name: "z0", scope: !20, file: !8, line: 8, baseType: !23, size: 64) !23 = !DIBasicType(name: "long int", size: 64, encoding: DW_ATE_signed)
That’s a lot of metadata!
-
Optional optimization assistance whose presence or absence does not affect program correctness. When present this metadata can be used to progressively enhance some optimizations. For example, the
!range
metadatum can be attached to aload
,call
, orinvoke
instruction to indicate that the value produced by the instruction can only be in one of the associated ranges.For example, adapted from the LLVM LangRef:
; `%a` can only be {0, 1, 3, 4} %a = load i8, i8* %x, align 1, !range !0 ; ... snip ... ; valid ranges: [0, 2), [3, 5) !0 = !{ i8 0, i8 2, i8 3, i8 5 }
Another great example is
!callees
, which helps LLVM perform indirect call promotion (which, in turn, decreases pressure on the indirect branch predictor at runtime).
-
So, to wrap things up: a correct bitcode parser needs to accurately handle intrinsics and attributes, while metadata can be more or less ignored until a brave masochistic soul feels the needs it for their own purposes.
Let’s get to it.
Parsing attributes from the bitcode
For reasons that are unclear to me, LLVM describes all attributes as “parameter attributes”
at the bitcode/bitstream level, even when said attributes refer to entire functions or function
return values. Similarly confusingly, LLVM splits said “parameter attributes” into two separate
bitstream level blocks: PARAMATTR_BLOCK
and PARAMATTR_GROUP_BLOCK
6.
The former references the latter (by way of indices), so the latter needs to be parsed first. As such, we’ll start with it.
PARAMATTR_GROUP_BLOCK
Here’s what LLVM’s bitcode docs have to say about this block:
The
PARAMATTR_GROUP_BLOCK
block (id 10) contains a table of entries describing the attribute groups present in the module. These entries can be referenced withinPARAMATTR_CODE_ENTRY
entries.
The PARAMATTR_GROUP_BLOCK
block can only contain one kind of record: PARAMATTR_GRP_CODE_ENTRY
.
Each PARAMATTR_GRP_CODE_ENTRY
record looks like this:
[ENTRY, grpid, paramidx, attr0, attr1, ...]
…where grpid
is a unique numeric identifier for this group of attributes,
and paramidx
identifies one of the following:
0
: This group contains attributes for the return value (i.e., call-site?7) of the function that references it (bygrpid
).0xFFFFFFFF
: This group contains attributes for the function itself.[1, 0xFFFFFFFF)
: This group contains attributes for theN
th function parameter, starting at 1.
We can represent this with a tidy enum
and a total mapping from u32
:
#[derive(Clone, Copy, Debug)]
pub enum AttributeGroupDisposition {
Return,
Parameter(u32),
Function,
}
impl From<u32> for AttributeGroupDisposition {
fn from(value: u32) -> Self {
match value {
u32::MAX => Self::Function,
0 => Self::Return,
_ => Self::Parameter(value),
}
}
}
(Docs link.)
Each attrN
has, in turn, even more internal structure:
{ kind, key [, ...], [value [, ...]] }
…where kind
indicates the layout of key
and value
:
0
:key
is an integer indicating a “well-known” attribute, such asnoalias
(9
). These are sometimes referred to as “enum attributes” in the LLVM source code. These attributes have no associated value — all necessary information is wholly present in the key itself.1
:key
is an integer indicating a “well-known” attribute, andvalue
is also an integer, representing a value associated with the attribute. For example,dereferenceable(<n>)
(41
) takes<n>
via thevalue
field, which in turn represents the address space that thedereferenceable
attribute applies to.-
3
:key
is a null-terminated string, indicating a not well-known “string” attribute. This string can be free-form and isn’t accompanied by a value. The absence of well-known string attributes means that there’s no single source of truth for values that are understood by various parts of LLVM. That being said, passing-fsplit-stack
onclang
13 seems to reliably introduce the"split-stack"
string attribute to any function that isn’t marked with__attribute__((no_split_stack))
:define dso_local i32 @main() #2 !dbg !9 { ; ... snip ... } ; ... snip ... attributes #2 = { noinline nounwind optnone uwtable "split-stack" }
(View it on Godbolt.)
-
4
:key
is a null-terminated string, andvalue
is also a null-terminated string. Like with the key-only variant, neither string is well-known and both are free-form. These are comparatively common, and we can see LLVM translate C-level__attribute__
s to them:__attribute__((__target__("sse4.2"))) int add1(int x, int y); __attribute__((__target__("sse4.1"))) int add2(int x, int y);
These become (simplified):
attributes #0 = { "target-features"="+cx8,+fxsr,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87" } attributes #2 = { "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" }
(View it on Godbolt.)
Observe that
#2
doesn’t includesse4.2
in itstarget-features
value, since we’ve capped it atsse4.1
.
The astute will notice that kind=2
isn’t specified. Why? Beats me8!
Once again, we can model this with a relatively tidy enum
:
// Each variant's value is exhaustively enumerated in turn, where possible.
// For example, `EnumAttribute` is not just a `u32` newtype but an exhaustive
// `enum` of all currently known "enum"-kinded LLVM attributes.
pub enum Attribute {
Enum(EnumAttribute),
Int(IntAttribute),
Str(String),
StrKeyValue(String, String),
}
(Docs link.)
…and to tie it all together, our model for each “attribute group” in PARAMATTR_GROUP_BLOCK
:
#[derive(Clone, Debug)]
pub struct AttributeGroup {
disposition: AttributeGroupDisposition,
attributes: Vec<Attribute>,
}
(Docs link.)
And our parsing procedure PARAMATTR_GROUP_BLOCK
:
- Create an initial mapping of
grpid -> AttributeGroup
- For each
PARAMATTR_GRP_CODE_ENTRY
inPARAMATTR_GROUP_BLOCK
:- Extract the
grpid
andparamidx
fields, which are always present and are always the first two in the record. Convert theparamidx
into its correspondingAttributeGroupDisposition
. - Initialize
fieldidx
as2
, indicating that we’ve already consumedgrpid
andparamidx
. - Parse
Attribute
s from the record using thekind
rules above, starting atfieldidx
and increasingfieldidx
by the number of record fields consumed at each parse step. Complete whenfieldidx == record.fields().len()
. - Construct a new
AttributeGroup
whose disposition and attributes are those just parsed. - Add the new
grpid
andAttributeGroup
to the mapping
- Extract the
…and here’s what that looks like, in Rust:
impl IrBlock for AttributeGroups {
const BLOCK_ID: IrBlockId = IrBlockId::ParamAttrGroup;
fn try_map_inner(block: &UnrolledBlock, _ctx: &mut MapCtx) -> Result<Self, BlockMapError> {
let mut groups = HashMap::new();
for record in block.all_records() {
let code = AttributeCode::try_from(record.code()).map_err(AttributeError::from)?;
if code != AttributeCode::GroupCodeEntry {
return Err(AttributeError::WrongBlock(code).into());
}
if record.fields().len() < 3 {
return Err(RecordMapError::BadRecordLayout(format!(
"too few fields in {:?}, expected {} >= 3",
code,
record.fields().len()
))
.into());
}
let group_id = record.fields()[0] as u32;
let disposition: AttributeGroupDisposition = (record.fields()[1] as u32).into();
let mut fieldidx = 2;
let mut attributes = vec![];
while fieldidx < record.fields().len() {
let (count, attr) = Attribute::from_record(fieldidx, record)?;
attributes.push(attr);
fieldidx += count;
}
if fieldidx != record.fields().len() {
return Err(RecordMapError::BadRecordLayout(format!(
"under/overconsumed fields in attribute group record ({} fields, {} consumed)",
fieldidx,
record.fields().len(),
))
.into());
}
groups.insert(
group_id,
AttributeGroup {
disposition,
attributes,
},
);
}
Ok(AttributeGroups(groups))
}
}
That leaves us with the final product of mapping the PARAMATTR_GROUP_BLOCK
block: a
mapping of grpid -> AttributeGroup
. Let’s see how the PARAMATTR_BLOCK
uses this mapping.
PARAMATTR_BLOCK
The other half of the attributes equation is the PARAMATTR_BLOCK
block, which is
documented by LLVM as follows:
The
PARAMATTR_BLOCK
block (id9
) contains a table of entries describing the attributes of function parameters. These entries are referenced by 1-based index in theparamattr
field of module blockFUNCTION
records, or within theattr
field of function blockINST_INVOKE
andINST_CALL
records.Entries within
PARAMATTR_BLOCK
are constructed to ensure that each is unique (i.e., no two indices represent equivalent attribute lists).
There are two valid record codes in the PARAMATTR_BLOCK
:
-
PARAMATTR_CODE_ENTRY_OLD
: This is a legacy record, emitted by LLVM 3.2 and earlier. We don’t bother handling it, since modern versions of LLVM won’t emit it at all and 3.2 is nearly a decade old at this point. However, for completeness, it looks like this:[ENTRY, paramidx0, attr0, paramidx1, attr1...]
Look familiar? That’s because it’s nearly identical to the
PARAMATTR_GRP_CODE_ENTRY
record format that we just comprehensively walked through: the only difference is the absence of the “group” concept. LLVM (presumably) deprecated this format when someone realized that many parameters and whole functions share entire groups of attributes, allowing for deduplication by referencing the groups rather than listing all attributes every time. -
PARAMATTR_CODE_ENTRY
: This is the referee record mentioned in thePARAMATTR_GROUP_BLOCK
documentation. EachPARAMATTR_CODE_ENTRY
looks like this:[ENTRY, attrgrp0, attrgrp1, ...]
…where
attrgrpN
is a attribute group index. And the circle is complete: we use our mapping of thePARAMATTR_GROUP_BLOCK
to fetch each list of attributes corresponding to each group index, and append them all together into one big list of eachPARAMATTR_CODE_ENTRY
. These lists, in turn, are referenced by 1-based indices in other blocks and records throughout the bitcode representation.That ends up being nice and simple:
impl IrBlock for Attributes { const BLOCK_ID: IrBlockId = IrBlockId::ParamAttr; fn try_map_inner(block: &UnrolledBlock, ctx: &mut MapCtx) -> Result<Self, BlockMapError> { let mut entries = vec![]; for record in block.all_records() { let code = AttributeCode::try_from(record.code()).map_err(AttributeError::from)?; match code { AttributeCode::Entry => { let mut groups = vec![]; for group_id in record.fields() { let group_id = *group_id as u32; log::debug!("group id: {}", group_id); groups.push( ctx.attribute_groups()? .get(group_id) .ok_or(AttributeError::BadAttributeGroup(group_id))? .clone(), ); } entries.push(AttributeEntry(groups)); } AttributeCode::GroupCodeEntry => { // This is a valid attribute code, but it isn't valid in this block. return Err(AttributeError::WrongBlock(code).into()); } _ => { return Err(BlockMapError::Unsupported(format!( "unsupported attribute block code: {:?}", code, ))) } } } Ok(Attributes(entries)) } }
Takeaways
Parsing the blocks responsible for LLVM’s attributes was moderately troublesome: nowhere nearly
as annoying as the type table9, but not as easy as the identification block or the string
table. All told, the current implementation requires
slightly under 900 lines of code, much of which
is documentation and enum
variants.
The end result of it all can be seen with in the debug logs of the
unroll-bitstream
example provided by the llvm-mapper
crate:
RUST_LOG=debug ./target/debug/examples/unroll-bitstream some-input.bc
…which, amidst a great deal of other output, should yield some messages like this (formatted for readability):
[2021-11-29T06:54:51Z DEBUG llvm_mapper::block::module] attributes:
Some(Attributes([
AttributeEntry(
[
AttributeGroup {
disposition: Function,
attributes: [
Enum(NoInline),
Enum(NoUnwind),
Enum(OptimizeNone),
Enum(UwTable),
StrKeyValue("correctly-rounded-divide-sqrt-fp-math", "false"),
StrKeyValue("disable-tail-calls", "false"),
StrKeyValue("frame-pointer", "all"),
StrKeyValue("less-precise-fpmad", "false"),
StrKeyValue("min-legal-vector-width", "0"),
StrKeyValue("no-infs-fp-math", "false"),
StrKeyValue("no-jump-tables", "false"),
StrKeyValue("no-nans-fp-math", "false"),
StrKeyValue("no-signed-zeros-fp-math", "false"),
StrKeyValue("no-trapping-math", "false"),
StrKeyValue("stack-protector-buffer-size", "8"),
StrKeyValue("target-cpu", "x86-64"),
StrKeyValue("target-features", "+cx8,+fxsr,+mmx,+sse,+sse2,+x87"),
StrKeyValue("unsafe-fp-math", "false"),
StrKeyValue("use-soft-float", "false")
]
}
]
)
]))
This information isn’t exposed anywhere in mapped LLVM modules, yet: it’s kept purely as
state within the MapCtx
.
Future mapping work (e.g., for IR-level functions, blocks, instructions, &c) will
access that state to correctly associate themselves with their attributes.
As I’ve said in previous posts: mollusc still has no particular end goal or state in mind, other than my broad goal of being able to perform some amount of static analysis of LLVM IR in pure Rust. The beatings development will continue until the masochism curiosity abates.
-
And possibly more; it’s a big project. In particular, I have no clue how “operand bundles” work or what they do. ↩
-
Why does LLVM need these intrinsics, rather than unconditionally inserting a call to e.g. an extremely optimized
memcpy(3)
implementation? Because even that sometimes isn’t enough: sometimes you want to inline the call entirely, or to call a slightly different optimizedmemcpy
implementation. Or for even simpler reasons: having an intrinsic here gives LLVM a fuller picture of the program than an external call, even a well understood one, would normally allow. ↩ -
There are, however, some intrinsics than can be safely deleted without compromising program correctness. The
@llvm.dbg.*
family is an example of this, but the intricacies of how these intrinsics interact with LLVM’s metadata facilities are well outside the scope of this post. Perhaps another time. ↩ -
Or failing to interpret them during code generation, or optimization, &c. ↩
-
This part would be less confusing if the
PARAMATTR_BLOCK
andPARAMATTR_GROUP_BLOCK
blocks didn’t share a record code namespace. But they do, so it’s not clear why they’re separated at all, especially when they have a tight dependency relationship. ↩ -
I think? ↩
-
LLVM’s
BitcodeReader
doesn’t mentionkind=2
at all; see here. It does however mentionkind=5
andkind=6
, despite these not being documented. I haven’t seen these in any bitcode samples yet so I haven’t bothered digging into them, but they’re apparently “type” attribute formats. ↩ -
Which had to be rewritten entirely after I published the last post in this series, due to unavoidable lifetime issues with the previous approach. ↩
Discussions: Reddit
Recommend
-
29
This is the fourth part of the .NET Internals Cookbook series. For your convenience you can find other parts in the table of contents in
-
16
Using PowerShell To Add Groups To "AcceptMessagesOnlyFromDLMembers" Exchange Attribute Here’s a bit of an obscure task. In Exchange you can configure the AcceptMessagesOnlyFromDLMembers attribute which does what...
-
4
Bitcoin Internals, Part 2 Published
-
10
Tweaking LLVM Obfuscator + quick look into some of LLVM internals.16-May-2015: Tweaking LLVM Obfuscator + quick look into some of LLVM internals. Code obfuscating is a good example of
-
7
LLVM internals, part 2ENOSUCHBLOG Programming, philosophy, pedaling. LLVM internals, part 2: parsing the bitst...
-
5
LLVM internals, part 1ENOSUCHBLOG Programming, philosophy, pedaling. LLVM internals, part 1: the bitcode format
-
8
LLVM internals, part 3ENOSUCHBLOG Programming, philosophy, pedaling. LLVM internals, part 3: from bitcode to IR
-
4
Caleb Sander Posted on Nov 27
-
5
Chrome Browser Exploitation, Part 1: Introduction to V8 and JavaScript InternalsJack HalonI like to break into things; both physically and virtually.Web browsers, our...
-
6
Modeling Git Internals in Alloy, Part 1: Blobs and TreesBrian Hicks, April 3, 2023Today we're going to learn some Alloy by modeling
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK