
LLVM from Novice to Giving Up (1): LLVM Overview and Environment Setup - Tu9oh0st

 2 years ago
source link: https://www.cnblogs.com/Tu9oh0st/p/16143810.html


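History of LLVM

  • The LLVM project started in 2000 at the University of Illinois at Urbana-Champaign (UIUC) under the lead of Chris Lattner, and Apple later joined the effort. Its original goal was to develop a virtual system providing an intermediate code representation and compiler infrastructure.
  • The name LLVM originally stood for Low Level Virtual Machine. As the project grew, that expansion stopped fitting, and today LLVM is simply the full name of the project.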

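What is LLVM

  • Broad sense: LLVM is a compiler framework made up of many modules.
  • Narrow sense: LLVM refers specifically to the LLVM Core and Clang subprojects within the LLVM project.
  • Put simply, LLVM can be thought of as a modern, extensible compiler.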

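GCC's compilation flow

  • GCC is divided into three modules: a frontend, an optimizer, and a backend.

[Figure unavailable in this copy: GCC's frontend, optimizer, and backend stages]

  • LLVM is essentially a three-stage design as well:

[Figure unavailable in this copy: LLVM's three-stage pipeline]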

LLVM's compilation flow

  • A concrete example:

[Figure unavailable in this copy]

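Advantages of LLVM over GCC

Advantage 1: Modularity
  • LLVM: LLVM has a highly modular design, and each module can be pulled out of the LLVM project and used on its own.
  • GCC: GCC also compiles in three stages, but its modules are hard to separate and use independently.
Advantage 2: Extensibility
  • LLVM: LLVM gives developers a rich API, backed by thorough documentation. For example, the LLVM Pass framework lets developers step into the optimization of the intermediate code.
  • GCC: GCC is open source, but extending it has a high barrier to entry and is difficult.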

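Summary of the LLVM compilation process

  • For a C/C++ program, the LLVM compilation process is shown in the figure:

[Figure unavailable in this copy]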

Setting up the LLVM environment

Ubuntu/LLVM/CMake versions
  • Ubuntu 20.04
  • LLVM 12.0.1 / 9.0.9svn (ndkr21e)
  • CMake 3.21.1
Step 1: Download the LLVM Core and Clang source code

https://github.com/llvm/llvm-project/releases/tag/llvmorg-12.0.1

clang-12.0.1.src.tar.xz

llvm-12.0.1.src.tar.xz

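Download and extract both archives, rename the extracted directories, and create a build folder in the same directory, as follows:

[Directory listing unavailable in this copy; the build script in step 2 expects the LLVM sources at ../llvm relative to build]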
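The download step can also be scripted. The sketch below is only an illustration: the URLs follow GitHub's usual release-asset pattern for the tag linked above, the directory names llvm and clang are assumptions inferred from the ../llvm path in the build script, and the FETCH variable is a made-up switch so that nothing is downloaded by accident.

```shell
#!/bin/sh
# Sketch only: the directory names "llvm" and "clang" are assumptions
# inferred from the "../llvm" path used by the cmake command in step 2.
VERSION="12.0.1"
BASE_URL="https://github.com/llvm/llvm-project/releases/download/llvmorg-${VERSION}"

# Print the download URL of one source tarball (component: llvm or clang).
tarball_url() {
    printf '%s/%s-%s.src.tar.xz\n' "$BASE_URL" "$1" "$VERSION"
}

# Download, unpack, and rename one component, e.g. llvm-12.0.1.src -> llvm.
fetch_component() {
    archive="$1-${VERSION}.src.tar.xz"
    wget -O "$archive" "$(tarball_url "$1")" &&
        tar -xf "$archive" &&
        mv "$1-${VERSION}.src" "$1"
}

if [ "${FETCH:-0}" = "1" ]; then
    # Real run: needs network access plus wget, tar, and xz.
    fetch_component llvm && fetch_component clang && mkdir -p build
else
    # Default: only show what would be downloaded.
    tarball_url llvm
    tarball_url clang
fi
```

Running it as `FETCH=1 sh fetch.sh` (fetch.sh being whatever name the script is saved under) should leave llvm, clang, and build side by side in the current directory.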

Step 2: Build the LLVM project

In the same folder, create a build.sh file with the following content:

cd build
cmake -G "Unix Makefiles" -DLLVM_ENABLE_PROJECTS="clang" -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86" -DBUILD_SHARED_LIBS=On ../llvm
make
make install

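Explanation of the cmake options:

  • -G "Unix Makefiles": generate Unix Makefiles.
  • -DLLVM_ENABLE_PROJECTS="clang": the subprojects to build in addition to LLVM Core.
  • -DCMAKE_BUILD_TYPE=Release: CMake has four build types: Debug, Release, RelWithDebInfo, and MinSizeRel. A Release build takes far less disk space than a Debug build.
  • -DLLVM_TARGETS_TO_BUILD="X86": the default is all targets; building only X86 saves a lot of compile time.
  • -DBUILD_SHARED_LIBS=On: build the LLVM libraries as shared libraries and link against them dynamically, which saves disk space.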
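As a possible variant of build.sh, the sketch below stops at the first failing step, builds in parallel, and installs into a user-writable prefix so that make install does not need root. INSTALL_PREFIX, JOBS, and DRY_RUN are names invented for this sketch; CMAKE_INSTALL_PREFIX and make -j are standard CMake and make features. By default it only prints the commands.

```shell
#!/bin/sh
# Variant of build.sh. INSTALL_PREFIX, JOBS, and DRY_RUN are assumptions
# made for this sketch; they are not part of the original script.
INSTALL_PREFIX="${INSTALL_PREFIX:-$HOME/llvm-12.0.1}"
JOBS="${JOBS:-$(nproc 2>/dev/null || echo 2)}"

# Echo a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Configure, build, and install; each step runs only if the previous one
# succeeded.
build_llvm() {
    run cd build &&
    run cmake -G "Unix Makefiles" \
        -DLLVM_ENABLE_PROJECTS="clang" \
        -DCMAKE_BUILD_TYPE=Release \
        -DLLVM_TARGETS_TO_BUILD="X86" \
        -DBUILD_SHARED_LIBS=On \
        -DCMAKE_INSTALL_PREFIX="$INSTALL_PREFIX" \
        ../llvm &&
    run make -j"$JOBS" &&
    run make install
}

build_llvm
```

Setting DRY_RUN=0 performs the real build. With a custom prefix, the bin directory under that prefix has to be added to PATH before clang and opt can be found.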

Basic LLVM usage

Step 1: Convert source code to LLVM IR

Save the following program as hello.cpp:

#include <iostream>

using namespace std;

int main() {
    cout << "Hello World!" << endl;

    return 0;
}

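LLVM IR comes in two forms: a human-readable text form with the file suffix .ll, and a binary format (bitcode) that is easier for machines to process, with the file suffix .bc. The following commands convert the source code to LLVM IR: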

clang -S -emit-llvm hello.cpp -o hello.ll

clang -c -emit-llvm hello.cpp -o hello.bc

; ModuleID = 'hello.cpp'
source_filename = "hello.cpp"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%"class.std::ios_base::Init" = type { i8 }
%"class.std::basic_ostream" = type { i32 (...)**, %"class.std::basic_ios" }
%"class.std::basic_ios" = type { %"class.std::ios_base", %"class.std::basic_ostream"*, i8, i8, %"class.std::basic_streambuf"*, %"class.std::ctype"*, %"class.std::num_put"*, %"class.std::num_get"* }
%"class.std::ios_base" = type { i32 (...)**, i64, i64, i32, i32, i32, %"struct.std::ios_base::_Callback_list"*, %"struct.std::ios_base::_Words", [8 x %"struct.std::ios_base::_Words"], i32, %"struct.std::ios_base::_Words"*, %"class.std::locale" }
%"struct.std::ios_base::_Callback_list" = type { %"struct.std::ios_base::_Callback_list"*, void (i32, %"class.std::ios_base"*, i32)*, i32, i32 }
%"struct.std::ios_base::_Words" = type { i8*, i64 }
%"class.std::locale" = type { %"class.std::locale::_Impl"* }
%"class.std::locale::_Impl" = type { i32, %"class.std::locale::facet"**, i64, %"class.std::locale::facet"**, i8** }
%"class.std::locale::facet" = type <{ i32 (...)**, i32, [4 x i8] }>
%"class.std::basic_streambuf" = type { i32 (...)**, i8*, i8*, i8*, i8*, i8*, i8*, %"class.std::locale" }
%"class.std::ctype" = type <{ %"class.std::locale::facet.base", [4 x i8], %struct.__locale_struct*, i8, [7 x i8], i32*, i32*, i16*, i8, [256 x i8], [256 x i8], i8, [6 x i8] }>
%"class.std::locale::facet.base" = type <{ i32 (...)**, i32 }>
%struct.__locale_struct = type { [13 x %struct.__locale_data*], i16*, i32*, i32*, [13 x i8*] }
%struct.__locale_data = type opaque
%"class.std::num_put" = type { %"class.std::locale::facet.base", [4 x i8] }
%"class.std::num_get" = type { %"class.std::locale::facet.base", [4 x i8] }

@_ZStL8__ioinit = internal global %"class.std::ios_base::Init" zeroinitializer, align 1
@__dso_handle = external hidden global i8
@_ZSt4cout = external dso_local global %"class.std::basic_ostream", align 8
@.str = private unnamed_addr constant [13 x i8] c"Hello World!\00", align 1
@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I_hello.cpp, i8* null }]

; Function Attrs: noinline uwtable
define internal void @__cxx_global_var_init() #0 section ".text.startup" {
  call void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"* nonnull dereferenceable(1) @_ZStL8__ioinit)
  %1 = call i32 @__cxa_atexit(void (i8*)* bitcast (void (%"class.std::ios_base::Init"*)* @_ZNSt8ios_base4InitD1Ev to void (i8*)*), i8* getelementptr inbounds (%"class.std::ios_base::Init", %"class.std::ios_base::Init"* @_ZStL8__ioinit, i32 0, i32 0), i8* @__dso_handle) #3
  ret void
}

declare dso_local void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"* nonnull dereferenceable(1)) unnamed_addr #1

; Function Attrs: nounwind
declare dso_local void @_ZNSt8ios_base4InitD1Ev(%"class.std::ios_base::Init"* nonnull dereferenceable(1)) unnamed_addr #2

; Function Attrs: nounwind
declare dso_local i32 @__cxa_atexit(void (i8*)*, i8*, i8*) #3

; Function Attrs: noinline norecurse optnone uwtable mustprogress
define dso_local i32 @main() #4 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  %2 = call nonnull align 8 dereferenceable(8) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* nonnull align 8 dereferenceable(8) @_ZSt4cout, i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i64 0, i64 0))
  %3 = call nonnull align 8 dereferenceable(8) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* nonnull dereferenceable(8) %2, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)
  ret i32 0
}

declare dso_local nonnull align 8 dereferenceable(8) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* nonnull align 8 dereferenceable(8), i8*) #1

declare dso_local nonnull align 8 dereferenceable(8) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* nonnull dereferenceable(8), %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)*) #1

declare dso_local nonnull align 8 dereferenceable(8) %"class.std::basic_ostream"* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_(%"class.std::basic_ostream"* nonnull align 8 dereferenceable(8)) #1

; Function Attrs: noinline uwtable
define internal void @_GLOBAL__sub_I_hello.cpp() #0 section ".text.startup" {
  call void @__cxx_global_var_init()
  ret void
}

attributes #0 = { noinline uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nounwind "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { nounwind }
attributes #4 = { noinline norecurse optnone uwtable mustprogress "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 12.0.1"}
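The two IR forms carry the same information, so one can be converted into the other. A minimal sketch, assuming the llvm-as and llvm-dis tools from the build above are on PATH and that hello.ll is in the current directory; the output name hello_roundtrip.ll is made up for this example.

```shell
#!/bin/sh
# Sketch: convert between the text (.ll) and bitcode (.bc) forms of LLVM IR.
# Assumes llvm-as and llvm-dis are on PATH; hello_roundtrip.ll is an
# illustrative file name.

# llvm-as assembles text IR into bitcode; llvm-dis turns bitcode back
# into text IR.
roundtrip_ir() {
    llvm-as hello.ll -o hello.bc &&
        llvm-dis hello.bc -o hello_roundtrip.ll
}

if command -v llvm-as >/dev/null 2>&1 && [ -f hello.ll ]; then
    roundtrip_ir && echo "wrote hello.bc and hello_roundtrip.ll"
else
    echo "skipped: llvm-as or hello.ll not found"
fi
```

Comparing hello.ll with hello_roundtrip.ll afterwards is a quick way to see that the conversion keeps the module's contents.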
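Step 2: Optimize the LLVM IR

Use the opt tool to optimize the LLVM IR:

opt -load LLVMObfuscator.so -hlw -S hello.ll -o hello_opt.ll

  • -load loads a specific LLVM Pass (or collection of passes), usually a .so file, to use in the optimization. LLVMObfuscator.so here is a user-built pass library, not part of the LLVM distribution.
  • -hlw is a custom option defined by the LLVM Pass; it selects which pass to run.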
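Without a custom pass library at hand, opt can be tried with a pass that ships with LLVM. The sketch below assumes clang and opt 12.x are on PATH and uses mem2reg, which promotes stack slots to SSA registers; the output name hello_mem2reg.ll is made up. One detail worth knowing: IR produced at -O0 marks functions optnone (see attributes #4 in the listing), and most built-in passes skip such functions, so the sketch asks Clang to leave that attribute out.

```shell
#!/bin/sh
# Sketch: run a built-in pass instead of a custom plugin.
# Assumes clang and opt (LLVM 12.x) are on PATH; file names are illustrative.

# -Xclang -disable-O0-optnone keeps Clang from tagging functions optnone,
# so that opt does not skip them. -mem2reg is the legacy pass-name syntax
# accepted by opt in LLVM 12.
run_mem2reg() {
    clang -S -emit-llvm -Xclang -disable-O0-optnone hello.cpp -o hello.ll &&
        opt -S -mem2reg hello.ll -o hello_mem2reg.ll
}

if command -v opt >/dev/null 2>&1 && command -v clang >/dev/null 2>&1 && [ -f hello.cpp ]; then
    run_mem2reg && echo "wrote hello_mem2reg.ll"
else
    echo "skipped: clang, opt, or hello.cpp not found"
fi
```

In the result, the alloca and store at the top of main should be gone, since that stack slot is never read.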
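Step 3: Compile the LLVM IR into an executable

We do this step with Clang. Going from LLVM IR to an executable involves a series of complex steps, and Clang wraps them up for us. Because the IR came from a C++ program, link it with clang++ so that the C++ standard library is pulled in:

clang++ hello_opt.ll -o hello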
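The steps that Clang wraps up can also be run one at a time, which makes the backend stage visible. A rough sketch, assuming llc and clang++ from the build above are on PATH and hello_opt.ll exists; the -relocation-model=pic flag is a precaution for toolchains that link position-independent executables by default and may not be needed everywhere.

```shell
#!/bin/sh
# Sketch: do by hand what the Clang driver does for us.
# Assumes llc and clang++ are on PATH and hello_opt.ll exists.

# Backend: llc lowers LLVM IR to a native object file.
# Link: clang++ drives the system linker and adds the C++ runtime libraries.
build_by_stages() {
    llc -filetype=obj -relocation-model=pic hello_opt.ll -o hello.o &&
        clang++ hello.o -o hello
}

if command -v llc >/dev/null 2>&1 && command -v clang++ >/dev/null 2>&1 && [ -f hello_opt.ll ]; then
    build_by_stages && echo "wrote hello"
else
    echo "skipped: llc, clang++, or hello_opt.ll not found"
fi
```

Passing -filetype=asm instead makes llc emit an assembly file, which is handy for reading what the backend generated.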
