In "Paper Reading: ThinLTO: Scalable and Incremental LTO" we introduced the main ideas of the ThinLTO paper. Here we look at how LLVM ThinLTO is actually implemented. This post is divided into the following parts:
- What does an LLVM ThinLTO object contain?
- How does LLVM ThinLTO perform its optimizations?
- Which optimizations can LLVM ThinLTO enable?
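What do LLVM ThinLTO objects contain?
We keep using the example from "Example of link time optimization". In "LLVM full LTO study notes" we took the magic number as the entry point for a brief analysis of the full LTO process. We follow the same route here.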
$ clang -flto=thin -c a.c -o a_lto.o
$ clang -flto=thin -c main.c -o main_lto.o
$ hexdump a_lto.o | head
0000000 4342 dec0 1435 0000 0005 0000 0c62 2430
0000010 594d 66be fb8d 4fb4 c81b 4424 3201 0005
0000020 0c21 0000 0266 0000 020b 0021 0002 0000
0000030 0016 0000 8107 9123 c841 4904 1006 3932
0000040 0192 0c84 0525 1908 041e 628b 1080 0245
0000050 9242 420b 1084 1432 0838 4b18 320a 8842
0000060 7048 21c4 4423 8712 108c 9241 6402 08c8
0000070 14b1 4320 8846 c920 3201 8442 2a18 2a28
0000080 3190 b07c 915c c420 00c8 0000 2089 0000
0000090 000e 0000 2232 0908 6220 0046 2b21 9824

The magic number is 4342 dec0 (hexdump prints 16-bit little-endian words, so the bytes on disk are 42 43 c0 de, i.e. 'BC' followed by 0xC0DE). This shows that ThinLTO objects are still bitcode files. The ThinLTO documentation already describes this in detail:

"In ThinLTO mode, as with regular LTO, clang emits LLVM bitcode after the compile phase. The ThinLTO bitcode is augmented with a compact summary of the module. During the link step, only the summaries are read and merged into a combined summary index, which includes an index of function locations for later cross-module function importing. Fast and efficient whole-program analysis is then performed on the combined summary index."

Running llvm-dis a_lto.o produces readable IR. Comparing it with the IR from full LTO shows that the two differ very little, mainly in the summary section at the end. The ThinLTO and full LTO versions of a_lto.o are compared in the two listings that follow.
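The magic-number check described above can be sketched in a few lines of Python. This is only an illustration: the helper names `hexdump_words_to_bytes` and `is_llvm_bitcode` are hypothetical, and the sketch assumes the hexdump came from a little-endian host and that the file is raw bitcode rather than a wrapped or native object.

```python
import struct

# Raw LLVM bitcode starts with the bytes 'B', 'C', 0xC0, 0xDE.
BITCODE_MAGIC = b"BC\xc0\xde"


def hexdump_words_to_bytes(words: str) -> bytes:
    """Convert words printed by plain `hexdump` back to raw bytes.

    Plain `hexdump` prints 16-bit words in host byte order; on a
    little-endian host the word "4342" is byte 0x42 followed by 0x43.
    """
    return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())


def is_llvm_bitcode(data: bytes) -> bool:
    """Return True if `data` begins with the raw bitcode magic number."""
    return data.startswith(BITCODE_MAGIC)


# First four words of the hexdump shown above.
header = hexdump_words_to_bytes("4342 dec0 1435 0000")
print(header[:4], is_llvm_bitcode(header))
```

To check a real file, the same function can be applied to the first four bytes read with `open(path, "rb").read(4)`.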
// ---------------- Thin LTO ----------------//
!llvm.module.flags = !{!0, !1, !2, !3}
!llvm.ident = !{!4}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{i32 1, !"EnableSplitLTOUnit", i32 0}
!4 = !{!"clang version 14.0.0 (https://github.com/llvm/llvm-project.git 58e7bf78a3ef724b70304912fb3bb66af8c4a10c)"}

^0 = module: (path: "a_lto.o", hash: (3489747275, 1762444854, 1461358598, 2667786215, 1835806708))
^1 = gv: (name: "foo2", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), refs: (writeonly ^2)))) ; guid = 2494702099028631698
^2 = gv: (name: "i", summaries: (variable: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 1, canAutoHide: 0), varFlags: (readonly: 1, writeonly: 1, constant: 0)))) ; guid = 2708120569957007488
^3 = gv: (name: "foo1", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 13, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^5)), refs: (readonly ^2)))) ; guid = 7682762345278052905
^4 = gv: (name: "foo4") ; guid = 11564431941544006930
^5 = gv: (name: "foo3", summaries: (function: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^4))))) ; guid = 17367728344439303071
^6 = blockcount: 5

// ---------------- Full LTO ----------------//
!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{i32 1, !"ThinLTO", i32 0}
!4 = !{i32 1, !"EnableSplitLTOUnit", i32 1}
!5 = !{!"clang version 14.0.0 (https://github.com/llvm/llvm-project.git 58e7bf78a3ef724b70304912fb3bb66af8c4a10c)"}

^0 = module: (path: "a_lto.o", hash: (0, 0, 0, 0, 0))
^1 = gv: (name: "foo2", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 1, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), refs: (^2)))) ; guid = 2494702099028631698
^2 = gv: (name: "i", summaries: (variable: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 1, live: 0, dsoLocal: 1, canAutoHide: 0), varFlags: (readonly: 1, writeonly: 1, constant: 0)))) ; guid = 2708120569957007488
^3 = gv: (name: "foo1", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 1, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 13, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^5)), refs: (^2)))) ; guid = 7682762345278052905
^4 = gv: (name: "foo4") ; guid = 11564431941544006930
^5 = gv: (name: "foo3", summaries: (function: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 1, live: 0, dsoLocal: 1, canAutoHide: 0), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^4))))) ; guid = 17367728344439303071
^6 = flags: 8
^7 = blockcount: 5

The key differences are highlighted below:
| Difference | Thin LTO | Full LTO |
| --- | --- | --- |
| Module Flags | (no ThinLTO flag) | !3 = !{i32 1, !"ThinLTO", i32 0} |
| Global Value Summary module ^0 | ^0 = module: (path: "a_lto.o", hash: (3489747275, 1762444854, 1461358598, 2667786215, 1835806708)) | ^0 = module: (path: "a_lto.o", hash: (0, 0, 0, 0, 0)) |
| Global Value Summary foo2 ^1 | notEligibleToImport: 0; refs: (writeonly ^2) | notEligibleToImport: 1; refs: (^2) |
| Global Value Summary i ^2 | notEligibleToImport: 0 | notEligibleToImport: 1 |
| Global Value Summary foo1 ^3 | notEligibleToImport: 0; refs: (readonly ^2) | notEligibleToImport: 1; refs: (^2) |
| Global Value Summary foo3 ^5 | notEligibleToImport: 0 | notEligibleToImport: 1 |
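The notEligibleToImport column of this comparison can be extracted mechanically from llvm-dis output. The following is a minimal sketch, assuming the summary entries use the `^N = gv: (name: "...", ...)` textual form with one entry per line; the sample strings are abbreviated from the listings above, and the function name `import_eligibility` is hypothetical.

```python
import re

# Abbreviated summary entries (most fields dropped for brevity).
THIN = '''
^1 = gv: (name: "foo2", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0), insts: 2, refs: (writeonly ^2))))
^2 = gv: (name: "i", summaries: (variable: (module: ^0, flags: (linkage: internal, notEligibleToImport: 0, live: 0))))
^4 = gv: (name: "foo4")
'''
FULL = '''
^1 = gv: (name: "foo2", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0), insts: 2, refs: (^2))))
^2 = gv: (name: "i", summaries: (variable: (module: ^0, flags: (linkage: internal, notEligibleToImport: 1, live: 0))))
^4 = gv: (name: "foo4")
'''

# '.' does not match newlines, so each match stays within one entry line.
ENTRY = re.compile(r'\^\d+ = gv: \(name: "([^"]+)".*?notEligibleToImport: (\d)')


def import_eligibility(summary_text):
    """Map each summarized symbol to True if it is eligible for importing."""
    return {name: flag == "0" for name, flag in ENTRY.findall(summary_text)}


thin = import_eligibility(THIN)
full = import_eligibility(FULL)
for name in thin:
    print(name, "thin:", thin[name], "full:", full[name])
```

Entries without a summary, such as foo4 (declared but not defined in this module), carry no flags and are skipped by the pattern.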
From the Metadata section of the LLVM Language Reference we know that "!" introduces metadata and "^" introduces a global value summary entry:

"All metadata are identified in syntax by an exclamation point ('!')."

"Compiling with ThinLTO causes the building of a compact summary of the module that is emitted into the bitcode. The summary is emitted into the LLVM assembly and identified in syntax by a caret ('^')."

The Module Flags Metadata section explains !3 = !{i32 1, !"ThinLTO", i32 0}. Module flags metadata is a set of triplets:
- The first element is a behavior flag, which specifies the behavior when two (or more) modules are merged together.
- The second element is a metadata string that is a unique ID for the metadata.
- The third element is the value of the flag.

In !3 = !{i32 1, !"ThinLTO", i32 0}, the ThinLTO value of 0 marks the module as not ThinLTO. A second indicator of ThinLTO versus full LTO is the ID of the summary block: the default, GLOBALVAL_SUMMARY_BLOCK, means ThinLTO.
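To make the triplet structure concrete, here is a small Python sketch that parses simple i32 module flags out of textual IR. It is an illustration only: the names `parse_module_flags` and `flag_says_thin_lto` are hypothetical, the regular expression handles only the `!N = !{i32 B, !"name", i32 V}` shape, and the behavior names are the codes 1 to 7 as listed in the LLVM Language Reference.

```python
import re

# Behavior codes 1-7 from the Module Flags Metadata documentation.
BEHAVIORS = {
    1: "Error", 2: "Warning", 3: "Require", 4: "Override",
    5: "Append", 6: "AppendUnique", 7: "Max",
}

# Module flags of the full LTO object shown earlier.
FULL_LTO_FLAGS = '''
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{i32 1, !"ThinLTO", i32 0}
!4 = !{i32 1, !"EnableSplitLTOUnit", i32 1}
'''

FLAG = re.compile(r'!\d+ = !\{i32 (\d+), !"([^"]+)", i32 (\d+)\}')


def parse_module_flags(ir_text):
    """Return {flag_name: (behavior_name, value)} for simple i32 flags."""
    flags = {}
    for behavior, name, value in FLAG.findall(ir_text):
        flags[name] = (BEHAVIORS.get(int(behavior), "unknown"), int(value))
    return flags


def flag_says_thin_lto(flags):
    """A missing ThinLTO flag means thin; an explicit 0 means full LTO."""
    _, value = flags.get("ThinLTO", ("Error", 1))
    return value != 0


flags = parse_module_flags(FULL_LTO_FLAGS)
print(flags["ThinLTO"], flag_says_thin_lto(flags))
```

The second helper only looks at the flag; it assumes the module carries a summary in the first place.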
$ llvm-bcanalyzer -dump a_full_lto.o
  Block ID #24 (FULL_LTO_GLOBALVAL_SUMMARY_BLOCK):
      Num Instances: 1
         Total Size: 789b/98.62B/24W
    Percent of file: 3.4924%
      Num SubBlocks: 0
        Num Abbrevs: 6
        Num Records: 7
    Percent Abbrevs: 57.1429%

    Record Histogram:
      Count    # Bits     b/Rec   % Abv  Record Kind
          3       218      72.7  100.00  PERMODULE
          1        22                    BLOCK_COUNT
          1        22                    FLAGS
          1        22                    VERSION
          1        38            100.00  PERMODULE_GLOBALVAR_INIT_REFS

$ llvm-bcanalyzer -dump a_thin_lto.o
  Block ID #20 (GLOBALVAL_SUMMARY_BLOCK):
      Num Instances: 1
         Total Size: 789b/98.62B/24W
    Percent of file: 3.4727%
      Num SubBlocks: 0
        Num Abbrevs: 6
        Num Records: 7
    Percent Abbrevs: 57.1429%

    Record Histogram:
      Count    # Bits     b/Rec   % Abv  Record Kind
          3       218      72.7  100.00  PERMODULE
          1        22                    BLOCK_COUNT
          1        22                    FLAGS
          1        22                    VERSION
          1        38            100.00  PERMODULE_GLOBALVAR_INIT_REFS

When a global value summary is present, the module is treated as ThinLTO by default, unless the ThinLTO module metadata flag is 0.
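The two dumps can also be told apart mechanically by looking at which summary block appears. The sketch below assumes the `Block ID #N (NAME):` line format shown in the llvm-bcanalyzer output above; the function name `summary_kind` is hypothetical.

```python
import re

BLOCK = re.compile(r"Block ID #\d+ \((\w+)\)")


def summary_kind(bcanalyzer_output):
    """Return 'thin', 'full', or None depending on the summary block found."""
    for name in BLOCK.findall(bcanalyzer_output):
        if name == "GLOBALVAL_SUMMARY_BLOCK":
            return "thin"
        if name == "FULL_LTO_GLOBALVAL_SUMMARY_BLOCK":
            return "full"
    return None  # no per-module summary block in this output


print(summary_kind("Block ID #20 (GLOBALVAL_SUMMARY_BLOCK):"))
print(summary_kind("Block ID #24 (FULL_LTO_GLOBALVAL_SUMMARY_BLOCK):"))
```

In practice the input string would be the captured standard output of `llvm-bcanalyzer -dump` for the object file in question.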
/// Emit the per-module summary section alongside the rest of
/// the module's bitcode.
void ModuleBitcodeWriterBase::writePerModuleGlobalValueSummary() {
  // By default we compile with ThinLTO if the module has a summary, but the
  // client can request full LTO with a module flag.
  bool IsThinLTO = true;
  if (auto *MD =
          mdconst::extract_or_null<ConstantInt>(M.getModuleFlag("ThinLTO")))
    IsThinLTO = MD->getZExtValue();
  Stream.EnterSubblock(IsThinLTO ? bitc::GLOBALVAL_SUMMARY_BLOCK_ID
                                 : bitc::FULL_LTO_GLOBALVAL_SUMMARY_BLOCK_ID,
                       4);
  // ...
}

RFC
https://lists.llvm.org/pipermail/llvm-dev/2015-May/085526.html
https://sites.google.com/site/llvmthinlto/
Patches
https://reviews.llvm.org/D13107?id=35761
Function Importer
https://reviews.llvm.org/D14914
https://reviews.llvm.org/D18343
Related to llvm-opt2/llvm-opt
Discussion of SyntheticCount
https://lists.llvm.org/pipermail/llvm-dev/2017-December/119701.html
https://reviews.llvm.org/D43521?id=135117#inline-388028
/// Compute synthetic function entry counts.
void computeSyntheticCounts(ModuleSummaryIndex &Index);

Related terminology
- BFI: block frequency information
- BPI: branch probability information
- CGSCC: call graph SCC analysis
https://lists.llvm.org/pipermail/llvm-dev/2016-June/100792.html