mirror of
https://github.com/Gericom/teak-llvm.git
synced 2025-06-23 05:25:50 -04:00

BTF is the debug format for BPF, a kernel virtual machine that is widely used for tracing, networking, and security ([1]). Currently only instruction streams are passed to the kernel, and the kernel verifier verifies them before execution. To give user space tools better visibility into bpf programs, it is desirable for the kernel to hold some debug information, e.g., function names and debug line information, so that tools can retrieve it and better annotate jited instructions for performance analysis or other purposes.

DWARF is too complicated for the kernel and for BPF. Hence, BTF is designed to be the debug format for BPF ([2]). Right now, pahole supports BTF for types, which are generated from the dwarf sections in the ELF file.

In order to annotate performance metrics for jited bpf insns, it is necessary to pass debug line info to the kernel. Furthermore, we want to pass the actual source code to the kernel for the following reasons:
  . A bpf program is typically small, so the storage overhead should be small.
  . In bpf land, it is entirely possible for an application to load a bpf program into the kernel and then quit, so having the user space application hold the debug info is not practical.
  . Having the source code kept directly by the kernel eases deployment, since the original source code does not need to ship on every host and the kernel-devel package does not need to be deployed even if kernel headers are used.

The only reliable time to get the source code is at compilation time. This results in both more accurate information and easier deployment, as stated above.

Another consideration is JIT. Projects like bcc use MCJIT to compile a C program into bpf insns and load them into the kernel ([3]). The generated BTF sections will be readily available in such cases as well.

This patch implements generation of BTF info in the llvm compiler. The BTF-related sections are generated when both -target bpf and -g are specified.
Two sections are generated: .BTF contains all the type and string information, and .BTF.ext contains the func_info and line_info. The separation reflects how the two sections are used differently by a bpf loader, e.g., linux libbpf ([4]). The .BTF section can be loaded into the kernel directly, while .BTF.ext needs loader manipulation before it is loaded into the kernel.

The format of each section is roughly defined in llvm:include/llvm/MC/MCBTFContext.h and by the implementation in llvm:lib/MC/MCBTFContext.cpp. A later example also shows the contents of each section.

The type and func_info are gathered during CodeGen/AsmPrinter by traversing dwarf debug_info. The line_info is gathered in MCObjectStreamer before writing to the object file. After all the information is gathered, the two sections are emitted in MCObjectStreamer::finishImpl.

With cmake CMAKE_BUILD_TYPE=Debug, the compiler can dump all the tables except the insn offsets, which are resolved later as relocation records. The debug type "btf" is used for the BTFContext dump.

The dwarf tests check debug info generation by using llvm-dwarfdump to decode the binary sections and verify that the result is as expected. There is no such tool for BTF yet. We will implement btf dump functionality in bpftool ([5]), as bpftool is considered the recommended tool for bpf introspection.

The implementation for type and func_info is tested with linux kernel test cases. The line_info is visually checked against the dump from linux kernel libbpf ([4]) and against the raw section data dumped by readelf.

Note that the .BTF and .BTF.ext information is not emitted to assembly code, and there is no assembler support for BTF either.

Below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug, the contents of each table are shown for a simple C program.
  -bash-4.2$ cat -n test.c
       1  struct A {
       2    int a;
       3    char b;
       4  };
       5
       6  int test(struct A *t) {
       7    return t->a;
       8  }
  -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c
  Type Table:
  [1] FUNC name_off=1 info=0x0c000001 size/type=2
        param_type=3
  [2] INT name_off=12 info=0x01000000 size/type=4
        desc=0x01000020
  [3] PTR name_off=0 info=0x02000000 size/type=4
  [4] STRUCT name_off=16 info=0x04000002 size/type=8
        name_off=18 type=2 bit_offset=0
        name_off=20 type=5 bit_offset=32
  [5] INT name_off=22 info=0x01000000 size/type=1
        desc=0x02000008
  String Table:
  0 :
  1 : test
  6 : .text
  12 : int
  16 : A
  18 : a
  20 : b
  22 : char
  27 : test.c
  34 : int test(struct A *t) {
  58 :   return t->a;
  FuncInfo Table:
  sec_name_off=6
        insn_offset=<Omitted> type_id=1
  LineInfo Table:
  sec_name_off=6
        insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0
        insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3
  -bash-4.2$ readelf -S test.o
  ......
  [12] .BTF              PROGBITS         0000000000000000  0000028d
       00000000000000c1  0000000000000000           0     0     1
  [13] .BTF.ext          PROGBITS         0000000000000000  0000034e
       0000000000000050  0000000000000000           0     0     1
  [14] .rel.BTF.ext      REL              0000000000000000  00000648
       0000000000000030  0000000000000010          16    13     8
  ......
  -bash-4.2$

The latest linux kernel ([6]) already supports .BTF with type information. [7] has the reference implementation on the linux kernel side to support .BTF.ext func_info. The .BTF.ext line_info support is not implemented yet. If you have difficulty accessing [7], you can manually do the following to access the code:

  git clone https://github.com/yonghong-song/bpf-next-linux.git
  cd bpf-next-linux
  git checkout btf

The change will be pushed to the linux kernel soon after this patch lands.

References:
[1]. https://www.kernel.org/doc/Documentation/networking/filter.txt
[2]. https://lwn.net/Articles/750695/
[3]. https://github.com/iovisor/bcc
[4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf
[5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool
[6]. https://github.com/torvalds/linux
[7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

Differential Revision: https://reviews.llvm.org/D52950

llvm-svn: 344366
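The info= and desc= words in the Type Table dump above can be unpacked by hand. The following is a minimal sketch, not code from this patch: the helper names (btfKind, btfVlen, btfIntEncoding, btfIntOffset, btfIntBits) are hypothetical, and the bit positions are an assumption taken from the Linux BTF type encoding (kind in bits 24-27, vlen in bits 0-15; for INT, encoding in bits 24-27, bit offset in bits 16-23, width in bits 0-7). The numeric kind values are read off the dump itself rather than from any header.

```cpp
#include <cstdint>

// Hypothetical helpers; the field positions are assumed, see the note above.

// Kind of the type entry, e.g. 1 for INT and 4 for STRUCT in the dump.
constexpr uint32_t btfKind(uint32_t Info) { return (Info >> 24) & 0x0f; }

// Number of trailing members or parameters that follow the type entry.
constexpr uint32_t btfVlen(uint32_t Info) { return Info & 0xffff; }

// Fields of the extra 32-bit descriptor that follows an INT entry.
constexpr uint32_t btfIntEncoding(uint32_t Desc) { return (Desc >> 24) & 0x0f; }
constexpr uint32_t btfIntOffset(uint32_t Desc) { return (Desc >> 16) & 0xff; }
constexpr uint32_t btfIntBits(uint32_t Desc) { return Desc & 0xff; }

// [4] STRUCT info=0x04000002: kind 4 with two members (a and b).
static_assert(btfKind(0x04000002u) == 4 && btfVlen(0x04000002u) == 2,
              "struct A has two members");

// [1] FUNC info=0x0c000001: kind 12 with one parameter (param_type=3).
static_assert(btfKind(0x0c000001u) == 12 && btfVlen(0x0c000001u) == 1,
              "test() takes one parameter");

// [2] INT desc=0x01000020: encoding bit 0 set, 32 bits wide, offset 0.
static_assert(btfIntEncoding(0x01000020u) == 1 &&
                  btfIntOffset(0x01000020u) == 0 &&
                  btfIntBits(0x01000020u) == 32,
              "int is a 32-bit integer");

// [5] INT desc=0x02000008: encoding bit 1 set, 8 bits wide.
static_assert(btfIntEncoding(0x02000008u) == 2 && btfIntBits(0x02000008u) == 8,
              "char is an 8-bit integer");
```

Under these assumptions the vlen of an entry matches the number of indented rows printed beneath it in the dump: two name_off/type/bit_offset rows for the STRUCT and one param_type row for the FUNC.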
122 lines
3.8 KiB
C++
//===- llvm/CodeGen/DwarfFile.cpp - Dwarf Debug Framework -----------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#include "Dwarf2BTF.h"
#include "DwarfFile.h"
#include "DwarfCompileUnit.h"
#include "DwarfDebug.h"
#include "DwarfUnit.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/AsmPrinter.h"
#include "llvm/CodeGen/DIE.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/MC/MCBTFContext.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCStreamer.h"
#include <algorithm>
#include <cstdint>

using namespace llvm;

DwarfFile::DwarfFile(AsmPrinter *AP, StringRef Pref, BumpPtrAllocator &DA)
    : Asm(AP), Abbrevs(AbbrevAllocator), StrPool(DA, *Asm, Pref) {}

void DwarfFile::addUnit(std::unique_ptr<DwarfCompileUnit> U) {
  CUs.push_back(std::move(U));
}

// Emit the various dwarf units to the unit section USection with
// the abbreviations going into ASection.
void DwarfFile::emitUnits(bool UseOffsets) {
  for (const auto &TheU : CUs)
    emitUnit(TheU.get(), UseOffsets);
}

void DwarfFile::emitUnit(DwarfUnit *TheU, bool UseOffsets) {
  if (TheU->getCUNode()->isDebugDirectivesOnly())
    return;

  DIE &Die = TheU->getUnitDie();
  MCSection *USection = TheU->getSection();
  Asm->OutStreamer->SwitchSection(USection);

  TheU->emitHeader(UseOffsets);

  Asm->emitDwarfDIE(Die);
}

// Compute the size and offset for each DIE.
void DwarfFile::computeSizeAndOffsets() {
  // Offset from the first CU in the debug info section is 0 initially.
  unsigned SecOffset = 0;

  // Iterate over each compile unit and set the size and offsets for each
  // DIE within each compile unit. All offsets are CU relative.
  for (const auto &TheU : CUs) {
    if (TheU->getCUNode()->isDebugDirectivesOnly())
      continue;

    TheU->setDebugSectionOffset(SecOffset);
    SecOffset += computeSizeAndOffsetsForUnit(TheU.get());
  }
}

unsigned DwarfFile::computeSizeAndOffsetsForUnit(DwarfUnit *TheU) {
  // CU-relative offset is reset to 0 here.
  unsigned Offset = sizeof(int32_t) +      // Length of Unit Info
                    TheU->getHeaderSize(); // Unit-specific headers

  // The return value here is CU-relative, after laying out
  // all of the CU DIE.
  return computeSizeAndOffset(TheU->getUnitDie(), Offset);
}

// Compute the size and offset of a DIE. The offset is relative to start of the
// CU. It returns the offset after laying out the DIE.
unsigned DwarfFile::computeSizeAndOffset(DIE &Die, unsigned Offset) {
  return Die.computeOffsetsAndAbbrevs(Asm, Abbrevs, Offset);
}

void DwarfFile::emitAbbrevs(MCSection *Section) { Abbrevs.Emit(Asm, Section); }

// Emit strings into a string section.
void DwarfFile::emitStrings(MCSection *StrSection, MCSection *OffsetSection,
                            bool UseRelativeOffsets) {
  StrPool.emit(*Asm, StrSection, OffsetSection, UseRelativeOffsets);
}

void DwarfFile::emitBTFSection(bool IsLittleEndian) {
  Dwarf2BTF Dwarf2BTF(Asm->OutContext, IsLittleEndian);
  for (auto &TheU : CUs)
    Dwarf2BTF.addDwarfCU(TheU.get());
  Dwarf2BTF.finish();
}

bool DwarfFile::addScopeVariable(LexicalScope *LS, DbgVariable *Var) {
  auto &ScopeVars = ScopeVariables[LS];
  const DILocalVariable *DV = Var->getVariable();
  if (unsigned ArgNum = DV->getArg()) {
    auto Cached = ScopeVars.Args.find(ArgNum);
    if (Cached == ScopeVars.Args.end())
      ScopeVars.Args[ArgNum] = Var;
    else {
      Cached->second->addMMIEntry(*Var);
      return false;
    }
  } else {
    ScopeVars.Locals.push_back(Var);
  }
  return true;
}

void DwarfFile::addScopeLabel(LexicalScope *LS, DbgLabel *Label) {
  SmallVectorImpl<DbgLabel *> &Labels = ScopeLabels[LS];
  Labels.push_back(Label);
}