C or C++ language standard, e.g. ‘c++11’ == ‘c++0x’ and ‘c++17’ == ‘c++1z’, where ‘c++0x’ and ‘c++1z’ are the development codenames.
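-Wunknown-pragmas
Warn when an unknown #pragma directive is encountered (-Wno-unknown-pragmas turns the warning off).
-fomit-frame-pointer
Do not keep a frame pointer for functions that do not need one; this is part of the -O1 optimizations.
-Wstack-protector
Warn about functions that are not protected against stack smashing; only meaningful while the stack protector is enabled (-fno-stack-protector turns the protection off).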
-MMD
Like -MD, but the generated dependency file lists only user header files, not system header files.
-fexceptions
Enable exception handling.
-funwind-tables
Unwind tables contain debug frame information which is also necessary for the handling of such exceptions
-fasynchronous-unwind-tables
Generate unwind tables in DWARF format so that they can be used for stack unwinding from asynchronous events.
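To make the role of these tables more concrete, here is a minimal sketch of what unwinding looks like at the language level. The names `Guard`, `thrower`, `run` and `self_check` are hypothetical and not taken from any library. The only point is that when an exception propagates, the runtime walks the intermediate frames (the information the unwind tables describe) and runs their destructors before the handler executes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical RAII helper: records when its frame is entered and unwound.
struct Guard {
    std::vector<std::string>& log;
    std::string name;
    Guard(std::vector<std::string>& l, const std::string& n) : log(l), name(n) {
        log.push_back("enter " + name);
    }
    ~Guard() { log.push_back("unwind " + name); }
};

void thrower(std::vector<std::string>& log) {
    Guard g(log, "thrower");
    throw std::runtime_error("boom");
}

// Returns the sequence of events observed while the exception propagates.
std::vector<std::string> run() {
    std::vector<std::string> log;
    try {
        Guard g(log, "run");
        thrower(log);
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}

// Usage: both frames are unwound (innermost first) before the handler runs.
void self_check() {
    std::vector<std::string> log = run();
    assert(log.size() == 5);
    assert(log[2] == "unwind thrower");
    assert(log[3] == "unwind run");
    assert(log[4] == "caught boom");
}
```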
-fabi-version=n
Use version n of the C++ ABI. The default is version 0. (Version 2 is the version of the C++ ABI that first appeared in G++ 3.4, and was the default through G++ 4.9.) ABI: an application binary interface is an interface between two binary program modules. Often, one of these modules is a library or operating system facility, and the other is a program that is being run by a user.
-fno-rtti
Disable generation of information about every class with virtual functions for use by the C++ run-time type identification features (dynamic_cast and typeid). If you don’t use those parts of the language, you can save some space by using this flag.
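As a reminder of which language features this affects, here is a small sketch. The types `Shape`, `Circle`, `Square` and the helper names are hypothetical. Both helpers rely on run-time type information, so GCC is expected to reject this file when it is compiled with -fno-rtti.

```cpp
#include <cassert>
#include <typeinfo>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Needs RTTI: a checked downcast that yields nullptr on a type mismatch.
bool is_circle(const Shape* s) {
    return dynamic_cast<const Circle*>(s) != nullptr;
}

// Needs RTTI: compares the dynamic types of two polymorphic objects.
bool same_dynamic_type(const Shape& a, const Shape& b) {
    return typeid(a) == typeid(b);
}

// Usage example.
void self_check() {
    Circle c1, c2;
    Square sq;
    assert(is_circle(&c1));
    assert(!is_circle(&sq));
    assert(same_dynamic_type(c1, c2));
    assert(!same_dynamic_type(c1, sq));
}
```

Code that avoids both constructs, for example by using virtual functions instead of downcasts, builds unchanged with the flag.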
-faligned-new
Enable support for C++17 new of types that require more alignment than void* ::operator new(std::size_t) provides. A numeric argument such as -faligned-new=32 can be used to specify how much alignment (in bytes) is provided by that function, but few users will need to override the default of alignof(std::max_align_t). This flag is enabled by default for -std=c++17.
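The following sketch shows the kind of type this option is about. `CacheLine` and the helper names are hypothetical. It assumes a compiler with aligned new available (C++17, or -faligned-new), which is why the heap check is guarded by the `__cpp_aligned_new` feature-test macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical over-aligned type: one 64-byte cache line.
struct alignas(64) CacheLine {
    unsigned char bytes[64];
};

// True if p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates one CacheLine with plain `new` and reports whether the
// returned storage honours alignof(CacheLine).
bool heap_cache_line_is_aligned() {
    std::unique_ptr<CacheLine> line(new CacheLine());
    return is_aligned(line.get(), alignof(CacheLine));
}

// Usage example.
void self_check() {
#if defined(__cpp_aligned_new)
    // Only guaranteed when aligned new is available (C++17 or -faligned-new).
    assert(heap_cache_line_is_aligned());
#endif
    assert(is_aligned(reinterpret_cast<const void*>(std::uintptr_t(128)), 64));
    assert(!is_aligned(reinterpret_cast<const void*>(std::uintptr_t(96)), 64));
}
```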
-Wl,xxx
Pass option xxx to the linker. For example, -Wl,-R/staff/shaojiemike/github/MultiPIM_icarus0/common/libconfig/lib specifies, at link time, a runtime search path for dynamic (shared) libraries.
General Optimization Options
-O, -O2, -O3
-O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-loop-distribute-patterns, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.
-funroll-loops
Unroll loops whose number of iterations can be determined at compile time or upon entry to the loop. -funroll-loops implies -frerun-cse-after-loop. This option makes code larger, and may or may not make it run faster.
-funroll-all-loops
Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes programs run more slowly. -funroll-all-loops implies the same options as -funroll-loops.
max-unrolled-insns
The maximum number of instructions that a loop may have to be unrolled. If a loop is unrolled, this parameter also determines how many times the loop code is unrolled.
max-average-unrolled-insns
The maximum number of instructions, biased by the probabilities of their execution, that a loop may have to be unrolled. If a loop is unrolled, this parameter also determines how many times the loop code is unrolled.
max-unroll-times
The maximum number of unrollings of a single loop.
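To illustrate what the transformation looks like, here is a hand-written sketch of a loop unrolled by a factor of four. It only mimics the idea; the code GCC actually emits depends on the target and on the parameters above (which are passed as, e.g., `--param max-unroll-times=4`). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Straightforward loop: one add and one loop test per element.
long sum_rolled(const int* a, std::size_t n) {
    long s = 0;
    for (std::size_t i = 0; i < n; ++i) s += a[i];
    return s;
}

// The same loop unrolled by hand four times: fewer loop tests and
// branches per element, at the cost of a larger loop body.
long sum_unrolled4(const int* a, std::size_t n) {
    long s = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; ++i) s += a[i];  // remainder loop for the leftover 0..3 elements
    return s;
}

// Usage: both versions must agree, including when n is not a multiple of 4.
void self_check() {
    const int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum_rolled(a, 10) == 55);
    assert(sum_unrolled4(a, 10) == 55);
    assert(sum_unrolled4(a, 0) == 0);
}
```

The remainder loop is what lets the unrolled version handle trip counts that are only known on entry to the loop, which is the case -funroll-loops covers.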
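R_X86_64_PC32. Relocate a reference that uses a 32-bit PC-relative address. Recall from Section 3.6.3 that a PC-relative address is an offset from the current run-time value of the program counter (PC). When the CPU executes an instruction that uses PC-relative addressing, it adds the 32-bit value encoded in the instruction to the current run-time value of the PC to obtain the effective address (e.g., the target of a call instruction). The PC value is normally the address of the next instruction in memory. (A call pushes this PC value onto the stack for later use.)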
typedef struct {
    long offset;    /* Offset of the reference to relocate */
    long type:32,   /* Relocation type */
         symbol:32; /* Symbol table index */
    long addend;    /* Constant part of relocation expression */
} Elf64_Rela;
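As a worked example of how a PC-relative relocation is resolved, the sketch below computes the 32-bit value the linker would store for an R_X86_64_PC32 reference and then the address the CPU derives from it. `Rela` is a simplified stand-in for the entry above, and all addresses are illustrative values rather than output from a real link.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the relocation entry (only the fields used here).
struct Rela {
    std::uint64_t offset;  // offset of the reference within its section
    std::int64_t addend;   // constant part of the relocation expression
};

// Value the linker stores at the reference for R_X86_64_PC32:
//   refaddr = ADDR(section) + r.offset
//   *refptr = ADDR(symbol) + r.addend - refaddr   (truncated to 32 bits)
std::uint32_t relocate_pc32(std::uint64_t section_addr, std::uint64_t symbol_addr,
                            const Rela& r) {
    std::uint64_t refaddr = section_addr + r.offset;
    return static_cast<std::uint32_t>(
        symbol_addr + static_cast<std::uint64_t>(r.addend) - refaddr);
}

// What the CPU does at run time: add the signed 32-bit displacement to the PC,
// which already points at the next instruction.
std::uint64_t call_target(std::uint64_t next_pc, std::uint32_t disp32) {
    return next_pc + static_cast<std::int64_t>(static_cast<std::int32_t>(disp32));
}

// Usage with illustrative numbers: the section starts at 0x4004d0, the 4-byte
// reference sits at offset 0xf, and the symbol ends up at 0x4004e8.
void self_check() {
    Rela r = {0xf, -4};
    std::uint32_t disp = relocate_pc32(0x4004d0, 0x4004e8, r);
    assert(disp == 0x5);
    // The reference occupies 0x4004df..0x4004e2, so the next instruction is at 0x4004e3.
    assert(call_target(0x4004e3, disp) == 0x4004e8);
}
```

The addend of -4 compensates for the fact that the PC points past the 4-byte reference when the displacement is applied.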
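Placement of object files and libraries
The linker generally resolves dependencies from left to right on the command line. This means that if library A depends on library B, library A should appear before library B, so that the symbols A leaves undefined are already known to be needed when B is scanned.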
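The left-to-right behaviour can be sketched with a small simulation. This is a deliberately simplified model, not the linker's real algorithm: the `Input` type and function name are hypothetical, and each archive is modeled as a single member. It is enough to show why a library placed too early is skipped.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One command-line input (hypothetical, simplified model).
struct Input {
    bool is_archive;                 // false: object file, always linked in
    std::set<std::string> defines;   // symbols this input defines
    std::set<std::string> needs;     // symbols this input references
};

// Scan inputs left to right and return the symbols still undefined at the end.
std::set<std::string> unresolved_after_link(const std::vector<Input>& inputs) {
    std::set<std::string> defined, undefined;
    for (const Input& in : inputs) {
        if (in.is_archive) {
            // An archive member is pulled in only if it defines a symbol that is
            // undefined right now; otherwise it is skipped and never revisited.
            bool wanted = false;
            for (const std::string& s : in.defines)
                if (undefined.count(s)) wanted = true;
            if (!wanted) continue;
        }
        for (const std::string& s : in.defines) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const std::string& s : in.needs)
            if (!defined.count(s)) undefined.insert(s);
    }
    return undefined;
}

// Usage: main.o calls a(), libA defines a() and calls b(), libB defines b().
void self_check() {
    Input main_o = {false, {"main"}, {"a"}};
    Input lib_a = {true, {"a"}, {"b"}};
    Input lib_b = {true, {"b"}, {}};
    // main.o libA libB: everything resolves.
    assert(unresolved_after_link({main_o, lib_a, lib_b}).empty());
    // main.o libB libA: libB is scanned before b() is needed, so b stays undefined.
    assert(unresolved_after_link({main_o, lib_b, lib_a}).count("b") == 1);
}
```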
objdump -g <archive_file>.a
# If the .o files contain debugging symbols, the contents of each debug section are printed:
Contents of the .debug_aranges section (loaded from predict-c.o):
# Otherwise only the file-format line is printed:
cabac-a.o: file format elf64-x86-64
Because of compatibility problems, you may want to install Pin from the Arch Linux package.
Installation
Pin is always required by Pin-based tools (pintools) such as ZSim and Sniper.
When you run into the situation below, consider updating your Pin version, even though you can bypass the check with a flag like -ifeellucky at a high compatibility risk.
shaojiemike@snode6 ~/github/ramulator-pim/zsim-ramulator/pin [08:05:47]
> ./pin
E: 5.4 is not a supported linux release
Bypassing the check easily leads to failures such as:
Pin app terminated abnormally due to signal 6. # or signal 4.
Install a pintool (zsim) by switching the Pin version
My first idea is to try a compatible Pin version (one that passes a simple test pintool, whichever it is) instead of the old Pin.
Find a suitably simple pintool that reproduces the situation (fails with the old Pin but passes with the newest Pin).
TODO: build (fix the pin 2.14 CXX_ABI compatibility bug) and test suitability.
Debug the pintool in detail (see another blog post).
for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec))
{
    for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn))
    {
        // Prepare for processing of RTN, an RTN is not broken up into BBLs,
        // it is merely a sequence of INSs
        RTN_Open(rtn);

        for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
        {
            count++;
        }

        // to preserve space, release data associated with RTN after we have processed it
        RTN_Close(rtn);
    }
}
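IMG_AddInstrumentFunction(): use this to register a callback that catches the loading of an image.
Instrumentation does not have to cover every instruction. You can also classify and filter instructions so that only those meeting your criteria are instrumented.
For example, use INS_InsertPredicatedCall().
Iterating over all instructions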
// Forward pass over all instructions in bbl
for (INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins = INS_Next(ins))

// Forward pass over all instructions in routine
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
Iterating over the BBLs in a trace
// Visit every basic block in the trace
for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
{
    // Insert a call to docount before every bbl, passing the number of instructions
    BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)docount, IARG_UINT32, BBL_NumIns(bbl), IARG_END);
}
Iterating over an instruction's memory operands
UINT32 memOperands = INS_MemoryOperandCount(ins);
// Iterate over each memory operand of the instruction.
for (UINT32 memOp = 0; memOp < memOperands; memOp++)
{
    if (INS_MemoryOperandIsRead(ins, memOp) || INS_MemoryOperandIsWritten(ins, memOp))
    {
        // xxx
    }
}
// Analysis function that runs at IPOINT_BEFORE
VOID printip(VOID* ip) { fprintf(trace, "%p\n", ip); }

// Pin calls this function every time a new instruction is encountered
VOID InstructionFuc(INS ins, VOID* v)
{
    // Insert a call to printip before every instruction, and pass it the IP
    // IARG_INST_PTR: the instruction address. It is an IARG_TYPE enumerator that
    // tells Pin which argument to pass, not a global variable.
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip, IARG_INST_PTR, IARG_END);
}
uname -a  # intel64
cd source/tools/ManualExamples
# source/tools/Config/makefile.config lists all make options
make all OPT=-O0 DEBUG=1 TARGET=intel64 | tee make.log | my_hl
# or just build one tool:
make obj-intel64/inscount0.so
# $(OBJDIR)%$(PINTOOL_SUFFIX) - Default rule for building tools.
# Example: make obj-intel64/mytool.so
Test run
../../../pin -t obj-intel64/inscount0.so -- ./a.out  # counts instructions and writes the result to inscount.out
The following introduces the debugging tools that Pin provides.
First, build stack-debugger.so and the fibonacci.exe application with debug info (-g):
cd source/tools/ManualExamples
make OPT=-O0 DEBUG=1 stack-debugger.test
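The OPT=-O0 option comes from the "Using Fast Call Linkages" section of the official documentation, which explains that OPT=-O0 is needed to suppress the -fomit-frame-pointer option in the makefile so that GDB can display the stack trace (the function call stack) correctly.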
Debug application in Pin JIT mode
$ ../../../pin -appdebug -t obj-intel64/stack-debugger.so -- obj-intel64/fibonacci.exe 1000
Application stopped until continued from debugger.
Start GDB, then issue this command at the prompt:
  target remote :33030
static ADDRINT OnStackChangeIf(ADDRINT sp, ADDRINT addrInfo)
{
    TINFO *tinfo = reinterpret_cast<TINFO *>(addrInfo);

    // The stack pointer may go above the base slightly. (For example, the application's dynamic
    // loader does this briefly during start-up.)
    //
    if (sp > tinfo->_stackBase)
        return 0;

    // Keep track of the maximum stack usage.
    //
    size_t size = tinfo->_stackBase - sp;
    if (size > tinfo->_max)
        tinfo->_max = size;  // update the recorded maximum stack usage

    // See if we need to trigger a breakpoint.
    //
    if (BreakOnNewMax && size > tinfo->_maxReported)
        return 1;
    if (BreakOnSize && size >= BreakOnSize)
        return 1;
    return 0;
}

static VOID DoBreakpoint(const CONTEXT *ctxt, THREADID tid)
{
    TINFO *tinfo = reinterpret_cast<TINFO *>(PIN_GetContextReg(ctxt, RegTinfo));

    // Keep track of the maximum reported stack usage for "stackbreak newmax".
    //
    size_t size = tinfo->_stackBase - PIN_GetContextReg(ctxt, REG_STACK_PTR);
    if (size > tinfo->_maxReported)
        tinfo->_maxReported = size;

    ConnectDebugger();  // Ask the user to connect a debugger, if it is not already connected.

    // Construct a string that the debugger will print when it stops. If a debugger is
    // not connected, no breakpoint is triggered and execution resumes immediately.
    //
    tinfo->_os.str("");
    tinfo->_os << "Thread " << std::dec << tid << " uses " << size << " bytes of stack.";
    PIN_ApplicationBreakpoint(ctxt, tid, FALSE, tinfo->_os.str());
}
static void ConnectDebugger()
{
    if (PIN_GetDebugStatus() != DEBUG_STATUS_UNCONNECTED)  // a debugger is already connected
        return;

    DEBUG_CONNECTION_INFO info;
    // PIN_GetDebugConnectionInfo() retrieves the TCP port that GDB needs to connect to
    if (!PIN_GetDebugConnectionInfo(&info) || info._type != DEBUG_CONNECTION_TYPE_TCP_SERVER)
        return;

    *Output << "Triggered stack-limit breakpoint.\n";
    *Output << "Start GDB and enter this command:\n";
    *Output << "  target remote :" << std::dec << info._tcpServer._tcpPort << "\n";
    *Output << std::flush;

    if (PIN_WaitForDebuggerToConnect(1000*KnobTimeout.Value()))  // wait for GDB to connect from another window
        return;

    *Output << "No debugger attached after " << KnobTimeout.Value() << " seconds.\n";
    *Output << "Resuming application without stopping.\n";
    *Output << std::flush;
}
Tips for Debugging a Pintool
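This part describes how to debug problems in the Pintool itself (which also helps in understanding how a Pintool works).
For this purpose, Pin provides -pause_tool n, which pauses for n seconds so that GDB can attach.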
../../../pin -pause_tool 10 -t /staff/shaojiemike/github/sniper_PIMProf/pin_kit/source/tools/ManualExamples/obj-intel64/stack-debugger.so -- obj-intel64/fibonacci.exe 1000
Pausing for 10 seconds to attach to process with pid 3502000
To load the debug info to gdb use:
*****************************************************************
set sysroot /not/existing/dir
file
add-symbol-file /staff/shaojiemike/github/sniper_PIMProf/pin_kit/source/tools/ManualExamples/obj-intel64/stack-debugger.so 0x7f3105f24170 -s .data 0x7f31061288a0 -s .bss 0x7f3106129280
*****************************************************************
(gdb) add-symbol-file /staff/shaojiemike/github/sniper_PIMProf/pin_kit/source/tools/ManualExamples/obj-intel64/stack-debugger.so 0x7f3105f24170 -s .data 0x7f31061288a0 -s .bss 0x7f3106129280
(gdb) b main  # or: b stack-debugger.cpp:94
gef➤ info b
Num Type Disp Enb Address What
1 breakpoint keep y <MULTIPLE>
1.1 y 0x00000000000f4460 <main>  # inaccessible address, must be deleted
1.2 y 0x00007f3105f36b65 in main(int, char**) at stack-debugger.cpp:94
(gdb) del 1.1
(gdb) c