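Debugging C++ Coroutines

Introduction

For performance and other architectural reasons, the C++ coroutines feature in the Clang compiler is implemented in two parts of the compiler. Semantic analysis is performed in Clang, while coroutine construction and optimization take place in the LLVM middle end.

However, this design forces us to generate insufficient debugging information. Normally the compiler generates debug information in the Clang frontend, because debug information is highly language specific. That is not possible for coroutine frames, because the frames are constructed in the LLVM middle end.

To mitigate this problem, the LLVM middle end attempts to generate some debug information. Unfortunately, it is incomplete, because much of the language-specific information is missing in the middle end.

This document describes how to use this debug information to debug coroutines more effectively.

Terminology

Because C++20 coroutines are recent, the terminology used to describe coroutine concepts is not settled. This section defines common, understandable terms that are used consistently throughout this document.

Coroutine type

A "coroutine function" is any function that contains any of the coroutine keywords "co_await", "co_yield", or "co_return". A "coroutine type" is a possible return type of one of these coroutine functions. "Task" and "Generator" are commonly cited examples of coroutine types.

Coroutine

By technical definition, a "coroutine" is a suspendable function. However, programmers typically use "coroutine" to refer to an individual instance. For example: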

std::vector<Task> Coros; // Task is a coroutine type.
for (int i = 0; i < 3; i++)
  Coros.push_back(CoroTask()); // CoroTask is a coroutine function, which
                               // would return a coroutine type 'Task'.

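In practice, we typically say "Coros contains 3 coroutines" in the example above, although this is not strictly correct. More precisely, it should be "Coros contains 3 coroutine instances" or "Coros contains 3 coroutine objects".

In this document, we follow the common practice of using "coroutine" to refer to an individual coroutine instance, because the terms "coroutine instance" and "coroutine object" are not sufficiently defined in this context.

Coroutine frame

The C++ standard uses "coroutine state" to describe the allocated storage. In the compiler, we use "coroutine frame" to describe the generated data structure that contains the necessary information.

Coroutine frame structure

The structure of a coroutine frame is defined as follows: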

struct {
  void (*__r)(); // function pointer to the `resume` function
  void (*__d)(); // function pointer to the `destroy` function
  promise_type; // the corresponding `promise_type`
  ... // Any other needed information
}

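In a debugger, the name of a function can be obtained from the function's address. The name of the "resume" function is the same as the name of the coroutine function, and the pointer to the resume function is the first field of the frame. Therefore, once the address of a coroutine (that is, of its frame) is known, the name of the coroutine can be obtained.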
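The lookup described above can be modeled outside a debugger. The following is a minimal sketch, not the debugger's actual mechanism: it assumes a 64-bit target, assumes that the promise follows the two function pointers without padding, and uses two hypothetical dictionaries (`memory` and `symbols`) as stand-ins for the debugger's memory reads and symbol table. The addresses are taken from the sample gdb session later in this document.

```python
POINTER_SIZE = 8  # assumption: 64-bit target


def frame_field_addresses(frame_addr, pointer_size=POINTER_SIZE):
    """Return the addresses of the leading fields of a coroutine frame.

    Assumes the promise needs no extra alignment padding after the
    two function pointers.
    """
    return {
        "resume_fn": frame_addr,
        "destroy_fn": frame_addr + pointer_size,
        "promise": frame_addr + 2 * pointer_size,
    }


def coroutine_name(memory, symbols, frame_addr):
    """Name a coroutine by following its resume function pointer.

    `memory` maps an address to the pointer-sized value stored there and
    `symbols` maps a function address to its name. Both are hypothetical
    stand-ins for what a debugger provides.
    """
    resume_fn = memory[frame_field_addresses(frame_addr)["resume_fn"]]
    return symbols.get(resume_fn, "<unknown>")


# The frame at 0x4412c0 stores its resume function pointer in its first slot.
memory = {0x4412C0: 0x410960}
symbols = {0x410960: "detail::chain_fn<18>()"}
print(coroutine_name(memory, symbols, 0x4412C0))
```

In a real session the two dictionaries would be replaced by the debugger's own memory and symbol queries, as the gdb helper script in this document does.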

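Getting the suspension point

An important requirement for debugging coroutines is to know the suspension point, that is, where the coroutine is currently suspended and waiting.

In simple cases, inspecting the value of the "__coro_index" variable in the coroutine frame works well.

In more complex cases, however, this is not enough. It is then necessary to have the coroutine library record the line number.

For example: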

// For all the promise_type we want:
class promise_type {
  ...
  unsigned line_number = 0xffffffff; // added field: line of the latest suspension
};

#include <source_location>

// For all the awaiter types we need:
class awaiter {
  ...
  template <typename Promise>
  void await_suspend(std::coroutine_handle<Promise> handle,
                     std::source_location sl = std::source_location::current()) {
        ...
        handle.promise().line_number = sl.line();
  }
};

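In this case, we use "std::source_location" to store the line number of the await in the "promise_type". Since we can locate the coroutine function from the address of the coroutine, we can identify the suspension point this way as well.

The downside is that this adds runtime cost. This is consistent with the C++ philosophy of "pay for what you use".

Getting the asynchronous stack

Another important requirement for debugging coroutines is to print the asynchronous stack in order to identify the asynchronous caller of a coroutine. Since many implementations of coroutine types store a "std::coroutine_handle<> continuation" in the promise type, identifying the caller should be straightforward. The "continuation" is typically the coroutine that awaits the current coroutine, in other words its asynchronous parent.

Since the "promise_type" can be obtained from the address of a coroutine and contains the corresponding continuation (which is itself a coroutine with a "promise_type"), printing the entire asynchronous stack should be straightforward.

This logic should be easy to capture in a debugger script.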
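As an illustration of that logic, here is a minimal sketch that can be run outside a debugger. It rests on several assumptions that are not guaranteed in general: a 64-bit target, a promise that directly follows the two function pointers in the frame, a continuation handle that is the first member of the promise, and a chain that ends with a null continuation. The `memory` dictionary is a hypothetical stand-in for the debugger's memory reads, and the addresses are modeled on the sample session in this document.

```python
POINTER_SIZE = 8  # assumption: 64-bit target


def walk_async_stack(memory, frame_addr, pointer_size=POINTER_SIZE):
    """Collect coroutine frame addresses from the innermost frame outwards.

    `memory` maps an address to the pointer-sized value stored there.
    The continuation handle is assumed to be the first member of the
    promise, which is assumed to start right after the resume and
    destroy function pointers.
    """
    frames = []
    seen = set()  # guards against a corrupted, cyclic chain
    while frame_addr != 0 and frame_addr not in seen:
        seen.add(frame_addr)
        frames.append(frame_addr)
        continuation_addr = frame_addr + 2 * pointer_size
        # An address missing from the model is treated as a null handle.
        frame_addr = memory.get(continuation_addr, 0)
    return frames


# Three frames: each continuation slot holds the address of the awaiting frame.
memory = {
    0x441870: 0x441810,  # innermost frame 0x441860 is awaited by 0x441810
    0x441820: 0x4417C0,  # frame 0x441810 is awaited by 0x4417c0
    0x4417D0: 0x0,       # the chain is cut off here in this model
}
for depth, addr in enumerate(walk_async_stack(memory, 0x441860)):
    print("#%d frame_addr = %#x" % (depth, addr))
```

A script for a real coroutine library would need to compute the offset of the continuation field for that library's promise type instead of assuming it is zero.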

Example of printing the asynchronous stack

The following example prints the asynchronous stack for a typical task implementation.

// debugging-example.cpp
#include <coroutine>
#include <iostream>
#include <utility>

struct task {
  struct promise_type {
    task get_return_object();
    std::suspend_always initial_suspend() { return {}; }

    void unhandled_exception() noexcept {}

    struct FinalSuspend {
      std::coroutine_handle<> continuation;
      auto await_ready() noexcept { return false; }
      auto await_suspend(std::coroutine_handle<> handle) noexcept {
        return continuation;
      }
      void await_resume() noexcept {}
    };
    FinalSuspend final_suspend() noexcept { return {continuation}; }

    void return_value(int res) { result = res; }

    std::coroutine_handle<> continuation = std::noop_coroutine();
    int result = 0;
  };

  task(std::coroutine_handle<promise_type> handle) : handle(handle) {}
  ~task() {
    if (handle)
      handle.destroy();
  }

  auto operator co_await() {
    struct Awaiter {
      std::coroutine_handle<promise_type> handle;
      auto await_ready() { return false; }
      auto await_suspend(std::coroutine_handle<> continuation) {
        handle.promise().continuation = continuation;
        return handle;
      }
      int await_resume() {
        int ret = handle.promise().result;
        handle.destroy();
        return ret;
      }
    };
    return Awaiter{std::exchange(handle, nullptr)};
  }

  int syncStart() {
    handle.resume();
    return handle.promise().result;
  }

private:
  std::coroutine_handle<promise_type> handle;
};

task task::promise_type::get_return_object() {
  return std::coroutine_handle<promise_type>::from_promise(*this);
}

namespace detail {
template <int N>
task chain_fn() {
  co_return N + co_await chain_fn<N - 1>();
}

template <>
task chain_fn<0>() {
  // This is the default breakpoint.
  __builtin_debugtrap();
  co_return 0;
}
}  // namespace detail

task chain() {
  co_return co_await detail::chain_fn<30>();
}

int main() {
  std::cout << chain().syncStart() << "\n";
  return 0;
}

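In the example, the "task" coroutine holds a "continuation" field, which is resumed once the "task" completes. In other words, the "continuation" is the asynchronous caller of the "task", just as a normal function returns to its caller when it completes.

We can therefore use the "continuation" field to construct the asynchronous stack: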

# debugging-helper.py
import gdb
from gdb.FrameDecorator import FrameDecorator

class SymValueWrapper():
    def __init__(self, symbol, value):
        self.sym = symbol
        self.val = value

    def __str__(self):
        return str(self.sym) + " = " + str(self.val)

def get_long_pointer_size():
    return gdb.lookup_type('long').pointer().sizeof

def cast_addr2long_pointer(addr):
    return gdb.Value(addr).cast(gdb.lookup_type('long').pointer())

def dereference(addr):
    # Use int: `long` does not exist in Python 3, which current gdb builds embed.
    return int(cast_addr2long_pointer(addr).dereference())

class CoroutineFrame(object):
    def __init__(self, task_addr):
        self.frame_addr = task_addr
        self.resume_addr = task_addr
        self.destroy_addr = task_addr + get_long_pointer_size()
        self.promise_addr = task_addr + get_long_pointer_size() * 2
        # In the example, the continuation is the first field member of the promise_type.
        # So they have the same addresses.
        # If we want to generalize the scripts to other coroutine types, we need to be sure
        # the continuation field is the first member of promise_type.
        self.continuation_addr = self.promise_addr

    def next_task_addr(self):
        return dereference(self.continuation_addr)

class CoroutineFrameDecorator(FrameDecorator):
    def __init__(self, coro_frame):
        super(CoroutineFrameDecorator, self).__init__(None)
        self.coro_frame = coro_frame
        self.resume_func = dereference(self.coro_frame.resume_addr)
        self.resume_func_block = gdb.block_for_pc(self.resume_func)
        if self.resume_func_block is None:
            raise Exception('Not stackless coroutine.')
        self.line_info = gdb.find_pc_line(self.resume_func)

    def address(self):
        return self.resume_func

    def filename(self):
        return self.line_info.symtab.filename

    def frame_args(self):
        return [SymValueWrapper("frame_addr", cast_addr2long_pointer(self.coro_frame.frame_addr)),
                SymValueWrapper("promise_addr", cast_addr2long_pointer(self.coro_frame.promise_addr)),
                SymValueWrapper("continuation_addr", cast_addr2long_pointer(self.coro_frame.continuation_addr))
                ]

    def function(self):
        return self.resume_func_block.function.print_name

    def line(self):
        return self.line_info.line

class StripDecorator(FrameDecorator):
    def __init__(self, frame):
        super(StripDecorator, self).__init__(frame)
        self.frame = frame
        f = frame.function()
        self.function_name = f

    def __str__(self, shift = 2):
        addr = "" if self.address() is None else '%#x' % self.address() + " in "
        location = "" if self.filename() is None else " at " + self.filename() + ":" + str(self.line())
        return addr + self.function() + " " + str([str(args) for args in self.frame_args()]) + location

class CoroutineFilter:
    def create_coroutine_frames(self, task_addr):
        frames = []
        while task_addr != 0:
            coro_frame = CoroutineFrame(task_addr)
            frames.append(CoroutineFrameDecorator(coro_frame))
            task_addr = coro_frame.next_task_addr()
        return frames

class AsyncStack(gdb.Command):
    def __init__(self):
        super(AsyncStack, self).__init__("async-bt", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        coroutine_filter = CoroutineFilter()
        argv = gdb.string_to_argv(arg)
        if len(argv) == 0:
            try:
                task = gdb.parse_and_eval('__coro_frame')
                task = int(str(task.address), 16)
            except Exception:
                print ("Can't find __coro_frame in current context.\n" +
                      "Please use `async-bt` in stackless coroutine context.")
                return
        elif len(argv) != 1:
            print("usage: async-bt <pointer to task>")
            return
        else:
            task = int(argv[0], 16)

        frames = coroutine_filter.create_coroutine_frames(task)
        i = 0
        for f in frames:
            print('#' + str(i), str(StripDecorator(f)))
            i += 1
        return

AsyncStack()

class ShowCoroFrame(gdb.Command):
    def __init__(self):
        super(ShowCoroFrame, self).__init__("show-coro-frame", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        if len(argv) != 1:
            print("usage: show-coro-frame <address of coroutine frame>")
            return

        addr = int(argv[0], 16)
        block = gdb.block_for_pc(int(cast_addr2long_pointer(addr).dereference()))
        if block is None:
            print("block " + str(addr) + " is none.")
            return

        # Disable demangling since gdb will treat names starting with `_Z`(The marker for Itanium ABI) specially.
        gdb.execute("set demangle-style none")

        coro_frame_type = gdb.lookup_type(block.function.linkage_name + ".coro_frame_ty")
        coro_frame_ptr_type = coro_frame_type.pointer()
        coro_frame = gdb.Value(addr).cast(coro_frame_ptr_type).dereference()

        gdb.execute("set demangle-style auto")
        gdb.write(coro_frame.format_string(pretty_structs = True))

ShowCoroFrame()

Then run:

$ clang++ -std=c++20 -g debugging-example.cpp -o debugging-example
$ gdb ./debugging-example
(gdb) # We've already set the breakpoint.
(gdb) r
Program received signal SIGTRAP, Trace/breakpoint trap.
detail::chain_fn<0> () at debugging-example2.cpp:73
73      co_return 0;
(gdb) # Executes the debugging scripts
(gdb) source debugging-helper.py
(gdb) # Print the asynchronous stack
(gdb) async-bt
#0 0x401c40 in detail::chain_fn<0>() ['frame_addr = 0x441860', 'promise_addr = 0x441870', 'continuation_addr = 0x441870'] at debugging-example.cpp:71
#1 0x4022d0 in detail::chain_fn<1>() ['frame_addr = 0x441810', 'promise_addr = 0x441820', 'continuation_addr = 0x441820'] at debugging-example.cpp:66
#2 0x403060 in detail::chain_fn<2>() ['frame_addr = 0x4417c0', 'promise_addr = 0x4417d0', 'continuation_addr = 0x4417d0'] at debugging-example.cpp:66
#3 0x403df0 in detail::chain_fn<3>() ['frame_addr = 0x441770', 'promise_addr = 0x441780', 'continuation_addr = 0x441780'] at debugging-example.cpp:66
#4 0x404b80 in detail::chain_fn<4>() ['frame_addr = 0x441720', 'promise_addr = 0x441730', 'continuation_addr = 0x441730'] at debugging-example.cpp:66
#5 0x405910 in detail::chain_fn<5>() ['frame_addr = 0x4416d0', 'promise_addr = 0x4416e0', 'continuation_addr = 0x4416e0'] at debugging-example.cpp:66
#6 0x4066a0 in detail::chain_fn<6>() ['frame_addr = 0x441680', 'promise_addr = 0x441690', 'continuation_addr = 0x441690'] at debugging-example.cpp:66
#7 0x407430 in detail::chain_fn<7>() ['frame_addr = 0x441630', 'promise_addr = 0x441640', 'continuation_addr = 0x441640'] at debugging-example.cpp:66
#8 0x4081c0 in detail::chain_fn<8>() ['frame_addr = 0x4415e0', 'promise_addr = 0x4415f0', 'continuation_addr = 0x4415f0'] at debugging-example.cpp:66
#9 0x408f50 in detail::chain_fn<9>() ['frame_addr = 0x441590', 'promise_addr = 0x4415a0', 'continuation_addr = 0x4415a0'] at debugging-example.cpp:66
#10 0x409ce0 in detail::chain_fn<10>() ['frame_addr = 0x441540', 'promise_addr = 0x441550', 'continuation_addr = 0x441550'] at debugging-example.cpp:66
#11 0x40aa70 in detail::chain_fn<11>() ['frame_addr = 0x4414f0', 'promise_addr = 0x441500', 'continuation_addr = 0x441500'] at debugging-example.cpp:66
#12 0x40b800 in detail::chain_fn<12>() ['frame_addr = 0x4414a0', 'promise_addr = 0x4414b0', 'continuation_addr = 0x4414b0'] at debugging-example.cpp:66
#13 0x40c590 in detail::chain_fn<13>() ['frame_addr = 0x441450', 'promise_addr = 0x441460', 'continuation_addr = 0x441460'] at debugging-example.cpp:66
#14 0x40d320 in detail::chain_fn<14>() ['frame_addr = 0x441400', 'promise_addr = 0x441410', 'continuation_addr = 0x441410'] at debugging-example.cpp:66
#15 0x40e0b0 in detail::chain_fn<15>() ['frame_addr = 0x4413b0', 'promise_addr = 0x4413c0', 'continuation_addr = 0x4413c0'] at debugging-example.cpp:66
#16 0x40ee40 in detail::chain_fn<16>() ['frame_addr = 0x441360', 'promise_addr = 0x441370', 'continuation_addr = 0x441370'] at debugging-example.cpp:66
#17 0x40fbd0 in detail::chain_fn<17>() ['frame_addr = 0x441310', 'promise_addr = 0x441320', 'continuation_addr = 0x441320'] at debugging-example.cpp:66
#18 0x410960 in detail::chain_fn<18>() ['frame_addr = 0x4412c0', 'promise_addr = 0x4412d0', 'continuation_addr = 0x4412d0'] at debugging-example.cpp:66
#19 0x4116f0 in detail::chain_fn<19>() ['frame_addr = 0x441270', 'promise_addr = 0x441280', 'continuation_addr = 0x441280'] at debugging-example.cpp:66
#20 0x412480 in detail::chain_fn<20>() ['frame_addr = 0x441220', 'promise_addr = 0x441230', 'continuation_addr = 0x441230'] at debugging-example.cpp:66
#21 0x413210 in detail::chain_fn<21>() ['frame_addr = 0x4411d0', 'promise_addr = 0x4411e0', 'continuation_addr = 0x4411e0'] at debugging-example.cpp:66
#22 0x413fa0 in detail::chain_fn<22>() ['frame_addr = 0x441180', 'promise_addr = 0x441190', 'continuation_addr = 0x441190'] at debugging-example.cpp:66
#23 0x414d30 in detail::chain_fn<23>() ['frame_addr = 0x441130', 'promise_addr = 0x441140', 'continuation_addr = 0x441140'] at debugging-example.cpp:66
#24 0x415ac0 in detail::chain_fn<24>() ['frame_addr = 0x4410e0', 'promise_addr = 0x4410f0', 'continuation_addr = 0x4410f0'] at debugging-example.cpp:66
#25 0x416850 in detail::chain_fn<25>() ['frame_addr = 0x441090', 'promise_addr = 0x4410a0', 'continuation_addr = 0x4410a0'] at debugging-example.cpp:66
#26 0x4175e0 in detail::chain_fn<26>() ['frame_addr = 0x441040', 'promise_addr = 0x441050', 'continuation_addr = 0x441050'] at debugging-example.cpp:66
#27 0x418370 in detail::chain_fn<27>() ['frame_addr = 0x440ff0', 'promise_addr = 0x441000', 'continuation_addr = 0x441000'] at debugging-example.cpp:66
#28 0x419100 in detail::chain_fn<28>() ['frame_addr = 0x440fa0', 'promise_addr = 0x440fb0', 'continuation_addr = 0x440fb0'] at debugging-example.cpp:66
#29 0x419e90 in detail::chain_fn<29>() ['frame_addr = 0x440f50', 'promise_addr = 0x440f60', 'continuation_addr = 0x440f60'] at debugging-example.cpp:66
#30 0x41ac20 in detail::chain_fn<30>() ['frame_addr = 0x440f00', 'promise_addr = 0x440f10', 'continuation_addr = 0x440f10'] at debugging-example.cpp:66
#31 0x41b9b0 in chain() ['frame_addr = 0x440eb0', 'promise_addr = 0x440ec0', 'continuation_addr = 0x440ec0'] at debugging-example.cpp:77

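Now we have the complete asynchronous stack. It is also possible to print an asynchronous stack that does not start at the top of the current stack, by passing the address of the corresponding coroutine frame to the "async-bt" command.

With the debugging script, we can also print any coroutine frame as long as we know its address. For example, we can print the coroutine frame for "detail::chain_fn<18>()" from the example above. From the log, we know that the address of that coroutine frame is "0x4412c0" in this run. Then we can run: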

(gdb) show-coro-frame 0x4412c0
{
  __resume_fn = 0x410960 <detail::chain_fn<18>()>,
  __destroy_fn = 0x410d60 <detail::chain_fn<18>()>,
  __promise = {
    continuation = {
      _M_fr_ptr = 0x441270
    },
    result = 0
  },
  struct_Awaiter_0 = {
    struct_std____n4861__coroutine_handle_0 = {
      struct_std____n4861__coroutine_handle = {
        PointerType = 0x441310
      }
    }
  },
  struct_task_1 = {
    struct_std____n4861__coroutine_handle_0 = {
      struct_std____n4861__coroutine_handle = {
        PointerType = 0x0
      }
    }
  },
  struct_task__promise_type__FinalSuspend_2 = {
    struct_std____n4861__coroutine_handle = {
      PointerType = 0x0
    }
  },
  __coro_index = 1 '\001',
  struct_std____n4861__suspend_always_3 = {
    __int_8 = 0 '\000'
  }
}

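Getting the live coroutines

Another useful task when debugging coroutines is to enumerate the live coroutines, much as is commonly done for threads. While technically possible, this is not recommended in production code because it is costly at runtime. One such solution is to store the currently live coroutines in a collection: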

inline std::unordered_set<void*> lived_coroutines;
// For all promise_type we want to record
class promise_type {
public:
    promise_type() {
        // Note to avoid data races
        lived_coroutines.insert(std::coroutine_handle<promise_type>::from_promise(*this).address());
    }
    ~promise_type() {
        // Note to avoid data races
        lived_coroutines.erase(std::coroutine_handle<promise_type>::from_promise(*this).address());
    }
};

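In the code snippet above, we save the address of every live coroutine in a "std::unordered_set" named "lived_coroutines". As before, once we know the address of a coroutine, we can derive the function, the "promise_type", and the other members of the frame. We can therefore print the list of live coroutines from that set.

Note that this approach is expensive in terms of storage and requires some level of locking on the set (not shown) to prevent data races.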