Clang nvlink Wrapper

Introduction

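This tool is a wrapper around NVIDIA's nvlink linker. Its purpose is to provide an interface similar to that of the ld.lld linker while still relying on NVIDIA's proprietary linker to produce the final output.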

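nvlink has several known quirks that make it difficult to use in a unified offloading setting. For example, it does not accept .o files, because its inputs must be named .cubin. Static archives do not work, so passing a .a file results in a linker error. nvlink also does not support link-time optimization (LTO) and ignores many standard linker arguments. This tool works around these issues.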
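The file-naming quirk can be illustrated with a small sketch. This is not the wrapper's actual implementation; it only shows one plausible way a front end could satisfy nvlink's naming requirement, by copying each .o input to a .cubin name in a scratch directory. The function name stage_as_cubin and the index-prefixed naming scheme are assumptions made for this example.

```python
import shutil
import tempfile
from pathlib import Path


def stage_as_cubin(inputs, staging_dir):
    """Copy each '.o' input to a '.cubin' name; pass other inputs through.

    Hypothetical helper for illustration only: the real wrapper may handle
    inputs differently.
    """
    staging = Path(staging_dir)
    staged = []
    for index, name in enumerate(inputs):
        path = Path(name)
        if path.suffix == ".o":
            # Prefix with the input index so two objects that share a
            # basename in different directories do not overwrite each other.
            target = staging / f"{index}-{path.stem}.cubin"
            shutil.copyfile(path, target)
            staged.append(str(target))
        else:
            # Anything that is not a '.o' file is left untouched.
            staged.append(str(path))
    return staged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        obj = Path(tmp) / "kernel.o"
        obj.write_bytes(b"placeholder")  # stand-in bytes, not a real device object
        print(stage_as_cubin([str(obj)], tmp))
```

Copying rather than renaming in place leaves the caller's files untouched, which matters when the same object is linked more than once.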

Usage

This tool accepts the following options. Any argument that is not intended exclusively for the linker wrapper is forwarded to nvlink.

OVERVIEW: A utility that wraps around the NVIDIA 'nvlink' linker.
This enables static linking and LTO handling for NVPTX targets.

USAGE: clang-nvlink-wrapper [options] <options to be passed to nvlink>

OPTIONS:
  --arch <value>       Specify the 'sm_' name of the target architecture.
  --cuda-path=<dir>    Set the system CUDA path
  --dry-run            Print generated commands without running.
  --feature <value>    Specify the '+ptx' feature to use for LTO.
  -g                   Specify that this was a debug compile.
  -help-hidden         Display all available options
  -help                Display available options (--help-hidden for more)
  -L <dir>             Add <dir> to the library search path
  -l <libname>         Search for library <libname>
  -mllvm <arg>         Arguments passed to LLVM, including Clang invocations,
                       for which the '-mllvm' prefix is preserved. Use '-mllvm
                       --help' for a list of options.
  -o <path>            Path to file to write output
  --plugin-opt=jobs=<value>
                       Number of LTO codegen partitions
  --plugin-opt=lto-partitions=<value>
                       Number of LTO codegen partitions
  --plugin-opt=O<O0, O1, O2, or O3>
                       Optimization level for LTO
  --plugin-opt=thinlto<value>
                       Enable the thin-lto backend
  --plugin-opt=<value> Arguments passed to LLVM, including Clang invocations,
                       for which the '-mllvm' prefix is preserved. Use '-mllvm
                       --help' for a list of options.
  --save-temps         Save intermediate results
  --version            Display the version number and exit
  -v                   Print verbose information
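
As a rough illustration of how the options above combine, the following sketch assembles a wrapper command line from Python. It uses only flags from the help text (--arch, -o, -L, -l, --dry-run). The helper name build_wrapper_command, the file names, and the sm_70 architecture are assumptions chosen for the example. In normal use the Clang driver invokes the wrapper itself, so building the command by hand like this is purely illustrative.

```python
import shlex


def build_wrapper_command(inputs, output, arch, lib_dirs=(), libs=(), dry_run=False):
    """Return an argv list for clang-nvlink-wrapper (hypothetical helper)."""
    cmd = ["clang-nvlink-wrapper", "--arch", arch, "-o", output]
    if dry_run:
        # --dry-run prints the generated commands without running them.
        cmd.append("--dry-run")
    for directory in lib_dirs:
        cmd += ["-L", directory]
    cmd += list(inputs)
    for lib in libs:
        cmd += ["-l", lib]
    return cmd


if __name__ == "__main__":
    argv = build_wrapper_command(
        inputs=["main.o"],       # example input name
        output="a.out",
        arch="sm_70",            # example architecture
        lib_dirs=["/opt/gpu/lib"],  # example search path
        libs=["c"],
        dry_run=True,
    )
    # Show the command as a single shell-quoted string.
    print(shlex.join(argv))
```

Passing the resulting list to subprocess.run would execute it, provided clang-nvlink-wrapper is on the PATH. Starting with --dry-run is a low-risk way to inspect what the wrapper would do.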

Example

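This tool is intended to be invoked when the NVPTX toolchain is targeted directly as a cross-compilation target. This makes it possible to create standalone GPU executables with normal linking semantics, similar to a standard compilation.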

clang --target=nvptx64-nvidia-cuda -march=native -flto=full input.c