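KVM GPU passthrough: group 15 is not viable. Please ensure all devices within the iommu_group are bound to their vfio bus driver.

Ask Ubuntu user
Asked 2020-02-19 22:52:12
2 answers · 19.9K views · 0 followers · 4 votes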

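I followed https://mathiashueber.com/windows-virtual-machine-gpu-passthrough-ubuntu/. However, there is one thing I did not follow: I kept nouveau instead of the official driver, because if I do what the guide says, all I see after a reboot is a black screen. Besides, I want to use nouveau on the host rather than a proprietary, possibly insecure driver.

I have a Ryzen 7 2700X on a Gigabyte B450M motherboard. I have a GTX 1060 that I want to put in a VM, and a GT 750 to use on the host.

AMD-Vi is working: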

lz@z:~$ dmesg |grep AMD-Vi
[    0.327637] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.330500] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.330501] pci 0000:00:00.2: AMD-Vi: Extended features (0xf77ef22294ada):
[    0.330504] AMD-Vi: Interrupt remapping enabled
[    0.330505] AMD-Vi: Virtual APIC enabled
[    0.330572] AMD-Vi: Lazy IO/TLB flushing enabled

Here are my IOMMU groups:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 10 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 11 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 11 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 12 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU Group 12 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU Group 12 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU Group 12 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU Group 12 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU Group 12 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU Group 12 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU Group 12 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU Group 13 01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03)
IOMMU Group 14 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
IOMMU Group 14 02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
IOMMU Group 14 02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
IOMMU Group 14 03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU Group 14 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1)
IOMMU Group 14 06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
IOMMU Group 16 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU Group 17 08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU Group 18 08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
IOMMU Group 19 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU Group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 20 09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 21 09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU Group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 7 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 8 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 9 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]

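As you can see, my GTX 1060 is in group 15, together with other things I don't care about, which can go into the VM as well. For example, the USB controller.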
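
For reference, a listing like the one above can be produced from sysfs. This is a minimal sketch and not necessarily the script used for the output shown; the function name `list_iommu_groups` and its optional root-directory argument are assumptions added for illustration.

```shell
#!/bin/sh
# Print one line per PCI device together with its IOMMU group number.
# The optional first argument (default /sys) is a hypothetical override
# so the function can be pointed at a test directory tree.
list_iommu_groups() {
    sysfs_root="${1:-/sys}"
    for dev in "$sysfs_root"/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue      # no groups found: IOMMU is probably off
        group="${dev%/devices/*}"      # .../iommu_groups/15
        group="${group##*/}"           # 15
        bdf="${dev##*/}"               # 0000:07:00.0
        if [ "$sysfs_root" = /sys ] && command -v lspci >/dev/null 2>&1; then
            # lspci -nn prints the description with [vendor:device] IDs
            printf 'IOMMU Group %s %s\n' "$group" "$(lspci -nns "$bdf")"
        else
            printf 'IOMMU Group %s %s\n' "$group" "$bdf"
        fi
    done
}
```

Calling `list_iommu_groups` with no argument reads the live system; an empty result usually means the IOMMU is not enabled.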

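I have 10de:2184 (GTX 1060) and 10de:1aeb (GTX audio). Do I need to save the IDs of the other things in group 15 as well? I am going to try passing all of them, so I also saved 10de:1aec (USB) and 10de:1aed (serial bus).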

lz@z:~$ cat /etc/initramfs-tools/modules 
# List of modules that you want to include in your initramfs.
# They will be loaded at boot time in the order below.
#
# Syntax:  module_name [args ...]
#
# You must run update-initramfs(8) to effect this change.
#
# Examples:
#
# raid1
# sd_mod
vfio vfio_iommu_type1 vfio_virqfd vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

lz@z:~$ cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

vfio vfio_iommu_type1 vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

lz@z:~$ cat /etc/modprobe.d/vfio.conf 
options vfio-pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

lz@z:~$ cat /etc/modprobe.d/kvm.conf 
options kvm ignore_msrs=1
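
The vendor:device pairs used for the `ids=` option in the files above can also be read from sysfs instead of being copied by hand. A rough sketch, assuming the standard `vendor` and `device` files under each PCI device; `group_ids` and its second argument are made-up names for illustration.

```shell
#!/bin/sh
# Print a comma-separated vendor:device list for every device in one
# IOMMU group, in the form expected by the vfio-pci "ids=" option.
group_ids() {
    group="$1"
    sysfs_root="${2:-/sys}"    # hypothetical override, for testing
    ids=""
    for dev in "$sysfs_root/kernel/iommu_groups/$group/devices"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        vendor="$(cat "$dev/vendor")"    # e.g. 0x10de
        device="$(cat "$dev/device")"    # e.g. 0x2184
        # Strip the 0x prefixes and append, comma-separated
        ids="${ids:+$ids,}${vendor#0x}:${device#0x}"
    done
    printf '%s\n' "$ids"
}
```

For example, `echo "options vfio-pci ids=$(group_ids 15)"` prints a line in the same format as the one in /etc/modprobe.d/vfio.conf.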

Now look at my lspci output after rebooting:

lz@z:~$ lspci -nnv
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
    Flags: fast devsel

00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
    Flags: fast devsel, IRQ 25
    Capabilities: 

00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 26
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7600000-f76fffff [size=1M]
    Prefetchable memory behind bridge: None
    Capabilities: 
    Kernel driver in use: pcieport

00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 27
    Bus: primary=00, secondary=02, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000efff [size=8K]
    Memory behind bridge: f4000000-f53fffff [size=20M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
    Capabilities: 
    Kernel driver in use: pcieport

00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
    I/O behind bridge: 0000f000-0000ffff [size=4K]
    Memory behind bridge: f6000000-f70fffff [size=17M]
    Prefetchable memory behind bridge: 00000000d0000000-00000000e20fffff [size=289M]
    Capabilities: 
    Kernel driver in use: pcieport

00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 29
    Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7200000-f74fffff [size=3M]
    Prefetchable memory behind bridge: None
    Capabilities: 
    Kernel driver in use: pcieport

00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 31
    Bus: primary=00, secondary=09, subordinate=09, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7500000-f75fffff [size=1M]
    Prefetchable memory behind bridge: None
    Capabilities: 
    Kernel driver in use: pcieport

00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
    Subsystem: Gigabyte Technology Co., Ltd FCH SMBus Controller [1458:5001]
    Flags: 66MHz, medium devsel
    Kernel modules: i2c_piix4, sp5100_tco

00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
    Subsystem: Gigabyte Technology Co., Ltd FCH LPC Bridge [1458:5001]
    Flags: bus master, 66MHz, medium devsel, latency 0

00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
    Flags: fast devsel

00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
    Flags: fast devsel

00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
    Flags: fast devsel

00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
    Flags: fast devsel
    Kernel driver in use: k10temp
    Kernel modules: k10temp

00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
    Flags: fast devsel

00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
    Flags: fast devsel

00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
    Flags: fast devsel

00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
    Flags: fast devsel

01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03) (prog-if 02 [NVM Express])
    Subsystem: Kingston Technology Company, Inc. Device [2646:2263]
    Flags: bus master, fast devsel, latency 0, IRQ 60, NUMA node 0
    Memory at f7600000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: 
    Kernel driver in use: nvme
    Kernel modules: nvme

02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01) (prog-if 30 [XHCI])
    Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller [1b21:1142]
    Flags: bus master, fast devsel, latency 0, IRQ 30
    Memory at f53a0000 (64-bit, non-prefetchable) [size=32K]
    Capabilities: 
    Kernel driver in use: xhci_hcd

02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01) (prog-if 01 [AHCI 1.0])
    Subsystem: ASMedia Technology Inc. 400 Series Chipset SATA Controller [1b21:1062]
    Flags: bus master, fast devsel, latency 0, IRQ 59
    Memory at f5380000 (32-bit, non-prefetchable) [size=128K]
    Expansion ROM at f5300000 [disabled] [size=512K]
    Capabilities: 
    Kernel driver in use: ahci
    Kernel modules: ahci

02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 33
    Bus: primary=02, secondary=03, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000efff [size=8K]
    Memory behind bridge: f4000000-f52fffff [size=19M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
    Capabilities: 
    Kernel driver in use: pcieport

03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    DeviceName: Broadcom 5762
    Flags: bus master, fast devsel, latency 0, IRQ 34
    Bus: primary=03, secondary=04, subordinate=04, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: None
    Prefetchable memory behind bridge: None
    Capabilities: 
    Kernel driver in use: pcieport

03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 36
    Bus: primary=03, secondary=05, subordinate=05, sec-latency=0
    I/O behind bridge: 0000e000-0000efff [size=4K]
    Memory behind bridge: f5200000-f52fffff [size=1M]
    Prefetchable memory behind bridge: 00000000f2100000-00000000f21fffff [size=1M]
    Capabilities: 
    Kernel driver in use: pcieport

03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 37
    Bus: primary=03, secondary=06, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000dfff [size=4K]
    Memory behind bridge: f4000000-f50fffff [size=17M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f1ffffff [size=160M]
    Capabilities: 
    Kernel driver in use: pcieport

05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
    Subsystem: Gigabyte Technology Co., Ltd Onboard Ethernet [1458:e000]
    Flags: bus master, fast devsel, latency 0, IRQ 35
    I/O ports at e000 [size=256]
    Memory at f5200000 (64-bit, non-prefetchable) [size=4K]
    Memory at f2100000 (64-bit, prefetchable) [size=16K]
    Capabilities: 
    Kernel driver in use: r8169
    Kernel modules: r8169

06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0825]
    Flags: bus master, fast devsel, latency 0, IRQ 86
    Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
    Memory at e8000000 (64-bit, prefetchable) [size=128M]
    Memory at f0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at d000 [size=128]
    Expansion ROM at f5000000 [disabled] [size=512K]
    Capabilities: 
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau

06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
    Subsystem: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0825]
    Flags: bus master, fast devsel, latency 0, IRQ 35
    Memory at f5080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: 
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

>>>>>>>>>>>>>>>> 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 11
    Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at e0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at f000 [size=128]
    Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: 
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

>>>>>>>>>>>>>>>> 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 83
    Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: 
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

>>>>>>>>>>>>>>>> 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 47
    Memory at e2000000 (64-bit, prefetchable) [size=256K]
    Memory at e2040000 (64-bit, prefetchable) [size=64K]
    Capabilities: 
    Kernel driver in use: xhci_hcd

>>>>>>>>>>>>>>>> 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 58
    Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: 
    Kernel driver in use: nvidia-gpu
    Kernel modules: i2c_nvidia_gpu

08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
    Flags: fast devsel
    Capabilities: 

08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
    Flags: bus master, fast devsel, latency 0, IRQ 80
    Memory at f7300000 (32-bit, non-prefetchable) [size=1M]
    Memory at f7400000 (32-bit, non-prefetchable) [size=8K]
    Capabilities: 
    Kernel driver in use: ccp
    Kernel modules: ccp

08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f] (prog-if 30 [XHCI])
    Subsystem: Gigabyte Technology Co., Ltd Zeppelin USB 3.0 Host controller [1458:5007]
    Flags: bus master, fast devsel, latency 0, IRQ 48
    Memory at f7200000 (64-bit, non-prefetchable) [size=1M]
    Capabilities: 
    Kernel driver in use: xhci_hcd

09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
    Flags: fast devsel
    Capabilities: 

09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51) (prog-if 01 [AHCI 1.0])
    Subsystem: Gigabyte Technology Co., Ltd FCH SATA Controller [AHCI mode] [1458:b002]
    Flags: bus master, fast devsel, latency 0, IRQ 63
    Memory at f7508000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: 
    Kernel driver in use: ahci
    Kernel modules: ahci

09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
    Subsystem: Gigabyte Technology Co., Ltd Family 17h (Models 00h-0fh) HD Audio Controller [1458:a182]
    Flags: bus master, fast devsel, latency 0, IRQ 85
    Memory at f7500000 (32-bit, non-prefetchable) [size=32K]
    Capabilities: 
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

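I highlighted the devices in group 15: only the NVIDIA 1060 is being used by vfio-pci, while the others are being used by other kernel modules. Is this the source of the problem? To pass through the GTX I have to pass through everything in group 15, but those other things are in use by other drivers, not vfio-pci.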

Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2020-02-19T22:48:02.001713Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002255Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002845Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003340Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003842Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.024485Z qemu-system-x86_64: -device vfio-pci,host=07:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:07:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.'

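Note this part:

Please ensure all devices within the iommu_group are bound to their vfio bus driver.

This confirms what I thought: not all of the devices are controlled by vfio-pci, even though I explicitly told them to be.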
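
A quick way to see which devices in a group are still held by a host driver is to follow each device's `driver` symlink in sysfs. The sketch below uses a hypothetical helper name, `check_group_viable`, and only flags devices bound to something other than vfio-pci. Devices with no driver at all are not flagged, since as far as I know the kernel does not count those against the group.

```shell
#!/bin/sh
# Print "<address> <driver>" for each device in one IOMMU group.
# Returns non-zero if any device is bound to a driver other than vfio-pci.
check_group_viable() {
    group="$1"
    sysfs_root="${2:-/sys}"    # hypothetical override, for testing
    rc=0
    for dev in "$sysfs_root/kernel/iommu_groups/$group/devices"/*; do
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # The symlink target's last path component is the driver name
            driver="$(basename "$(readlink "$dev/driver")")"
        else
            driver="(none)"
        fi
        printf '%s %s\n' "${dev##*/}" "$driver"
        case "$driver" in
            vfio-pci|"(none)") ;;    # acceptable
            *) rc=1 ;;               # still owned by a host driver
        esac
    done
    return "$rc"
}
```

On the system in the question, `check_group_viable 15` would be expected to list snd_hda_intel, xhci_hcd and nvidia-gpu next to 07:00.1 to 07:00.3, matching the lspci output above.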

I think the guide takes care of this in the following part, but only for the nvidia driver:

To change the load order so that vfio_pci is loaded before the nvidia driver, create the file /etc/modprobe.d/nvidia.conf in the modprobe.d folder (using sudo) and add the following lines:

    softdep nouveau pre: vfio_pci
    softdep nvidia pre: vfio_pci
    softdep nvidia* pre: vfio-pci

Is there a way to do the same thing for nouveau?

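2 Answers

Ask Ubuntu user

Accepted answer

Posted 2020-02-19 23:42:03

I found that there is a way to manually unbind the kernel module from a specific PCI device, so I wrote this small script: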

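# Run as root. Each pair detaches one function of the GPU (0000:07:00.x)
# from its current host driver and hands it to vfio-pci.

# Audio function
echo -n "0000:07:00.1" > /sys/bus/pci/drivers/snd_hda_intel/unbind
echo -n "0000:07:00.1" > /sys/bus/pci/drivers/vfio-pci/bind

# USB controller function
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/vfio-pci/bind

# Serial bus (I2C) controller function
echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind
echo -n "0000:07:00.3" > /sys/bus/pci/drivers/vfio-pci/bind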

Because of the line echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind, it hangs for a while (about 2 minutes), but when it finishes, this is the output of lspci -nnv:

07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 11
    Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at e0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at f000 [size=128]
    Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: 
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 83
    Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: 
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel

07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 46
    Memory at e2000000 (64-bit, prefetchable) [size=256K]
    Memory at e2040000 (64-bit, prefetchable) [size=64K]
    Capabilities: 
    Kernel driver in use: vfio-pci

07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 58
    Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: 
    Kernel driver in use: vfio-pci
    Kernel modules: i2c_nvidia_gpu

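As you can see, they are all using vfio now. After that I simply added the GPU in virt-manager and it worked. However, I am still investigating why the whole Ubuntu host freezes for good during the Windows 10 installation.

Update:

Manually unbinding does release the GPU, but if you have to unbind it, that means the Linux driver has already touched the GPU, so now the GPU knows it has been on Linux. When you bind it to the VM and start the VM, the GPU's Windows driver reads the GPU state and knows that someone (Linux) has messed with it before, so it refuses to work, because NVIDIA sucks.

Don't unbind manually, or at least try it, but be aware that it may not work. Instead, make sure the Linux driver never touches the GPU.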
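
One way to keep the host drivers away from the card, in the spirit of the softdep lines quoted in the question, is to declare vfio-pci as a pre-dependency of every module that lspci lists for the card's functions. This is only a sketch: the helper name `vfio_softdeps` and the file name `vfio-softdep.conf` are assumptions, and whether it is sufficient on this machine is untested. A softdep has no effect on drivers built into the kernel, which may be the case for xhci_hcd here, since lspci shows no "Kernel modules" line for the USB function.

```shell
#!/bin/sh
# Print one modprobe.d softdep line per module name given, asking
# modprobe to load vfio-pci before that module.
vfio_softdeps() {
    for module in "$@"; do
        printf 'softdep %s pre: vfio-pci\n' "$module"
    done
}

# Example (not executed here): the modules lspci listed for 07:00.0,
# 07:00.1 and 07:00.3 in the question. The target file name is made up.
#   vfio_softdeps nouveau snd_hda_intel i2c_nvidia_gpu \
#       | sudo tee /etc/modprobe.d/vfio-softdep.conf
#   sudo update-initramfs -u
```

Because vfio-pci only claims the IDs given in its `ids=` option, other devices served by the same modules (for example the host's own audio) should still be picked up by their usual drivers afterwards.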

2 votes

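Ask Ubuntu user

Posted 2021-04-23 15:09:36

"I highlighted the devices in group 15: only the NVIDIA GTX 1060 is used by vfio, while the others are used by other kernel modules. Is this the source of the problem? To pass through the GTX I have to pass through everything in group 15, but these other things are being used by other drivers, not vfio."

Probably yes, but at least 3 out of 4 should be taken by vfio.

On my machine with a gtx2070 installed, these are the following:

  1. VGA compatible controller,
  2. Audio device,
  3. Serial bus controller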
lspci -knn

GPU slot 1 GT 710
0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 710B] [10de:128b] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] GK208B [GeForce GT 710] [1462:8c93]
        Kernel driver in use: nvidia
        Kernel modules: nvidia
0b:00.1 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] GK208 HDMI/DP Audio Controller [1462:8c93]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

GPU slot 2 gtx2070
0c:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e84] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidia
0c:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
0c:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
0c:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:4008]
        Kernel driver in use: vfio-pci

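When setting up my machine, I also followed the instructions at https://mathiashueber.com/.

  • I installed two GPUs in my machine. In the first slot (the one used by my Linux host) I put a low-power nvidia card. In the second slot I installed the gtx2070 that should be passed to the VM.
  • I installed the virtualization software: sudo apt install ovmf virt-manager qemu-kvm
  • I activated IOMMU in the BIOS (vt/vt etc.) and added the following line in Grub: GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 kvm.ignore_msrs=1 video=vesafb:off,efifb:off disable_idle_d3=1"
  • I made sure my GPU is in its own group: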
IOMMU Group 29 0c:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e84] (rev a1)
IOMMU Group 29 0c:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
IOMMU Group 29 0c:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
IOMMU Group 29 0c:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)

  • Then I added the IDs that I want vfio-pci to bind at boot.
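# Edit the initramfs module list and add the vfio_pci line below to it:
sudo nano /etc/initramfs-tools/modules
#   vfio_pci ids=10de:1e84,10de:10f8,10de:1ad8,10de:1ad9

# Rebuild the initramfs for all installed kernels:
sudo update-initramfs -u -k all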

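  • After this, reboot once and check with lspci -knn whether it worked. As you can see from my output above, it did not work for ID 10de:1ad8. But luckily, that is not a problem. Even though the 0c:00.2 USB controller was not taken by vfio-pci, my win10 VM still works perfectly.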
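
After the reboot it can also be worth confirming that the Grub parameters from the steps above actually reached the kernel. A minimal sketch, assuming the Grub configuration was regenerated (for example with update-grub) after editing; `cmdline_has` and the `CMDLINE_FILE` variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Return 0 if every given parameter appears on the kernel command line
# as a whole word; otherwise report the first missing one and return 1.
cmdline_has() {
    cmdline_file="${CMDLINE_FILE:-/proc/cmdline}"    # hypothetical override, for testing
    for param in "$@"; do
        # Put each space-separated word on its own line, then match exactly
        tr -s ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param" || {
            printf 'missing kernel parameter: %s\n' "$param" >&2
            return 1
        }
    done
}
```

For the setup described here, `cmdline_has amd_iommu=on iommu=pt` would be the check to run; the exact-word match keeps `iommu=pt` from being satisfied by a longer parameter that merely ends in the same text.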

The software versions and kernel I use are as follows:

qemu-system-x86_64 --version
QEMU emulator version 5.0.0 (Debian 1:5.0-14~bpo10+1)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
uname -r
5.7.0-0.bpo.2-amd64

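The whole topic of how to pass through a GPU successfully is very complicated. Many different problems can occur.

Here are some of my experiences:

  1. I remember that with Lubuntu 16.04 on a Gigabyte GA-P55-UD7, I had problems binding my gtx970 to pci-stub at boot, so I had to do it manually with the bind/unbind commands, just as you did. (Blacklist Nvidia GPU that will be used for qemu/kvm passthrough)
  2. With my new machine, a ROG X570-F Gaming running Debian buster (as shown above), I can have my primary card (gt710) taken by the nvidia driver during boot and my gtx2070 taken by vfio.
  3. With another machine, an ASRockRack EPYC3251D4I-2T running Debian, I did run into big problems when trying to pass my gtx970 through to a Windows guest. To work around them I had to copy a script and run it in the background (see https://www.reddit.com/r/Amd/comments/7gp1z7/threadripper_kvm_gpu_passthru_testers_needed/ )

As for why you are having these problems, I can't tell you. Maybe outdated software? The distribution you use and how it handles module loading? BIOS or motherboard firmware that was not programmed correctly by the manufacturer? Perhaps there is a BIOS update available?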

1 vote
Original page content provided by Ask Ubuntu. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://askubuntu.com/questions/1211666
