
Help me set up PCI device passthrough in OpenNebula

 



Hi everyone.

I asked in the OpenNebula Telegram group and got no reply. I posted on the official OpenNebula forum; one person answered, but that hasn't helped either. Google turns up nothing. The only doc I found didn't help even though I followed it step by step: https://docs.opennebula.io/6.0/open_cluster_deployment/kvm_node/pci_passthrough.html

OpenNebula 6.

OpenNebula host
OS: CentOS 8.3
Kernel: Linux 4.18.0-240.22.1.el8_3.x86_64
IP: 192.168.10.171

Installed components:
yum -y install opennebula opennebula-sunstone opennebula-fireedge opennebula-gate opennebula-flow opennebula-provision

KVM host (added to OpenNebula)

OS: CentOS 8.3
Kernel: Linux 4.18.0-240.22.1.el8_3.x86_64
IP: 192.168.10.169

Installed components:
yum -y install opennebula-node-kvm

Goal: get OpenNebula to see the NVIDIA RTX 2080 Ti.

My steps on the KVM host:

[root@kvm-gpu-test ~]# lspci -nn | grep -i nvidia
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] [10de:1e07] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
01:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
[root@kvm-gpu-test ~]# lspci -vs 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: eVga.com. Corp. Device 2489
        Flags: bus master, fast devsel, latency 0, IRQ 141
        Memory at 95000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 80000000 (64-bit, prefetchable) [size=256M]
        Memory at 90000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 6000 [size=128]
        Expansion ROM at 96000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Resizable BAR <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau

First I bound the card to vfio-pci:

vi /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1e07,10de:10f7,10de:1ad6,10de:1ad7

echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

grubby --update-kernel=ALL --args="rd.driver.blacklist=nouveau nouveau.modeset=0"
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)
echo 'blacklist nouveau' > /etc/modprobe.d/nouveau-blacklist.conf

reboot
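
A few quick sanity checks after the reboot (a sketch using the PCI address from the lspci output above):

cat /proc/cmdline                                     # rd.driver.blacklist=nouveau should be present
lsmod | grep vfio                                     # vfio_pci / vfio modules should be loaded
readlink /sys/bus/pci/devices/0000:01:00.0/driver     # should end in .../vfio-pci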

After the reboot, the kernel driver in use changed to vfio-pci:

[root@kvm-gpu-test ~]# lspci -vs 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: eVga.com. Corp. Device 2489
        Flags: fast devsel, IRQ 11
        Memory at 95000000 (32-bit, non-prefetchable) [disabled] [size=16M]
        Memory at 80000000 (64-bit, prefetchable) [disabled] [size=256M]
        Memory at 90000000 (64-bit, prefetchable) [disabled] [size=32M]
        I/O ports at 6000 [disabled] [size=128]
        Expansion ROM at 96000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Resizable BAR <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
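
Only function 01:00.0 is shown here; the other functions of the card should be bound the same way. A quick loop to check all four (same addresses as above):

for f in 01:00.0 01:00.1 01:00.2 01:00.3; do
    echo "== $f =="
    lspci -nnk -s "$f" | grep -iE 'nvidia|driver in use'
done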

Then I looked at the IOMMU groups and allowed the GPU's VFIO group in /etc/libvirt/qemu.conf on the KVM host:

[root@kvm-gpu-test ~]# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/7/devices/0000:00:17.0
/sys/kernel/iommu_groups/15/devices/0000:03:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:15.1
/sys/kernel/iommu_groups/5/devices/0000:00:15.0
/sys/kernel/iommu_groups/13/devices/0000:00:1f.0
/sys/kernel/iommu_groups/13/devices/0000:00:1f.5
/sys/kernel/iommu_groups/13/devices/0000:00:1f.4
/sys/kernel/iommu_groups/3/devices/0000:00:12.0
/sys/kernel/iommu_groups/11/devices/0000:00:1c.1
**/sys/kernel/iommu_groups/1/devices/0000:00:01.0**
**/sys/kernel/iommu_groups/1/devices/0000:01:00.2**
**/sys/kernel/iommu_groups/1/devices/0000:01:00.0**
**/sys/kernel/iommu_groups/1/devices/0000:01:00.3**
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/8/devices/0000:00:1b.0
/sys/kernel/iommu_groups/16/devices/0000:06:00.0
/sys/kernel/iommu_groups/16/devices/0000:05:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:16.4
/sys/kernel/iommu_groups/6/devices/0000:00:16.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.2
/sys/kernel/iommu_groups/4/devices/0000:00:14.0
/sys/kernel/iommu_groups/12/devices/0000:00:1e.0
/sys/kernel/iommu_groups/2/devices/0000:00:08.0
/sys/kernel/iommu_groups/10/devices/0000:00:1c.0
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:1b.5
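
All four functions of the card sit in IOMMU group 1 (together with the root port 00:01.0), which is why /dev/vfio/1 is the group node libvirt needs access to. Listing that group with device names:

for d in /sys/kernel/iommu_groups/1/devices/*; do
    lspci -nns "${d##*/}"
done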

vi /etc/libvirt/qemu.conf
cgroup_device_acl = [
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
    "/dev/rtc","/dev/hpet", "/dev/vfio/vfio",
    "/dev/vfio/1"
]
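
After changing qemu.conf, the group node should exist and libvirtd has to pick the setting up (the reboot below also takes care of that):

ls -l /dev/vfio/            # /dev/vfio/1 and /dev/vfio/vfio should be present
systemctl restart libvirtd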

Then, on the OpenNebula host, I edited the PCI monitoring probe config:

vi /var/lib/one/remotes/etc/im/kvm-probes.d/pci.conf

# This option specifies the main filters for PCI card monitoring. The format
# is the same as used by lspci to filter on PCI card by vendor:device(:class)
# identification. Several filters can be added as a list, or separated
# by commas. The NULL filter will retrieve all PCI cards.
#
# From lspci help:
#     -d [<vendor>]:[<device>][:<class>]
#            Show only devices with specified vendor, device and  class  ID.
#            The  ID's  are given in hexadecimal and may be omitted or given
#            as "*", both meaning "any value"#
#
# For example:
#   :filter:
#     - '10de:*'      # all NVIDIA VGA cards
#     - '10de:11bf'   # only GK104GL [GRID K2]
#     - '*:10d3'      # only 82574L Gigabit Network cards
#     - '8086::0c03'  # only Intel USB controllers
#
# or
#
#   :filter: '*:*'    # all devices
#
# or
#
#   :filter: '0:0'    # no devices
#
:filter:
  - '10de:*'

# The PCI cards list restricted by the :filter option above can be even more
# filtered by the list of exact PCI addresses (bus:device.func).
#
# For example:
#   :short_address:
#     - '07:00.0'
#     - '06:00.0'
#
:short_address: []

# The PCI cards list restricted by the :filter option above can be even more
# filtered by matching the device name against the list of regular expression
# case-insensitive patterns.
#
# For example:
#   :device_name:
#     - 'Virtual Function'
#     - 'Gigabit Network'
#     - 'USB.*Host Controller'
#     - '^MegaRAID'
#
:device_name:
  - 'VGA'

reboot
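
The ':filter:' above uses the same vendor:device syntax as lspci -d, so it can be tested directly on the KVM node; the four NVIDIA functions should come back:

lspci -nn -d '10de:*'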

After rebooting the KVM host, I triggered a sync from the OpenNebula host:

[oneadmin@on-test ~]$ onehost sync -f
* Adding 192.168.10.169 to upgrade
[========================================] 1/1 192.168.10.169
All hosts updated successfully.
[oneadmin@on-test ~]$ onehost forceupdate
All hosts updated successfully.
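
To make sure the edited pci.conf actually reached the node after the sync, it can also be checked under the remotes directory on the KVM host (assuming the default remotes location /var/tmp/one):

ssh oneadmin@192.168.10.169 'grep -A3 "^:filter:" /var/tmp/one/etc/im/kvm-probes.d/pci.conf'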

Checking:

[root@on-test ~]# onehost show 1
HOST 1 INFORMATION
ID                    : 1
NAME                  : 192.168.10.169
CLUSTER               : default
STATE                 : MONITORED
IM_MAD                : kvm
VM_MAD                : kvm
LAST MONITORING TIME  : 05/09 17:16:35

HOST SHARES
RUNNING VMS           : 0
MEMORY
  TOTAL               : 31.1G
  TOTAL +/- RESERVED  : 31.1G
  USED (REAL)         : 388.8M
  USED (ALLOCATED)    : 0K
CPU
  TOTAL               : 400
  TOTAL +/- RESERVED  : 400
  USED (REAL)         : 8
  USED (ALLOCATED)    : 0

MONITORING INFORMATION
ARCH="x86_64"
CPUSPEED="4433"
HOSTNAME="kvm-gpu-test"
HYPERVISOR="kvm"
IM_MAD="kvm"
KVM_CPU_MODEL="Skylake-Client-IBRS"
KVM_CPU_MODELS="486 pentium pentium2 pentium3 pentiumpro coreduo n270 core2duo qemu32 kvm32 cpu64-rhel5 cpu64-rhel6 qemu64 kvm64 Conroe Penryn Nehalem Nehalem-IBRS Westmere Westmere-IBRS SandyBridge SandyBridge-IBRS IvyBridge IvyBridge-IBRS Haswell-noTSX Haswell-noTSX-IBRS Haswell Haswell-IBRS Broadwell-noTSX Broadwell-noTSX-IBRS Broadwell Broadwell-IBRS Skylake-Client Skylake-Client-IBRS Skylake-Client-noTSX-IBRS Skylake-Server Skylake-Server-IBRS Skylake-Server-noTSX-IBRS Cascadelake-Server Cascadelake-Server-noTSX Icelake-Client Icelake-Client-noTSX Icelake-Server Icelake-Server-noTSX Cooperlake athlon phenom Opteron_G1 Opteron_G2 Opteron_G3 Opteron_G4 Opteron_G5 EPYC EPYC-IBPB Dhyana"
KVM_MACHINES="pc-i440fx-rhel7.6.0 pc pc-i440fx-rhel7.0.0 pc-q35-rhel7.6.0 pc-i440fx-rhel7.5.0 pc-q35-rhel8.2.0 q35 pc-i440fx-rhel7.1.0 pc-i440fx-rhel7.2.0 pc-q35-rhel7.3.0 pc-q35-rhel7.4.0 pc-i440fx-rhel7.3.0 pc-q35-rhel8.0.0 pc-i440fx-rhel7.4.0 pc-q35-rhel8.1.0 pc-q35-rhel7.5.0"
MODELNAME="Intel(R) Xeon(R) E-2224 CPU @ 3.40GHz"
RESERVED_CPU=""
RESERVED_MEM=""
VERSION="6.0.0.1"
VM_MAD="kvm"

NUMA NODES

  ID CORES    USED FREE
   0 - - - -  0    4

NUMA MEMORY

 NODE_ID TOTAL    USED_REAL            USED_ALLOCATED       FREE
       0 31G      0K                   0K                   0K

NUMA HUGEPAGES

 NODE_ID SIZE     TOTAL    FREE     USED
       0 2M       0        0        0
       0 1024M    0        0        0

WILD VIRTUAL MACHINES

NAME                                                      IMPORT_ID  CPU     MEMORY

VIRTUAL MACHINES

  ID USER     GROUP    NAME                                                                                                           STAT  CPU     MEM HOST                                                                             TIME
[root@on-test ~]#

A PCI DEVICES section roughly like this should have appeared in the output:

PCI DEVICES

   VM ADDR    TYPE           NAME
      00:00.0 8086:0a04:0600 Haswell-ULT DRAM Controller
      00:02.0 8086:0a16:0300 Haswell-ULT Integrated Graphics Controller
  123 00:03.0 8086:0a0c:0403 Haswell-ULT HD Audio Controller
      00:14.0 8086:9c31:0c03 8 Series USB xHCI HC
      00:16.0 8086:9c3a:0780 8 Series HECI #0
      00:1b.0 8086:9c20:0403 8 Series HD Audio Controller
      00:1c.0 8086:9c10:0604 8 Series PCI Express Root Port 1
      00:1c.2 8086:9c14:0604 8 Series PCI Express Root Port 3
      00:1d.0 8086:9c26:0c03 8 Series USB EHCI #1
      00:1f.0 8086:9c43:0601 8 Series LPC Controller
      00:1f.2 8086:9c03:0106 8 Series SATA Controller 1 [AHCI mode]
      00:1f.3 8086:9c22:0c05 8 Series SMBus Controller
      02:00.0 8086:08b1:0280 Wireless 7260

But as you can see, there is nothing there :(
