ConnectX-3 VF not working on Windows is caused by the old in-box mlx4 driver of PVE

It seems that ConnectX-3 SR-IOV fails on a Windows 10 VM on Proxmox VE 6.3 because PVE's in-box mlx4 driver is quite old. To be exact, it is the driver bundled with the Linux kernel.

With that driver, the Windows VM recognises the virtual functions and they appear to work, but in fact they don't: the device shows a “Code 43” status. PVE logs messages like the following:

mlx4_core 0000:01:00.0: vhcr command:0x43 slave:1 failed with error:0, status -22
mlx4_core 0000:01:00.0: Received reset from slave:1
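
A minimal sketch for spotting these messages on the hypervisor, assuming the kernel log can be read with dmesg. The function name mlx4_vf_failures is hypothetical; it only filters log text given on standard input and keeps the lines where a VF command failed.

```shell
# Print mlx4_core VF (slave) command failures found in kernel log text on stdin.
# The pattern is based only on the two message shapes quoted in this post.
mlx4_vf_failures() {
    grep -E 'mlx4_core [0-9a-f:.]+: vhcr command.* failed with error'
}

# Usage on the hypervisor (assumption: dmesg is readable by the current user):
#   dmesg | mlx4_vf_failures
```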

I looked through the “Mellanox OFED for Linux Archived Bug Fixes” document and found that Internal Ref 1178129 closely resembled my situation.

Description: Fixed an issue that prevented Windows virtual machines running over MLNX_OFED Linux hypervisors from operating ConnectX-3 IB ports.

When such failures occurred, the following message (or similar) appeared in the Linux HV message log when users attempted to start up a Windows VM running a ConnectX-3 VF:

“mlx4_core 0000:81:00.0: vhcr command 0x1a slave:1 in_param 0x793000 in_mod=0x210 op_mod=0x0 failed with error:0, status -22”

One difference is that the description mentions “IB ports” whereas my environment uses Ethernet ports, but the symptoms are much the same. The issue was fixed in MLNX_OFED v4.2-1.2.0.0. The latest driver is v4.9-2.2.4.0 LTS as of January 10, 2021, so it should have been solved years ago.

I didn't understand why I ran into the problem until I found that the Linux in-box driver corresponds to v4.0. What the hell… Someone filed a bug report on Kernel.org's Bugzilla, but nobody seems to care.

I had no choice but to build the latest driver myself, and then the ConnectX-3 VF worked easily. Come on, I just fell into a pitfall!

However, it is not a perfect solution: sometimes no traffic flows even though the VF seems to be working. I guess this is caused by something going wrong during device initialisation. If it happens, restart the VM several times until the connection comes up. Once the device works, it seems to keep working as long as the VM is running.

Another thing that puzzles me is that the Sent/Received byte counters in Windows' network status dialog are weird. The received bytes counter is always zero even though the network is actually communicating. Some applications seem to think the network is down when the VF is used. For example, iTunes fails to connect to the Gracenote server because of “no network.” I'm not sure whether the two are related. With virtio-net everything works without these problems, so my network itself is not at fault.

There is little information about SR-IOV on the Internet, so I'm in the fog.

(Updated: 2021-12-07)

I read the Known Issues in the WinOF v5.50.5400 release notes and found Internal Ref 1297888, which is exactly the “packet counter” issue. So the issue is a Windows driver bug.

Its workaround is listed as N/A, so I have to wait for a fix, but I wonder whether it will ever come. The ConnectX-3 series is already in its LTS phase anyway. I wish I could get a ConnectX-4 at a low price.

permalink · 2021-02-10 19:05 · Decomo · 0 Comments · Tags:

The project to build a stout router with SR-IOV is now starting up!

So I tried to create a pfSense 2.4.5-p1 VM on Proxmox VE 6.3-2 with ConnectX-3 VFs via PCI passthrough, but it didn't recognise them properly…😇

Errors such as “pcib1: failed to allocate initial I/O port window: 0xd000-0xdfff” and “pcib1: Failed to allocate interrupt for PCI-e events” are recorded in dmesg, and the VFs aren't listed by the pciconf -lv command. I guess that means device probing fails.

pfSense 2.4 is based on FreeBSD 11.3-RELEASE. That is old-ish, so I tried the newer pfSense 2.5, which is under development and based on FreeBSD 12-STABLE.

That system recognises the VFs properly and creates mlxen devices, although the same errors as before still occur. I don't know why the two devices are identified as mlxen0 and mlxen2 even though I pass through consecutively numbered VFs. It's weird, but I'll settle for this for now.

It seems the mlx4 modules are built into the kernel of both pfSense 2.4.5 and 2.5.

A router is out of the question if its NIC doesn't work, so I decided to use the development version of pfSense 2.5. No doubt a stable pfSense 2.5 will be released at some point.

permalink · 2021-02-02 20:59 · Decomo · 0 Comments · Tags:

Since around FreeBSD 12.0-RELEASE, loader.efi has been used as the UEFI bootloader instead of the earlier boot1.efi.

Both loaders can boot a FreeBSD system from a ZFS or UFS file system. However, loader.efi only looks for the boot file system on the disk it was loaded from, whereas boot1.efi searches its own disk and other disks as well. In short, the current loader.efi can't boot a system from another disk. Well, it can if we manually set the boot device at the loader prompt each time, but that is not realistic.

I assumed there must be some way to make it boot from another disk automatically, so I read some documents and googled, but found nothing. After reluctantly reading the source code of loader.efi, I found a way using a loader.env file and a rootdev variable.

Whether loader.efi or boot1.efi is used, the value of the currdev variable is eventually used as the boot target. In loader.efi, currdev is unconditionally set from the rootdev variable if it exists.

And it seems rootdev can be set through the /efi/freebsd/loader.env file in the ESP. This feature was added relatively recently; FreeBSD 12.2-RELEASE and later support it.

Add the following line to the file to specify the ZFS dataset that holds the root directory. For UFS the value looks like disk0p1. The trailing colon is not a mistake; it's required!

rootdev=zfs:zroot/ROOT/default:

The loader.env file and the rootdev variable are undocumented as of 9th January 2021, so they may change in the future, and this information is provided as-is.

Of course, I simply use the old boot1.efi instead, which needs no such chore.
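
A minimal sketch of writing that entry, assuming the ESP has already been mounted by hand. The function name set_loader_rootdev and the mount point /mnt/esp are hypothetical; only the file path /efi/freebsd/loader.env and the rootdev line come from this post.

```shell
# Write a rootdev entry into <esp>/efi/freebsd/loader.env.
# $1: mount point of the ESP (assumption: already mounted, e.g. /mnt/esp)
# $2: value for rootdev, e.g. "zfs:zroot/ROOT/default:"
set_loader_rootdev() {
    esp=$1
    rootdev=$2
    envfile="$esp/efi/freebsd/loader.env"
    mkdir -p "$esp/efi/freebsd" || return 1
    # Keep every existing line except an old rootdev entry.
    if [ -f "$envfile" ]; then
        grep -v '^rootdev=' "$envfile" > "$envfile.tmp" || true
    else
        : > "$envfile.tmp"
    fi
    # Append the new entry and swap the file into place.
    printf 'rootdev=%s\n' "$rootdev" >> "$envfile.tmp"
    mv "$envfile.tmp" "$envfile"
}

# Usage (mount point is an assumption):
#   set_loader_rootdev /mnt/esp "zfs:zroot/ROOT/default:"
```

Running it twice replaces the previous rootdev line rather than adding a second one, so the file keeps a single entry.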

permalink · 2021-01-10 17:52 · Decomo · 4 Comments · Tags:

FreeBSD
PC

The boot sequence of FreeBSD (x64) in a UEFI environment is generally understood to be as follows. This is the official process described in the manpages.

UEFI:/EFI/BOOT/BOOTX64.EFI
- The UEFI bootloader executed when the system starts up.
boot1.efi (man)
- The 1st stage bootloader, which looks for a freebsd-zfs or freebsd-ufs partition and boots the next stage. The search starts from the device the loader was loaded from and proceeds to the others in the order of the UEFI boot order setting.
The final stage bootloader: loader.efi (man)
- A bootloader that loads the FreeBSD kernel from the storage specified by the currdev or loaddev environment variable.
kernel

The UEFI firmware starts BOOTX64.EFI in the EFI system partition. Next it runs boot1.efi, then loader.efi is launched, and finally the kernel boots. Actually the process is supposed to go BOOTX64.EFI → loader.efi → kernel, because boot1.efi is copied as BOOTX64.EFI.

The reason I'm being vague is that, as you know, the implementation differs from the manpages' description. It looks like loader.efi has been used as BOOTX64.EFI since FreeBSD 12.0-RELEASE. (Corresponding commit.) These changes still seem to be a work in progress, and there is a moderately improved patch for generating an ESP.

You can confirm that BOOTX64.EFI and loader.efi have the same hash value by mounting the ESP generated by the FreeBSD 12.2-RELEASE installer.

So in fact the final stage bootloader runs directly, skipping the 1st stage.

This is no problem in most environments, but it probably can't boot a system placed on another storage device, because loader.efi seems to look for system partitions only on the storage it was loaded from. That's a fine how-do-you-do!

The workaround is either to replace BOOTX64.EFI with boot1.efi, or to keep loader.efi as the EFI bootloader and, at the loader prompt, set currdev and load the kernel and zfs.ko manually. The former is clearly easier.

According to the source code of loader.efi, it looks like the loader can boot the system in an undocumented way. I will try it later.
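
A minimal sketch of that check. Instead of comparing hash values it compares the two files byte by byte with cmp, which answers the same question. The function name same_loader and the paths in the usage comment are assumptions, not something the installer defines.

```shell
# Report whether two bootloader images are byte-identical.
# $1, $2: paths of the files to compare
same_loader() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Usage (ESP mount point and paths are assumptions):
#   same_loader /mnt/esp/EFI/BOOT/BOOTX64.EFI /boot/loader.efi
```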
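
The former workaround could be sketched as below, assuming the ESP is mounted by hand and a boot1.efi image is available on the system. The function name use_boot1_as_default, the mount point and the source path are hypothetical; the sketch keeps a copy of the original BOOTX64.EFI so the change can be undone.

```shell
# Replace the ESP's default bootloader with boot1.efi, keeping a backup.
# $1: boot1.efi to install (assumption: e.g. /boot/boot1.efi)
# $2: ESP mount point (assumption: mounted by hand, e.g. /mnt/esp)
use_boot1_as_default() {
    boot1=$1
    target="$2/EFI/BOOT/BOOTX64.EFI"
    # Refuse to continue if either file is missing.
    [ -f "$boot1" ] || { echo "missing: $boot1" >&2; return 1; }
    [ -f "$target" ] || { echo "missing: $target" >&2; return 1; }
    # Back up the current loader first, then overwrite it.
    cp "$target" "$target.orig" && cp "$boot1" "$target"
}

# Usage (paths are assumptions):
#   use_boot1_as_default /boot/boot1.efi /mnt/esp
```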

permalink · 2021-01-08 08:59 · Decomo · 0 Comments · Tags:

FreeBSD
PC

FreeBSD's cp command doesn't copy extended attributes at all. I checked this on FreeBSD 12.0-RELEASE-p4. So, at this time, extended attributes attached to a file are lost when the file is copied with cp on FreeBSD.

Anyone who thinks “Just use the -p option, right?” is naive. As the manpage says, that option only preserves the file's mtime, atime, flags, permissions, ACL, UID and GID.

For instance, here is a file hasxattr.txt that has two extended attributes (OpusMetaInformation and DOSATTRIB).

$ ls -al
total 42
drwxr-xr-x   2 Decomo  Decomo    3 Oct 22 00:11 .
drwxr-xr-x  56 Decomo  Decomo  190 Oct 21 23:43 ..
-rwxr--r--   1 Decomo  Decomo    0 Oct 21 23:49 hasxattr.txt
$ lsextattr user *
hasxattr.txt    OpusMetaInformation     DOSATTRIB

Then, copy the file with cp -p.

$ cp -p hasxattr.txt hasxattr_cp-p.txt

…and check extended attributes.

$ lsextattr user *
hasxattr.txt    OpusMetaInformation     DOSATTRIB
hasxattr_cp-p.txt

＼(^o^)／ < They've gone.

The destination file's extended attributes had indeed been lost when I checked for the color label in the Directory Opus file manager. (OpusMetaInformation stores the label information.)

As an EA lover, it's a pity.

I tried GNU cp provided by the sysutils/coreutils port, but its reply was heartless…

$ gcp --preserve=xattr hasxattr.txt hasxattr_gcp.txt
gcp: cannot preserve extended attributes, cp is built without xattr support

It seems that rsync -X and mv preserve extended file attributes.

$ rsync -X hasxattr.txt hasxattr_rsync.txt
$ mv hasxattr.txt hasxattr_mv.txt
$ lsextattr user *
hasxattr_cp-p.txt
hasxattr_mv.txt OpusMetaInformation     DOSATTRIB
hasxattr_rsync.txt      OpusMetaInformation     DOSATTRIB
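
To pick out the copies that lost their attributes from a listing like this one, a small filter can help. This is a sketch that assumes lsextattr separates the file name and each attribute name with a tab; the function name files_without_xattr is hypothetical.

```shell
# Read `lsextattr user FILE...` output on stdin and print the names of
# files that have no extended attributes listed.
# Assumption: fields are tab-separated, with the file name first.
files_without_xattr() {
    awk -F '\t' 'NF < 2 || $2 == "" { print $1 }'
}

# Usage:
#   lsextattr user * | files_without_xattr
```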

I found a wonderful page that describes how well each command supports extended attributes on FreeBSD, Linux and macOS: Extended attributes: the good, the not so good, the bad.

At any rate, I wish the coreutils port offered an option to choose whether to handle extended attributes.

permalink · 2020-12-24 19:18 · Decomo · 0 Comments · Tags: