Peripheral Component Interconnect (PCI) is an interface that provides connectivity between the control field-programmable gate array (FPGA) and the Routing Engine (RE) on Juniper devices. Any discrepancy on the PCI bus can hinder the normal functioning of the Routing Engine or Control Board (CB).
Because this alarm is raised with Major severity, clear it by following the procedure detailed in this article.
The following alarm is seen on the device:
user@host> show chassis alarms
1 alarms currently active
Alarm time Class Description
2018-04-04 06:09:13 PDT Major Host 0 PCI Device missing 0x111d:0x8090
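The two hexadecimal values at the end of the alarm description follow the usual PCI vendor ID:device ID convention; 0x111d is the vendor ID registered to Integrated Device Technology (IDT). As a minimal sketch, assuming the `show chassis alarms` output has already been captured as text, the Python below extracts those fields. The function name and the regular expression are illustrative and are not part of Junos OS.

```python
import re

# Matches the alarm description shown above, for example:
# "Host 0 PCI Device missing 0x111d:0x8090"
ALARM_RE = re.compile(
    r"Host (?P<host>\d+) PCI Device missing "
    r"(?P<vendor>0x[0-9a-fA-F]{4}):(?P<device>0x[0-9a-fA-F]{4})"
)


def find_missing_pci_devices(alarm_output):
    """Return one dict per 'PCI Device missing' alarm found in the text."""
    found = []
    for line in alarm_output.splitlines():
        match = ALARM_RE.search(line)
        if match:
            found.append({
                "host": int(match.group("host")),
                "vendor_id": int(match.group("vendor"), 16),
                "device_id": int(match.group("device"), 16),
            })
    return found
```

An empty list means no such alarm was found in the captured text; it does not by itself prove that the device is healthy.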
Error Logs
Apr 4 06:06:57.880 craftd[11453]: Major alarm cleared, Host 0 PCI Device missing 0x111d:0x8090
Apr 4 06:08:55.126 kernel: Creating PCI Scan thread
Apr 4 06:08:55.126 kernel: pcidev module loaded, 0 (null)
Apr 4 06:08:55.126 kernel: pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Apr 4 06:08:55.126 kernel: pci0: <ACPI PCI bus> on pcib0
Apr 4 06:08:55.126 kernel: isab0: <PCI-ISA bridge> at device 1.0 on pci0
Apr 4 06:08:55.126 kernel: atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc260-0xc26f at device 1.1 on pci0
Apr 4 06:08:55.126 kernel: ata0: <ATA channel> at channel 0 on atapci0
Apr 4 06:08:55.126 kernel: ata1: <ATA channel> at channel 1 on atapci0
Apr 4 06:08:55.126 kernel: ichsmb0: <SMBus controller> port 0xc1e0-0xc1ff mem 0xfead0000-0xfead0fff irq 11 at device 2.0 on pci0
Apr 4 06:08:55.126 kernel: pci0: <network, ethernet> at device 4.0 (no driver attached)
Apr 4 06:08:55.126 kernel: xhci0: <Intel Wellsburg USB 3.0 controller> mem 0xfeac0000-0xfeacffff irq 10 at device 5.0 on pci0
Apr 4 06:08:55.126 kernel: ixlv0: <Intel(R) Ethernet Connection XL710 VF Driver, Version - 1.2.6> mem 0xfebd0000-0xfebdffff,0xfebf0000-0xfebf3fff at device 6.0 on pci0
<chassisd>
Apr 4 06:09:13.580 chassisd[4752]: PCI device with vendor id 0xxxxd and device id 0xxxx0 is missing
Apr 4 06:09:13.580 chassisd[4752]: PCI device with vendor id 0xxxxd and device id 0xxxx0 is missing
Several conditions can trigger this alarm:
- A recent upgrade of the RE
- A recent reboot of the RE
- Master RE and backup RE running different Junos OS versions on dual-RE devices
- A hardware issue with the RE
- A firmware issue on the RE
To troubleshoot and resolve this issue, perform the following steps:
1. Verify the RE hardware details:
user@host> show chassis hardware detail no-forwarding
Hardware inventory:
Item              Version  Part number  Serial number  Description
<snip>
Routing Engine 0  REV 17   750-054758   ZZZZZZ36       RE-S-2X00x6
  vtbd0  17408 MB  Virtio Block Disk
  vtbd1  15360 MB  Virtio Block Disk
  ada0   511 MB    QEMU HARDDISK  QZZZZZ2  Emulated IDE Disk
  usb0 (addr 0.1)  XHCI root HUB 0  zzzz86  uhub0
Routing Engine 1  REV 17   750-054758   ZZZZZZZ2       RE-S-2X00x6
  vtbd0  17408 MB  Virtio Block Disk
  vtbd1  15360 MB  Virtio Block Disk
  ada0   511 MB    QEMU HARDDISK  ZZZZZ02  Emulated IDE Disk
  usb0 (addr 0.1)  XHCI root HUB 0  ZZZZZ6  uhub0
On dual-RE devices, verify that both REs are running the same Junos OS version:
user@host> show version invoke-on all-routing-engines
re0:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
re1:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
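The version comparison above can be sketched in Python. This is a minimal sketch, assuming the output of `show version invoke-on all-routing-engines` has been saved as text and that each RE section starts with a label such as `re0:` followed by a `Junos:` line, as in the sample. The function names are hypothetical.

```python
import re


def junos_versions_by_re(show_version_output):
    """Map each RE label (for example 're0') to the Junos version under it."""
    versions = {}
    current_re = None
    for raw_line in show_version_output.splitlines():
        line = raw_line.strip()
        header = re.fullmatch(r"(re\d+):", line)
        if header:
            current_re = header.group(1)
            continue
        junos = re.match(r"Junos:\s*(\S+)", line)
        if junos and current_re is not None:
            versions[current_re] = junos.group(1)
    return versions


def versions_match(versions):
    """True when at least one RE was parsed and all report the same version."""
    return bool(versions) and len(set(versions.values())) == 1
```

A `False` result on a dual-RE device points to the version mismatch listed among the possible causes.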
2. Similarly, verify the vmhost version on the Routing Engines:
user@host> show vmhost version invoke-on all-routing-engines
3. Verify the PCI status from the shell prompt:
user@host> start shell user root
user@host:# vhclient -s
Last login: Mon Jan 25 10:42:39 AST 2018 from local-node on pts/1
You have mail.
user@host:~# lspci
00:00.0 Host bridge: Intel Corporation Haswell-E DMI2 (rev 02)
00:01.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 1 (rev 02)
00:02.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 2 (rev 02)
00:02.2 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 2 (rev 02)
00:03.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.1 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.2 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.3 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
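A listing like this can also be scanned programmatically for the bridge named in the alarm, which `lspci` reports as an Integrated Device Technology [IDT] Device 8090 entry. The following is a minimal Python sketch, assuming the `lspci` output has been captured as text; the function name and the default marker string are illustrative choices.

```python
def idt_bridge_detected(lspci_output, marker="[IDT] Device 8090"):
    """True if any lspci line is a PCI bridge entry containing the marker."""
    return any(
        "PCI bridge:" in line and marker in line
        for line in lspci_output.splitlines()
    )
```

The truncated sample shown in this article contains only Intel entries, so the check returns `False` for it; a complete listing from a healthy RE would be expected to contain the IDT line.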
The lspci output should include the following line, which confirms that the PCI bridge is detected:
"PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)"
4. Check whether the RE has been upgraded to the latest firmware:
user@host> show system firmware
Part Type Tag Current version Available version Status
CB 0 CB FPGA 0 22.0.0 22.0 OK
CB 1 CB FPGA 0 0.0.0 22.0 OK
Routing Engine 0 RE BIOS 0 0.56 0.53 OK
Routing Engine 0 RE FPGA 1 41.0.0 41.0 OK
Routing Engine 0 RE SSD1 3 0.0.0 OK
Routing Engine 0 RE SSD2 3 0.0.0 OK
Routing Engine 1 0 1 OK
Note: Contact Support before considering a firmware upgrade for your device.
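To compare a Current version value against an Available version value, a small helper can be used. This is a rough Python sketch under two assumptions that the article does not confirm: versions are dotted numeric strings, and missing trailing components count as zero (so `22.0.0` and `22.0` are treated as equal). The function names are hypothetical, and the result is only a hint to discuss with Support, not a decision to upgrade.

```python
def parse_version(text):
    """Turn a dotted version string such as '22.0.0' into (22, 0, 0)."""
    return tuple(int(part) for part in text.split("."))


def compare_firmware(current, available):
    """Return 'older', 'same', or 'newer' for current relative to available."""
    cur = parse_version(current)
    avail = parse_version(available)
    # Pad the shorter tuple with zeros so '22.0' compares equal to '22.0.0'.
    width = max(len(cur), len(avail))
    cur += (0,) * (width - len(cur))
    avail += (0,) * (width - len(avail))
    if cur < avail:
        return "older"
    if cur > avail:
        return "newer"
    return "same"
```

Applied to the sample, CB 0 (22.0.0 against 22.0) compares as the same version, while the RE BIOS (0.56 against 0.53) is newer than the available version.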
5. Power cycle the RE:
- Power cycling removes and restores power to the FRU and reinitializes all of its associated hardware. If the issue is transient, this step should resolve it. If the RE that you are re-seating is the master, switch mastership before performing this step.
- On a single-RE device, this process takes the device down. If you are running critical services, consider diverting traffic through another device before performing this step.
6. Power cycle the CB:
- Power cycling removes and restores power to the FRU and reinitializes all of its associated hardware. If the issue is transient, this step should resolve it. If the CB that you are re-seating is the master, switch mastership before performing this step.
- On a single-RE device, this process takes the device down. If you are running critical services, consider diverting traffic through another device before performing this step.
If the above steps do not resolve the error, there could be a hardware issue with the RE or CB, and the component might need to be replaced. Contact JTAC for further assistance.