[Junos] Major "Host x PCI Device missing" alarm and how to troubleshoot it

Article ID: KB35914  KB Last Updated: 12 Jun 2020  Version: 1.0
Summary:

Peripheral Component Interconnect (PCI) is an interface that provides connectivity between the control field-programmable gate array (FPGA) and the Routing Engine (RE) on Juniper devices. Any discrepancy in the PCI interface can hinder the normal functioning of the Routing Engine.

Because this alarm is raised with Major severity, clear it by following the procedure detailed in this article.

Symptoms:

The following alarm is displayed on the device:

user@host> show chassis alarms
1 alarms currently active
Alarm time               Class  Description
2018-04-04 06:09:13 PDT  Major  Host 0 PCI Device missing 0x111d:0x8090
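
The two hexadecimal values at the end of the alarm description correspond to the PCI vendor ID and device ID that chassisd reports as missing. As a minimal sketch, assuming the alarm text keeps the format shown above, captured `show chassis alarms` output could be scanned for these fields as follows. The function name and the regular expression are illustrative assumptions, not part of Junos.

```python
import re

# Pattern for the alarm description shown in this article (format is assumed).
ALARM_RE = re.compile(
    r"Host (?P<host>\d+) PCI Device missing "
    r"(?P<vendor>0x[0-9a-fA-F]{4}):(?P<device>0x[0-9a-fA-F]{4})"
)


def parse_pci_missing_alarms(output):
    """Return a list of (host, vendor_id, device_id) tuples found in the text."""
    results = []
    for match in ALARM_RE.finditer(output):
        results.append(
            (int(match.group("host")), match.group("vendor"), match.group("device"))
        )
    return results


# Sample text copied from the alarm output in this article.
sample = """\
user@host> show chassis alarms
1 alarms currently active
Alarm time               Class  Description
2018-04-04 06:09:13 PDT  Major  Host 0 PCI Device missing 0x111d:0x8090
"""

print(parse_pci_missing_alarms(sample))
```

An empty list means no alarm of this type was found in the captured text.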

Error Logs

Apr  4 06:06:57.880   craftd[11453]: Major alarm cleared, Host 0 PCI Device missing 0x111d:0x8090
Apr  4 06:08:55.126   kernel: Creating PCI Scan thread
Apr  4 06:08:55.126   kernel: pcidev module loaded, 0 (null)
Apr  4 06:08:55.126   kernel: pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Apr  4 06:08:55.126   kernel: pci0: <ACPI PCI bus> on pcib0
Apr  4 06:08:55.126   kernel: isab0: <PCI-ISA bridge> at device 1.0 on pci0
Apr  4 06:08:55.126   kernel: atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc260-0xc26f at device 1.1 on pci0
Apr  4 06:08:55.126   kernel: ata0: <ATA channel> at channel 0 on atapci0
Apr  4 06:08:55.126   kernel: ata1: <ATA channel> at channel 1 on atapci0
Apr  4 06:08:55.126   kernel: ichsmb0: <SMBus controller> port 0xc1e0-0xc1ff mem 0xfead0000-0xfead0fff irq 11 at device 2.0 on pci0
Apr  4 06:08:55.126   kernel: pci0: <network, ethernet> at device 4.0 (no driver attached)
Apr  4 06:08:55.126   kernel: xhci0: <Intel Wellsburg USB 3.0 controller> mem 0xfeac0000-0xfeacffff irq 10 at device 5.0 on pci0
Apr  4 06:08:55.126   kernel: ixlv0: <Intel(R) Ethernet Connection XL710 VF Driver, Version - 1.2.6> mem 0xfebd0000-0xfebdffff,0xfebf0000-0xfebf3fff at device 6.0 on pci0

<chassisd> 

Apr  4 06:09:13.580  chassisd[4752]: PCI device with vendor id 0xxxxd and device id0xxxx0 is missing
Apr  4 06:09:13.580  chassisd[4752]: PCI device with vendor id 0xxxxd and device id 0xxxx0 is missing

 

Cause:

Any of the following can trigger this alarm:

  • A recent upgrade of the RE

  • A recent reboot of the RE

  • The master RE and backup RE running different Junos OS versions on dual-RE devices

  • A hardware issue with the RE


Solution:

To troubleshoot and resolve this issue, perform the following steps:

  1. Verify the RE hardware details.

user@host> show chassis hardware detail no-forwarding
 
Hardware inventory:
Item             Version  Part number  Serial number     Description
<snip>
Routing Engine 0 REV 17   750-054758   ZZZZZZ36          RE-S-2X00x6
  vtbd0 17408 MB                                         Virtio Block Disk
  vtbd1 15360 MB                                         Virtio Block Disk
  ada0    511 MB  QEMU HARDDISK        QZZZZZ2           Emulated IDE Disk
  usb0 (addr 0.1) XHCI root HUB 0      zzzz86            uhub0
Routing Engine 1 REV 17   750-054758   ZZZZZZZ2          RE-S-2X00x6
  vtbd0 17408 MB                                         Virtio Block Disk
  vtbd1 15360 MB                                         Virtio Block Disk
  ada0    511 MB  QEMU HARDDISK        ZZZZZ02           Emulated IDE Disk
  usb0 (addr 0.1) XHCI root HUB 0      ZZZZZ6            uhub0

On dual-RE devices, verify that both REs are running the same Junos OS version:

user@host> show version invoke-on all-routing-engines
re0:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
re1:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
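
On a device with two REs, the comparison above can also be scripted. The following is a minimal sketch, assuming the output has been saved as text and follows the `re0:` / `Junos:` layout shown above. The function names are hypothetical.

```python
import re


def junos_versions_by_re(output):
    """Map each RE label (for example 're0') to the Junos version listed under it."""
    versions = {}
    current_re = None
    for raw_line in output.splitlines():
        # Remove zero-width spaces left by copy and paste, then trim whitespace.
        line = raw_line.replace("\u200b", "").strip()
        header = re.fullmatch(r"(re\d+):", line)
        if header:
            current_re = header.group(1)
            continue
        version = re.fullmatch(r"Junos:\s*(\S+)", line)
        if version and current_re is not None:
            versions[current_re] = version.group(1)
    return versions


def versions_match(versions):
    """True when all listed REs report the same version (or fewer than two were found)."""
    return len(set(versions.values())) <= 1


# Sample text copied from the output in this article.
sample = """\
re0:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
re1:
--------------------------------------------------------------------------
Hostname: XX
Model: mx480
Junos: 17.4R2.4
"""

found = junos_versions_by_re(sample)
print(found, versions_match(found))
```

A `False` result from `versions_match` points to the version mismatch listed under Cause.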

  2. Similarly, verify the vmhost version on the Routing Engines:

user@host> show vmhost version invoke-on all-routing-engines

  3. Verify the PCI status from the shell prompt:

user@host> start shell user root
user@host:# vhclient -s
Last login: Mon Jan 25 10:42:39 AST 2018 from local-node on pts/1
You have mail.
user@host:~# lspci
00:00.0 Host bridge: Intel Corporation Haswell-E DMI2 (rev 02)
00:01.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 1 (rev 02)
00:02.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 2 (rev 02)
00:02.2 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 2 (rev 02)
00:03.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.1 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.2 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)
00:03.3 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 3 (rev 02)

The lspci output should include the following line, which confirms that the PCI bridge is being detected:

"PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)"
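
The check for this line can be automated on saved lspci output. The following is a minimal sketch, assuming the text format quoted above. The function name is hypothetical, and the bus address `03:00.0` in the second sample is invented for illustration only.

```python
def idt_bridge_detected(lspci_output, device_id="8090"):
    """Return True if any lspci line reports the IDT PCI bridge with the given device ID."""
    for line in lspci_output.splitlines():
        if "PCI bridge" in line and "[IDT]" in line and f"Device {device_id}" in line:
            return True
    return False


# Lines copied from the lspci output in this article; the IDT bridge is absent.
bridge_missing = """\
00:00.0 Host bridge: Intel Corporation Haswell-E DMI2 (rev 02)
00:01.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 1 (rev 02)
00:02.0 PCI bridge: Intel Corporation Haswell-E PCI Express Root Port 2 (rev 02)
"""

# Same lines plus the expected IDT entry (bus address is an assumption).
bridge_present = bridge_missing + (
    "03:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 8090 (rev 02)\n"
)

print(idt_bridge_detected(bridge_missing))
print(idt_bridge_detected(bridge_present))
```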

  4. Verify whether the RE is running the latest firmware:

user@host> show system firmware

Part             Type           Tag Current   Available Status
                                    version   version
CB 0             CB FPGA        0   22.0.0    22.0      OK
CB 1             CB FPGA        0   0.0.0     22.0      OK
Routing Engine 0 RE BIOS        0   0.56      0.53      OK
Routing Engine 0 RE FPGA        1   41.0.0    41.0      OK
Routing Engine 0 RE SSD1        3   0.0.0               OK
Routing Engine 0 RE SSD2        3   0.0.0               OK
Routing Engine 1                0   1                   OK

Note: Contact Support before considering a firmware upgrade for your device.

  5. Power cycle the RE:

  • Power cycling restores power to the FRU and reinitializes all of its associated hardware components. If the issue is transient, this step should help resolve it. If the RE that you are power cycling is the master, switch mastership before performing this step.

  • On a single-RE device, power cycling the RE takes the device down. If the device carries critical services, divert traffic through another device before performing this step.

If the above steps do not resolve the error, the RE might have a hardware issue and might need to be replaced.

 
