OSG Troubleshooting - Advertising Incorrect Cluster Information


In OSG, the information currently advertised for a cluster via BDII/MDS2 is often incorrect. Without modification, GIP collects information only from the Computing Element (CE) and assumes that the CE is identical to the nodes within the cluster. This is frequently not the case. Resource Selection and Resource Brokering then use the incorrect information to perform match-making.

As an example, assume my CE is a Pentium4 2.4 with 1 GB of RAM. Further assume each of the 20 nodes in my cluster is dual-socket with dual-core Opteron 2.0 processors (four cores per node) and 8 GB of RAM. In its current state, GIP would advertise 20 Pentium4 2.4s with 1 GB of RAM each, which is not correct.
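To make the size of the mismatch concrete, here is a small sketch that totals what would be advertised when every node is assumed to match the CE, and what the cluster actually offers. The HostSpec class and advertised_totals function are hypothetical names used only for illustration, and the single core assigned to the CE is an assumption, since the example above does not state it.

```python
from dataclasses import dataclass


@dataclass
class HostSpec:
    """Hypothetical description of one machine (illustration only)."""
    cpu_model: str
    cores: int
    ram_gb: int


def advertised_totals(spec: HostSpec, node_count: int) -> dict:
    """Totals a broker would see if every node were assumed to match `spec`."""
    return {
        "cpu_model": spec.cpu_model,
        "total_cores": spec.cores * node_count,
        "total_ram_gb": spec.ram_gb * node_count,
    }


# Assumption: the Pentium4 CE is treated as a single-core machine.
ce = HostSpec("Pentium4 2.4", cores=1, ram_gb=1)
# Two sockets x two cores per socket = four cores per worker node.
worker = HostSpec("Opteron 2.0", cores=4, ram_gb=8)
nodes = 20

wrong = advertised_totals(ce, nodes)      # what is advertised today
right = advertised_totals(worker, nodes)  # what the cluster really has

print("advertised:", wrong)
print("actual:    ", right)
```

Under these assumptions the advertised view understates the cluster by a factor of four in cores and a factor of eight in memory, and it also reports the wrong processor model.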

Proposed Solutions

  1. Add a dynamic plugin to collect the information from a node within the cluster
  2. Add the node information to config_gip in some way, possibly by
    1. collecting the information automatically
    2. allowing the administrator to input the information
    3. a combination of the above?
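The proposals above could be combined roughly as follows. This is only a sketch, not GIP's actual plugin interface: it assumes the worker nodes run Linux and expose /proc/cpuinfo and /proc/meminfo, and the function names and attribute keys are hypothetical. How the text is retrieved from a worker node, and how the keys map to the attributes GIP publishes, are left open. Sample text stands in for the real files so the sketch runs anywhere.

```python
import re


def parse_cpuinfo(text):
    """Extract CPU model, clock speed, and logical CPU count from /proc/cpuinfo text."""
    models = re.findall(r"^model name\s*:\s*(.+)$", text, re.MULTILINE)
    mhz = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", text, re.MULTILINE)
    return {
        "model": models[0].strip() if models else "unknown",
        "clock_mhz": int(float(mhz[0])) if mhz else 0,
        "logical_cpus": len(models),  # one "model name" entry per logical CPU
    }


def parse_meminfo(text):
    """Return total RAM in MB from /proc/meminfo text (MemTotal is given in kB)."""
    match = re.search(r"^MemTotal:\s*(\d+)\s*kB", text, re.MULTILINE)
    return int(match.group(1)) // 1024 if match else 0


def node_attributes(cpuinfo, meminfo, overrides=None):
    """Collect node attributes automatically, then apply administrator overrides."""
    cpu = parse_cpuinfo(cpuinfo)
    attrs = {
        # Hypothetical keys; not the names GIP or the schema actually uses.
        "processor_model": cpu["model"],
        "processor_clock_mhz": cpu["clock_mhz"],
        "logical_cpus": cpu["logical_cpus"],
        "ram_mb": parse_meminfo(meminfo),
    }
    attrs.update(overrides or {})  # administrator-supplied values win
    return attrs


# Sample text standing in for a worker node's files (made-up values).
SAMPLE_CPUINFO = "".join(
    f"processor\t: {i}\nmodel name\t: Example Opteron 2.0\ncpu MHz\t\t: 2000.000\n\n"
    for i in range(4)
)
SAMPLE_MEMINFO = "MemTotal:        8388608 kB\nMemFree:         1024 kB\n"

collected = node_attributes(SAMPLE_CPUINFO, SAMPLE_MEMINFO)
corrected = node_attributes(SAMPLE_CPUINFO, SAMPLE_MEMINFO, {"ram_mb": 4096})

for key, value in collected.items():
    print(f"{key}: {value}")
```

Applying the overrides last means automatic collection supplies the defaults while the administrator can still correct any single value, which is one way to read the "combination" option.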