Cobbler
Warning
This documentation is for internal use. It may be of interest to users who are curious about our internal processes and architecture, but should not be mistaken for describing services that we offer or stable infrastructure that end users should rely upon. If you find yourself submitting a ticket about something on this page, you are probably making a mistake.
HPCCF uses cobbler for provisioning and managing internal DNS.
There is a cobbler server per cluster as well as one for the public HPC VLAN.
- cobbler.hpc: public HPC VLAN
- cobbler.hive: hive private and management VLANs
- cobbler.farm: farm
- cobbler.peloton: peloton
- cobbler.franklin: franklin
hpc1, hpc2, and lssc0 do not have associated cobbler servers.
Add a new host
Hive
Anything written as **something** (bold text) must be replaced with the correct per-host value.
Initial setup
Pay attention to the output from the scripts. Much of it is verbose output from cobbler, but the rest may contain critical information.
Configure the BMC first.
./cobbler-add-from-netbox.sh "$HOSTNAME" bmc
Then the main Ethernet iface.
If this is a host with a 25G Mellanox Ethernet card that we have not been able to PXE boot from, you need to add: -I InstallationEthDeviceFromNetbox
./cobbler-add-from-netbox.sh -I eth**N** "$HOSTNAME" eth**N**
If this is not a standard hive node disk configuration (a pair of smaller NVMe drives and one larger NVMe drive), you will need to add: -d name_of_disk_layout_template
Available options in: hpccf-cobbler-templates/partitions/
Pay attention to the information at the end of the output. It includes the installation hostname, IP, BMC hostname, BMC IP, etc.
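The two invocations above can be collected into a small dry-run helper that only prints the command lines so they can be reviewed before anything is run. This is a minimal sketch, not part of the HPCCF tooling: the function name `print_cobbler_add_cmds` is hypothetical, and placing -d next to -I before the hostname is an assumption based on the -I example above.

```shell
# Hypothetical helper: prints (does not run) the cobbler-add-from-netbox.sh
# command lines for one host.
# Usage: print_cobbler_add_cmds HOST IFACE [INSTALL_IFACE] [DISK_TEMPLATE]
#   INSTALL_IFACE  value for -I (only for hosts that cannot PXE boot from the 25G card)
#   DISK_TEMPLATE  value for -d (only for non-standard disk layouts)
# Pass "" for INSTALL_IFACE to skip -I while still giving a DISK_TEMPLATE.
print_cobbler_add_cmds() (
    host=${1:-}
    iface=${2:-}
    install_iface=${3:-}
    disk_template=${4:-}

    if [ -z "$host" ] || [ -z "$iface" ]; then
        echo "usage: print_cobbler_add_cmds HOST IFACE [INSTALL_IFACE] [DISK_TEMPLATE]" >&2
        return 1
    fi

    # Assumption: optional flags go before the hostname, as in the -I example.
    opts=""
    if [ -n "$install_iface" ]; then
        opts="$opts -I $install_iface"
    fi
    if [ -n "$disk_template" ]; then
        opts="$opts -d $disk_template"
    fi

    # BMC first, then the main Ethernet interface.
    printf './cobbler-add-from-netbox.sh %s bmc\n' "$host"
    printf './cobbler-add-from-netbox.sh%s %s %s\n' "$opts" "$host" "$iface"
)
```

Printing instead of executing keeps the sketch safe to try anywhere; copy the printed lines into a shell on the cobbler server once they look right.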
Reset the system, then watch the installation either in the BMC or over SSH: ssh installer@$HOSTNAME**(-install)?**
Wait for installation to finish
Find production Ethernet MAC
ssh root@$HOSTNAME**(-install)?**
Look at ethtool '*' and ifconfig eth**N** to figure out which one is the >1G Ethernet connection that will be used in production.
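As a possible shortcut for the check above, the kernel exposes each interface's negotiated speed (in Mb/s) and MAC address under /sys/class/net. The sketch below lists interfaces faster than 1000 Mb/s together with their MACs. It is an illustration rather than the documented procedure: the function name `list_fast_ifaces` is hypothetical, and interfaces whose link is down report no usable speed and are skipped, so confirm the result with ethtool.

```shell
# Hypothetical helper: prints "IFACE SPEED_MBPS MAC" for every interface
# whose reported link speed is greater than 1000 Mb/s.
# The optional argument is the sysfs net directory (default /sys/class/net);
# it is a parameter only so the function can be tried against a fake tree.
list_fast_ifaces() (
    netdir=${1:-/sys/class/net}
    for dir in "$netdir"/*; do
        [ -d "$dir" ] || continue
        iface=${dir##*/}
        # Reading speed fails (or yields -1) when the link is down or the
        # device has no notion of speed, e.g. lo.
        speed=$(cat "$dir/speed" 2>/dev/null) || continue
        mac=$(cat "$dir/address" 2>/dev/null) || continue
        case $speed in
            ''|*[!0-9]*) continue ;;  # skip empty, negative, or non-numeric values
        esac
        if [ "$speed" -gt 1000 ]; then
            printf '%s %s %s\n' "$iface" "$speed" "$mac"
        fi
    done
)
```

Run `list_fast_ifaces` on the freshly installed host; the MAC in the last column is the value needed for the Puppet step.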
Puppet work
ssh puppet.hpc
- Create: data/nodes/**HOSTNAME**.hive.hpc.ucdavis.edu.yaml
- Contents: infiniband::network::netplan::ethmac: **10G::mac::addr:here**
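The file described above could be generated with a helper like the following, run from the root of the Puppet repository checkout. This is a sketch under assumptions: the function name `write_node_yaml` is hypothetical, the file is assumed to contain only this one key, and the MAC is written in double quotes as a precaution (YAML 1.1 parsers can read an unquoted all-decimal value such as 10:20:30:40:50:55 as a base-60 integer). If existing node files in the repository use a different style, follow those instead.

```shell
# Hypothetical helper: writes the per-node YAML file for a Hive host.
# Usage: write_node_yaml SHORT_HOSTNAME MAC [REPO_ROOT]
# REPO_ROOT defaults to the current directory and must already contain data/nodes/.
write_node_yaml() (
    host=${1:-}
    mac=${2:-}
    root=${3:-.}

    if [ -z "$host" ] || [ -z "$mac" ]; then
        echo "usage: write_node_yaml SHORT_HOSTNAME MAC [REPO_ROOT]" >&2
        return 1
    fi

    # Accept only six colon-separated hex octets.
    if ! printf '%s\n' "$mac" | grep -Eqi '^([0-9a-f]{2}:){5}[0-9a-f]{2}$'; then
        echo "not a valid MAC address: $mac" >&2
        return 1
    fi

    file="$root/data/nodes/$host.hive.hpc.ucdavis.edu.yaml"
    if [ -e "$file" ]; then
        echo "refusing to overwrite existing file: $file" >&2
        return 1
    fi

    # Assumption: quoting the value; see the note on YAML 1.1 above.
    printf 'infiniband::network::netplan::ethmac: "%s"\n' "$mac" > "$file" || return 1
    echo "wrote $file"
)
```

The helper refuses to overwrite an existing file, so re-running it for a host that already has node data is harmless.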
Contact Omen or Camille for review and push. Note that this change goes live on Hive without a push.
Verify
ssh $USER@$HOSTNAME
Verify everything looks okay
Check Monitoring
Verify no alerts:
https://monitoring.hpc.ucdavis.edu/icingaweb2/monitoring/host/services?host=**fqdn**
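To avoid hand-editing the URL, it can be built from the short hostname. A minimal sketch: the function name `icinga_services_url` is hypothetical, and it assumes the node's FQDN is the short hostname followed by .hive.hpc.ucdavis.edu, matching the Puppet node file name used earlier.

```shell
# Hypothetical helper: prints the Icinga services URL for a Hive node.
# Assumption: FQDN = SHORT_HOSTNAME.hive.hpc.ucdavis.edu
# Usage: icinga_services_url SHORT_HOSTNAME
icinga_services_url() (
    host=${1:-}
    if [ -z "$host" ]; then
        echo "usage: icinga_services_url SHORT_HOSTNAME" >&2
        return 1
    fi
    fqdn="$host.hive.hpc.ucdavis.edu"
    printf 'https://monitoring.hpc.ucdavis.edu/icingaweb2/monitoring/host/services?host=%s\n' "$fqdn"
)
```

Open the printed URL in a browser and check that no services are in a warning or critical state.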
Add to Slurm.
Rejoice.