<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.sternwarte.uni-erlangen.de/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Reinmann</id>
	<title>Remeis-Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://www.sternwarte.uni-erlangen.de/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Reinmann"/>
	<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php/Special:Contributions/Reinmann"/>
	<updated>2026-04-06T02:30:28Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.7</generator>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3868</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3868"/>
		<updated>2025-09-10T07:45:57Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* OS setup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE that are used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* &amp;lt;s&amp;gt;Make a list of UEFI settings&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Prepare LUNs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Use ZFS zvols as storage backend&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Find out which backend type is the best (file, block, SCSI passthrough)&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Create a separate dataset for easier zfs setting propagation&amp;lt;/s&amp;gt;&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** &amp;lt;s&amp;gt;Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&amp;lt;/s&amp;gt;&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
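The zvol naming scheme and the dataset settings from the list above can be sketched as a dry run. This is only an illustration and not the content of &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt;: the function names and the size value are assumptions, and the commands are printed instead of executed. Note that &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; applies to ZFS file systems; the corresponding property for zvols is &amp;lt;code&amp;gt;volblocksize&amp;lt;/code&amp;gt;, which can only be set when the zvol is created.&lt;br /&gt;

```shell
#!/bin/sh
# Dry run: print one "zfs create" command per node instead of executing it.
# PARENT is the dataset named on this page; SIZE is a placeholder value.
PARENT="pool/iscsi"
SIZE="80G"

# Print the zvol name for chassis $1 (1..20) and node $2 (1..4).
zvol_name() {
    printf 'zvol-%02d-%d\n' "$1" "$2"
}

# Print the commands for all 20 chassis with 4 nodes each.
print_zvol_commands() {
    chassis=1
    while [ "$chassis" -le 20 ]; do
        node=1
        while [ "$node" -le 4 ]; do
            echo "zfs create -V $SIZE -o compression=lz4 $PARENT/$(zvol_name "$chassis" "$node")"
            node=$((node + 1))
        done
        chassis=$((chassis + 1))
    done
}

print_zvol_commands
```

Printing the commands first makes it possible to review all 80 names before anything is created on the pool.&lt;br /&gt;
&lt;br /&gt;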
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
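As a rough illustration of what such a root-path value looks like: RFC 4173 defines the format as &amp;lt;code&amp;gt;iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME&amp;lt;/code&amp;gt;, where empty protocol, port and LUN fields fall back to their defaults (TCP, port 3260, LUN 0). The helper below is a sketch; the function name, the server address and the target name are made up and are not the values used on our DHCP server.&lt;br /&gt;

```shell
#!/bin/sh
# Build an iSCSI root-path string: iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME
# The protocol field is left empty; PORT ($3) and LUN ($4) are optional and
# stay empty when not given, which means "use the default".
iscsi_root_path() {
    server=$1
    target=$2
    port=${3:-}
    lun=${4:-}
    printf 'iscsi:%s::%s:%s:%s\n' "$server" "$port" "$lun" "$target"
}

# 192.0.2.10 is a documentation address and the target name is hypothetical.
iscsi_root_path 192.0.2.10 iqn.2001-04.com.example:storage.meggie-01-1
```

The resulting string is what the DHCP server would hand out in option 17 for one node.&lt;br /&gt;
&lt;br /&gt;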
==== grub ====&lt;br /&gt;
As of now we don't know why grub even works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub somehow also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into a firmware table, the iSCSI Boot Firmware Table (iBFT). Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted; therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI hook needs to run after the network is available, which means it needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Of course the script needs to be made executable. After that the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script is called during the boot process the block device exists for the kernel to mount the root filesystem and continue the boot process as normal.&lt;br /&gt;
&lt;br /&gt;
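The installation steps described above can be sketched as a small helper. This is a sketch under assumptions: the file name &amp;lt;code&amp;gt;iscsi&amp;lt;/code&amp;gt; for the hook and the function name are hypothetical, and the optional second argument only exists so that the copy step can be tried in a scratch directory first.&lt;br /&gt;

```shell
#!/bin/sh
# Copy a hook script into the initramfs-tools tree below ROOT and make it
# executable. ROOT ($2) defaults to "" (the real system); the file name
# "iscsi" is a hypothetical choice.
install_iscsi_hook() {
    src=$1
    root=${2:-}
    dest="$root/etc/initramfs-tools/scripts/local-top/iscsi"
    mkdir -p "$(dirname "$dest")" || return 1
    cp "$src" "$dest" || return 1
    chmod 755 "$dest" || return 1
    echo "$dest"
}

# On a real node (as root) this would be followed by:
#   update-initramfs -u
#   lsinitramfs /boot/initrd.img-$(uname -r) | grep local-top/iscsi
```

The &amp;lt;code&amp;gt;lsinitramfs&amp;lt;/code&amp;gt; call in the comment is one way to check that the hook actually ended up in the regenerated ramdisk.&lt;br /&gt;
&lt;br /&gt;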
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process because Ubuntu by default doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the entire setup runs off LUNs on the network, we can potentially skip the installation completely by just preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; folder in the home directory of the ubuntuamdin contains the scripts for this. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
With &amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; everything can be reverted.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
Once created, the images shouldn't be touched unless nodes are failing and we want to delete unnecessary images.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
&lt;br /&gt;
Enter the BIOS menu by pressing F2.&lt;br /&gt;
&lt;br /&gt;
In the settings, first RESET everything to default.&lt;br /&gt;
Then make the following changes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Main&lt;br /&gt;
  --&amp;gt; Quiet Boot: Disabled&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; Processor Configuration&lt;br /&gt;
    --&amp;gt; Intel Hyper Threading: Disabled&lt;br /&gt;
  --&amp;gt; Power &amp;amp; Performance&lt;br /&gt;
    --&amp;gt; CPU Power &amp;amp; Performance: Performance&lt;br /&gt;
    --&amp;gt; CPU HWPM State Control&lt;br /&gt;
      --&amp;gt; Enable CPU HWPM: HWPM Native Mode&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; NIC Configuration&lt;br /&gt;
      --&amp;gt; NIC Port 2: Disabled&lt;br /&gt;
Server Management&lt;br /&gt;
  --&amp;gt; Clear System Event Log: Clear it here&lt;br /&gt;
Advanced Boot Options&lt;br /&gt;
  --&amp;gt; System Boot Timeout: 10&lt;br /&gt;
  --&amp;gt; Boot Mode: UEFI&lt;br /&gt;
  --&amp;gt; Boot Option Retry: Enabled&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
REBOOT the system (e.g. with CTRL + ALT + DEL).&lt;br /&gt;
Then change the following in the BIOS settings:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; UEFI Network Stack&lt;br /&gt;
      --&amp;gt; IPv6 PXE Support: Disabled&lt;br /&gt;
    --&amp;gt; UEFI Option ROM Control&lt;br /&gt;
      --&amp;gt; IPv4 Network Configuration&lt;br /&gt;
        --&amp;gt; Configured: [x]&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
    --&amp;gt; iSCSI Configuration&lt;br /&gt;
      --&amp;gt; iSCSI Initiator Name: Format is: iqn.1886.de.sternwarte.iscsi:meggie-x-y &lt;br /&gt;
                                where x: chassis number (01 is the top chassis, 20 the lowest); &lt;br /&gt;
                                      y: |-----|-----| position of the node within the chassis (4 nodes per chassis).&lt;br /&gt;
                                         |  1  |  2  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         |  3  |  4  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         e.g. iqn.1886.de.sternwarte.iscsi:meggie-11-1 for the node in chassis 11, position 1&lt;br /&gt;
    --&amp;gt; Add an Attempt&lt;br /&gt;
      --&amp;gt; MAC-address&lt;br /&gt;
        --&amp;gt; iSCSI Mode: Enabled&lt;br /&gt;
        --&amp;gt; Connection Retry Count: 2&lt;br /&gt;
        --&amp;gt; Connection Establishing Timeout: 5000&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
        --&amp;gt; Get target info via DHCP: [x]&lt;br /&gt;
        --&amp;gt; Authentication Type: None&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
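To avoid typos when entering the initiator names by hand, the naming scheme above can be generated with a small helper. This is a minimal sketch: the function name is hypothetical, and the limits (20 chassis, 4 nodes each) are taken from the description on this page. Pass plain integers without a leading zero.&lt;br /&gt;

```shell
#!/bin/sh
# Print the initiator name for chassis $1 (1..20) and node position $2 (1..4).
# Returns 1 without printing anything if a value is out of range.
initiator_name() {
    chassis=$1
    node=$2
    if [ "$chassis" -lt 1 ] || [ "$chassis" -gt 20 ]; then return 1; fi
    if [ "$node" -lt 1 ] || [ "$node" -gt 4 ]; then return 1; fi
    printf 'iqn.1886.de.sternwarte.iscsi:meggie-%02d-%d\n' "$chassis" "$node"
}

initiator_name 11 1
```

For chassis 11 and position 1 this prints the same name as the example in the settings above.&lt;br /&gt;
&lt;br /&gt;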
&lt;br /&gt;
REBOOT and press F6 to enter the boot menu:&lt;br /&gt;
&lt;br /&gt;
Go to the IP-based boot option (should be the last of the three). &lt;br /&gt;
Select partition:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Ubuntu 20.4 puppet iSCSI autopartition --DO NOT USE IF NO IDEA--&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Be careful: other versions, such as blank, also exist.&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3867</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3867"/>
		<updated>2025-09-10T07:45:33Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* OS setup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE that are used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** &amp;lt;s&amp;gt;Switch malfunction&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;iSCSI server reboot&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Make a list of UEFI settings&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Prepare LUNs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Use ZFS zvols as storage backend&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Find out which backend type is the best (file, block, SCSI passthrough)&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Create a separate dataset for easier zfs setting propagation&amp;lt;/s&amp;gt;&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** &amp;lt;s&amp;gt;Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&amp;lt;/s&amp;gt;&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know why grub even works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub somehow also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into a firmware table, the iSCSI Boot Firmware Table (iBFT). Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted; therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI hook needs to run after the network is available, which means it needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Of course the script needs to be made executable. After that the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script is called during the boot process the block device exists for the kernel to mount the root filesystem and continue the boot process as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process because Ubuntu by default doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the entire setup runs off LUNs on the network, we can potentially skip the installation completely by just preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; folder in the home directory of the ubuntuamdin contains the scripts for this. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
With &amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; everything can be reverted.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
Once created, the images shouldn't be touched unless nodes are failing and we want to delete unnecessary images.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
&lt;br /&gt;
Enter the BIOS menu by pressing F2.&lt;br /&gt;
&lt;br /&gt;
In the settings, first RESET everything to default.&lt;br /&gt;
Then make the following changes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Main&lt;br /&gt;
  --&amp;gt; Quiet Boot: Disabled&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; Processor Configuration&lt;br /&gt;
    --&amp;gt; Intel Hyper Threading: Disabled&lt;br /&gt;
  --&amp;gt; Power &amp;amp; Performance&lt;br /&gt;
    --&amp;gt; CPU Power &amp;amp; Performance: Performance&lt;br /&gt;
    --&amp;gt; CPU HWPM State Control&lt;br /&gt;
      --&amp;gt; Enable CPU HWPM: HWPM Native Mode&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; NIC Configuration&lt;br /&gt;
      --&amp;gt; NIC Port 2: Disabled&lt;br /&gt;
Server Management&lt;br /&gt;
  --&amp;gt; Clear System Event Log: Clear it here&lt;br /&gt;
Advanced Boot Options&lt;br /&gt;
  --&amp;gt; System Boot Timeout: 10&lt;br /&gt;
  --&amp;gt; Boot Mode: UEFI&lt;br /&gt;
  --&amp;gt; Boot Option Retry: Enabled&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
REBOOT the system (e.g. with CTRL + ALT + DEL).&lt;br /&gt;
Then change the following in the BIOS settings:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; UEFI Network Stack&lt;br /&gt;
      --&amp;gt; IPv6 PXE Support: Disabled&lt;br /&gt;
    --&amp;gt; UEFI Option ROM Control&lt;br /&gt;
      --&amp;gt; IPv4 Network Configuration&lt;br /&gt;
        --&amp;gt; Configured: [x]&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
    --&amp;gt; iSCSI Configuration&lt;br /&gt;
      --&amp;gt; iSCSI Initiator Name: Format is: iqn.1886.de.sternwarte.iscsi:meggie-x-y &lt;br /&gt;
                                where x: chassis number (01 is the top chassis, 20 the lowest); &lt;br /&gt;
                                      y: |-----|-----| position of the node within the chassis (4 nodes per chassis).&lt;br /&gt;
                                         |  1  |  2  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         |  3  |  4  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         e.g. iqn.1886.de.sternwarte.iscsi:meggie-11-1 for the node in chassis 11, position 1&lt;br /&gt;
    --&amp;gt; Add an Attempt&lt;br /&gt;
      --&amp;gt; MAC-address&lt;br /&gt;
        --&amp;gt; iSCSI Mode: Enabled&lt;br /&gt;
        --&amp;gt; Connection Retry Count: 2&lt;br /&gt;
        --&amp;gt; Connection Establishing Timeout: 5000&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
        --&amp;gt; Get target info via DHCP: [x]&lt;br /&gt;
        --&amp;gt; Authentication Type: None&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
REBOOT and press F6 to enter the boot menu:&lt;br /&gt;
&lt;br /&gt;
Go to the IP-based boot option (should be the last of the three). &lt;br /&gt;
Select partition:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Ubuntu 20.4 puppet iSCSI autopartition --DO NOT USE IF NO IDEA--&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Be careful: other versions, such as blank, also exist.&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3830</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3830"/>
		<updated>2025-08-12T12:18:12Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* UEFI settings */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE that are used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know why grub even works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub somehow also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into a firmware table, the iSCSI Boot Firmware Table (iBFT). Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted; therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI hook needs to run after the network is available, which means it needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
As noted above, the script must be executable. Afterwards the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue the boot process as normal.&lt;br /&gt;
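&lt;br /&gt;
The installation steps can be sketched as a small helper. This is an illustration under assumptions, not our actual procedure: the function name &amp;lt;code&amp;gt;install_hook&amp;lt;/code&amp;gt; and the file name &amp;lt;code&amp;gt;./iscsi&amp;lt;/code&amp;gt; are hypothetical, and the destination directory is a parameter so the helper can be tried outside of &amp;lt;code&amp;gt;/etc&amp;lt;/code&amp;gt;.&lt;br /&gt;

```shell
#!/bin/sh
# Sketch: install a boot script into an initramfs-tools scripts directory.
# Usage: install_hook SCRIPT [DEST_DIR]
install_hook() {
    src="$1"
    dest_dir="${2:-/etc/initramfs-tools/scripts/local-top}"
    if [ ! -f "$src" ]; then
        echo "hook script not found: $src"
        return 1
    fi
    # install copies the file and sets the executable bit in one step
    install -m 0755 "$src" "$dest_dir/$(basename "$src")"
}

# On a real node (as root), the hypothetical file ./iscsi would be installed with:
#   install_hook ./iscsi
#   update-initramfs -u
```

Copying and setting the mode in one step avoids the case where the script is in place but not executable.&lt;br /&gt;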
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because Ubuntu by default doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We could get around this by simply switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since everything runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder iscsi/ in the home directory of the ubuntuadmin user contains the scripts for managing the targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
Once created, the images shouldn't be touched unless nodes fail and we want to delete images that are no longer needed.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
&lt;br /&gt;
Enter the BIOS menu by pressing F2.&lt;br /&gt;
&lt;br /&gt;
In the settings, first RESET everything to default.&lt;br /&gt;
Then make the following changes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Main&lt;br /&gt;
  --&amp;gt; Quiet Boot: Disabled&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; Processor Configuration&lt;br /&gt;
    --&amp;gt; Intel Hyper Threading: Disabled&lt;br /&gt;
  --&amp;gt; Power &amp;amp; Performance&lt;br /&gt;
    --&amp;gt; CPU Power &amp;amp; Performance: Performance&lt;br /&gt;
    --&amp;gt; CPU HWPM State Control&lt;br /&gt;
      --&amp;gt; Enable CPU HWPM: HWPM Native Mode&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; NIC Configuration&lt;br /&gt;
      --&amp;gt; NIC Port 2: Disabled&lt;br /&gt;
Server Management&lt;br /&gt;
  --&amp;gt; Clear System Event Log: Clear it here&lt;br /&gt;
Advanced Boot Options&lt;br /&gt;
  --&amp;gt; System Boot Timeout: 10&lt;br /&gt;
  --&amp;gt; Boot Mode: UEFI&lt;br /&gt;
  --&amp;gt; Boot Option Retry: Enabled&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
REBOOT the system (e.g. with ALT + CMD + DEL).&lt;br /&gt;
Then enter the BIOS settings again and change:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; UEFI Network Stack&lt;br /&gt;
      --&amp;gt; IPv6 PXE Support: Disabled&lt;br /&gt;
    --&amp;gt; UEFI Option ROM Control&lt;br /&gt;
      --&amp;gt; IPv4 Network Configuration&lt;br /&gt;
        --&amp;gt; Configured: [x]&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
    --&amp;gt; iSCSI Configuration&lt;br /&gt;
      --&amp;gt; iSCSI Initiator Name: Format is: iqn.1886.de.sternwarte.iscsi:meggie-x-y &lt;br /&gt;
                                where x: rack number (01 is the top rack, 20 the lowest); &lt;br /&gt;
                                      y: position of the node within the rack (4 nodes per rack):&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         |  1  |  2  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         |  3  |  4  |&lt;br /&gt;
                                         |-----|-----|&lt;br /&gt;
                                         e.g. iqn.1886.de.sternwarte.iscsi:meggie-11-1 for the node in rack 11, position 1&lt;br /&gt;
    --&amp;gt; Add an Attempt&lt;br /&gt;
      --&amp;gt; MAC-address&lt;br /&gt;
        --&amp;gt; iSCSI Mode: Enabled&lt;br /&gt;
        --&amp;gt; Connection Retry Count: 2&lt;br /&gt;
        --&amp;gt; Connection Establishing Timeout: 5000&lt;br /&gt;
        --&amp;gt; Enable DHCP: [x]&lt;br /&gt;
        --&amp;gt; Get target info via DHCP: [x]&lt;br /&gt;
        --&amp;gt; Authentication Type: None&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
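&lt;br /&gt;
Since the initiator name has to be typed in by hand for every node, it can help to generate the expected names beforehand. A minimal sketch, assuming the naming scheme described in the settings above; the function name &amp;lt;code&amp;gt;meggie_iqn&amp;lt;/code&amp;gt; is made up for this example.&lt;br /&gt;

```shell
#!/bin/sh
# Sketch: build the initiator name for a node from rack number and position.
# Usage: meggie_iqn RACK POSITION   (RACK 1..20, POSITION 1..4)
meggie_iqn() {
    rack="${1#0}"   # drop one leading zero so that 08 is not read as octal
    pos="$2"
    case "$rack" in
        [1-9]|1[0-9]|20) ;;
        *) echo "rack must be between 1 and 20"; return 1 ;;
    esac
    case "$pos" in
        [1-4]) ;;
        *) echo "position must be between 1 and 4"; return 1 ;;
    esac
    # The rack number is zero-padded to two digits, the position is not.
    printf 'iqn.1886.de.sternwarte.iscsi:meggie-%02d-%d\n' "$rack" "$pos"
}
```

For example, &amp;lt;code&amp;gt;meggie_iqn 11 1&amp;lt;/code&amp;gt; prints &amp;lt;code&amp;gt;iqn.1886.de.sternwarte.iscsi:meggie-11-1&amp;lt;/code&amp;gt;.&lt;br /&gt;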
&lt;br /&gt;
&lt;br /&gt;
REBOOT and press F6 to enter the boot menu:&lt;br /&gt;
&lt;br /&gt;
Go to the IP-based boot option (it should be the last of the three).&lt;br /&gt;
Select the partition:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Ubuntu 20.4 puppet iSCSI autopartition --DO NOT USE IF NO IDEA--&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Be careful, because other versions such as blank also exist.&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3829</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3829"/>
		<updated>2025-08-12T12:05:23Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* UEFI settings */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
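&lt;br /&gt;
The zvol naming scheme from the list above (&amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt;) can be enumerated with a short loop. This is only a sketch of the naming and does not create anything; the function name &amp;lt;code&amp;gt;zvol_names&amp;lt;/code&amp;gt; is hypothetical, and the defaults of 20 chassis with 4 nodes each are taken from the list item.&lt;br /&gt;

```shell
#!/bin/sh
# Sketch: list the zvol names zvol-MM-m for all chassis and nodes.
# Usage: zvol_names [CHASSIS_COUNT] [NODES_PER_CHASSIS]
zvol_names() {
    chassis_count="${1:-20}"
    nodes="${2:-4}"
    chassis=1
    while [ "$chassis" -le "$chassis_count" ]; do
        node=1
        while [ "$node" -le "$nodes" ]; do
            # MM is zero-padded to two digits, m is a single digit
            printf 'zvol-%02d-%d\n' "$chassis" "$node"
            node=$((node + 1))
        done
        chassis=$((chassis + 1))
    done
}
```

With the defaults this yields 80 names, from &amp;lt;code&amp;gt;zvol-01-1&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;zvol-20-4&amp;lt;/code&amp;gt;.&lt;br /&gt;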
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
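&lt;br /&gt;
For reference, RFC 4173 defines the iSCSI root-path string as &amp;lt;code&amp;gt;iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME&amp;lt;/code&amp;gt;. The sketch below only composes such a string; the function name &amp;lt;code&amp;gt;iscsi_root_path&amp;lt;/code&amp;gt; and the address and target name in the usage note are hypothetical and do not reflect our actual DHCP configuration.&lt;br /&gt;

```shell
#!/bin/sh
# Sketch: compose an iSCSI root-path string in the RFC 4173 layout
#   iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME
# Usage: iscsi_root_path SERVER TARGETNAME [PORT] [LUN] [PROTOCOL]
iscsi_root_path() {
    server="$1"
    target="$2"
    port="${3:-3260}"     # default iSCSI port
    lun="${4:-0}"
    protocol="${5:-6}"    # 6 = TCP
    if [ -z "$server" ] || [ -z "$target" ]; then
        echo "usage: iscsi_root_path SERVER TARGETNAME [PORT] [LUN] [PROTOCOL]"
        return 1
    fi
    printf 'iscsi:%s:%s:%s:%s:%s\n' "$server" "$protocol" "$port" "$lun" "$target"
}
```

Calling &amp;lt;code&amp;gt;iscsi_root_path 192.0.2.10 iqn.2001-04.com.example:storage.disk1&amp;lt;/code&amp;gt; prints &amp;lt;code&amp;gt;iscsi:192.0.2.10:6:3260:0:iqn.2001-04.com.example:storage.disk1&amp;lt;/code&amp;gt;.&lt;br /&gt;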
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub itself also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into the iSCSI Boot Firmware Table (iBFT), an ACPI table that stays available to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI setup has to run after the network is available, which means the hook needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
As noted above, the script must be executable. Afterwards the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue the boot process as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because Ubuntu by default doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We could get around this by simply switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since everything runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder iscsi/ in the home directory of the ubuntuadmin user contains the scripts for managing the targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
Once created, the images shouldn't be touched unless nodes fail and we want to delete images that are no longer needed.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
&lt;br /&gt;
Enter the BIOS menu by pressing F2.&lt;br /&gt;
&lt;br /&gt;
In the settings, first RESET everything to default.&lt;br /&gt;
Then make the following changes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Main&lt;br /&gt;
  --&amp;gt; Quiet Boot: Disabled&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; Processor Configuration&lt;br /&gt;
    --&amp;gt; Intel Hyper Threading: Disabled&lt;br /&gt;
  --&amp;gt; Power &amp;amp; Performance&lt;br /&gt;
    --&amp;gt; CPU Power &amp;amp; Performance: Performance&lt;br /&gt;
    --&amp;gt; CPU HWPM State Control&lt;br /&gt;
      --&amp;gt; Enable CPU HWPM: HWPM Native Mode&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; NIC Configuration&lt;br /&gt;
      --&amp;gt; NIC Port 2: Disabled&lt;br /&gt;
Server Management&lt;br /&gt;
  --&amp;gt; Clear System Event Log: Clear it here&lt;br /&gt;
Advanced Boot Options&lt;br /&gt;
  --&amp;gt; System Boot Timeout: 10&lt;br /&gt;
  --&amp;gt; Boot Mode: UEFI&lt;br /&gt;
  --&amp;gt; Boot Option Retry: Enabled&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
REBOOT the system (e.g. with ALT + CMD + DEL).&lt;br /&gt;
Then enter the BIOS settings again and change:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
Advanced&lt;br /&gt;
  --&amp;gt; PCI Configuration&lt;br /&gt;
    --&amp;gt; UEFI Network Stack&lt;br /&gt;
      --&amp;gt; IPv6 PXE Support: Disabled&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3828</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3828"/>
		<updated>2025-08-12T11:56:04Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* iSCSI targets */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub itself also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into the iSCSI Boot Firmware Table (iBFT), an ACPI table that stays available to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI setup has to run after the network is available, which means the hook needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
As noted above, the script must be executable. Afterwards the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue the boot process as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because Ubuntu by default doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We could get around this by simply switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since everything runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder iscsi/ in the home directory of the ubuntuadmin user contains the scripts for managing the targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
Once created, the images shouldn't be touched unless nodes fail and we want to delete images that are no longer needed.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
* ...&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3827</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3827"/>
		<updated>2025-08-12T11:54:42Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* iSCSI targets */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Ca. 250W per node, 1kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It's installed on the EFI partition on the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find grub and run it, and grub then sees the partitions of the installed OS. Either the UEFI somehow emulates a block device for grub such that it can see these partitions and load the kernel and initrd, or grub itself also does iSCSI by using the iBFT. Either way, it loads the kernel and initrd, which is the most important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it puts this information into the iSCSI Boot Firmware Table (iBFT), an ACPI table that stays available to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, therefore modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this can be done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The iSCSI setup has to run after the network is available, which means the hook needs to be put in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. The content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Once the script is in place and executable, the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue booting as normal.&lt;br /&gt;
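The installation step can be sketched as a small shell function. This is only an illustration: the installed file name &amp;lt;code&amp;gt;iscsi&amp;lt;/code&amp;gt;, the source file name and the optional root prefix are assumptions, not part of the existing setup.&lt;br /&gt;

```shell
# Sketch: copy a hook script into the initramfs-tools local-top directory.
# "iscsi" as the installed file name and the optional ROOT prefix are
# assumptions for illustration; ROOT lets the function be tried in a scratch
# directory instead of the live system.
install_iscsi_hook() {
    src=$1
    root=${2:-}
    dest="$root/etc/initramfs-tools/scripts/local-top"
    mkdir -p "$dest" || return 1
    cp "$src" "$dest/iscsi" || return 1
    chmod 0755 "$dest/iscsi"
}

# On a node, as root, this would be followed by regenerating the ramdisk:
#   install_iscsi_hook ./iscsi-hook
#   update-initramfs -u
```

Passing a scratch directory as second argument allows checking the resulting path and permissions without touching the running system.&lt;br /&gt;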
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because by default Ubuntu doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the whole setup runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; in the home directory of the ubuntuadmin user contains the scripts for the iSCSI targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
* ...&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3826</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3826"/>
		<updated>2025-08-12T11:53:57Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* iSCSI targets */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
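The zvol naming scheme from the list above (chassis 01 to 20, node 1 to 4) can be sketched as a small shell function. The function name is hypothetical, and it only prints the names, it does not create anything.&lt;br /&gt;

```shell
# Sketch: print the 80 zvol names zvol-MM-m described in the list above.
# zvol_names is a hypothetical helper; it only prints names.
zvol_names() {
    chassis=1
    while [ "$chassis" -le 20 ]; do
        node=1
        while [ "$node" -le 4 ]; do
            # Two-digit chassis number, one-digit node number.
            printf 'zvol-%02d-%d\n' "$chassis" "$node"
            node=$((node + 1))
        done
        chassis=$((chassis + 1))
    done
}
```

The output could then be fed line by line into whatever command creates the volumes.&lt;br /&gt;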
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Approx. 250 W per node, 1 kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
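For illustration, an iSCSI root-path value in the form described in RFC 4173 is iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME, where empty protocol, port and LUN fields mean the defaults (TCP, 3260, 0). The sketch below assembles such a string. The helper name, the server address and the IQN are placeholders, and whether our UEFI expects exactly this form is an assumption.&lt;br /&gt;

```shell
# Hypothetical helper: assemble an iSCSI root-path string in the RFC 4173 form
#   iscsi:SERVER:PROTOCOL:PORT:LUN:TARGETNAME
# The protocol field is always left empty (default TCP); LUN and port are
# optional and left empty when not given (defaults 0 and 3260).
iscsi_root_path() {
    server=$1
    target=$2
    lun=${3:-}
    port=${4:-}
    printf 'iscsi:%s::%s:%s:%s\n' "$server" "$port" "$lun" "$target"
}

# Placeholder address and IQN, not the real values of our setup.
iscsi_root_path 192.0.2.10 iqn.2025-08.example:zvol-01-1
```

The resulting string is what would be handed out as the value of DHCP option 17 for the respective node.&lt;br /&gt;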
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It is installed on the EFI partition of the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find and run grub, and grub then sees the partitions of the installed OS. Either the UEFI exposes the LUN to grub as a block device, so that grub can read these partitions and load the kernel and initrd, or grub itself speaks iSCSI using the iBFT. Either way, it loads the kernel and initrd, which is the important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it stores this information in the iSCSI Boot Firmware Table (iBFT), which the firmware passes on to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, so modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this is done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The hook for the iSCSI setup needs to run after the network is available, which means it has to be placed in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. Its content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Once the script is in place and executable, the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue booting as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because by default Ubuntu doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the whole setup runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; in the home directory of the ubuntuadmin user contains the scripts for the iSCSI targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
* ...&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3825</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3825"/>
		<updated>2025-08-12T11:53:27Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* iSCSI targets */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Approx. 250 W per node, 1 kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It is installed on the EFI partition of the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find and run grub, and grub then sees the partitions of the installed OS. Either the UEFI exposes the LUN to grub as a block device, so that grub can read these partitions and load the kernel and initrd, or grub itself speaks iSCSI using the iBFT. Either way, it loads the kernel and initrd, which is the important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it stores this information in the iSCSI Boot Firmware Table (iBFT), which the firmware passes on to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, so modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this is done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The hook for the iSCSI setup needs to run after the network is available, which means it has to be placed in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. Its content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Once the script is in place and executable, the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue booting as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because by default Ubuntu doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the whole setup runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
The folder &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; in the home directory of the ubuntuadmin user contains the scripts for the iSCSI targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
* ...&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
	<entry>
		<id>https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3824</id>
		<title>Meggie Cluster</title>
		<link rel="alternate" type="text/html" href="https://www.sternwarte.uni-erlangen.de/wiki/index.php?title=Meggie_Cluster&amp;diff=3824"/>
		<updated>2025-08-12T11:52:38Z</updated>

		<summary type="html">&lt;p&gt;Reinmann: /* iSCSI targets */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Admin]]&lt;br /&gt;
&lt;br /&gt;
The meggies are diskless nodes inherited from the RRZE and used for computing only. We would like to boot them from the network. NFS and iSCSI are two options to accomplish this. Since iSCSI is directly supported by the UEFI of the nodes, it is by far the easiest option to implement.&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
The main boards are S2600KPR from Intel. Four of those are in a 2U chassis, called H2000G. The rails for the installation are called AXXELVRAIL.&lt;br /&gt;
&lt;br /&gt;
== To Do List ==&lt;br /&gt;
&lt;br /&gt;
=== OS setup ===&lt;br /&gt;
* &amp;lt;s&amp;gt;Make the installer recognize the iSCSI LUN as a block device before searching for those&amp;lt;/s&amp;gt;&lt;br /&gt;
* &amp;lt;s&amp;gt;Supply LUN information directly to the UEFI via DHCP&amp;lt;/s&amp;gt;&lt;br /&gt;
* Test what happens on iSCSI connection problems&lt;br /&gt;
** Switch malfunction&lt;br /&gt;
** iSCSI server reboot&lt;br /&gt;
** It doesn't take long until the kernel gives up on the iSCSI target. We need to increase this timeout somehow&lt;br /&gt;
* Make a list of UEFI settings&lt;br /&gt;
* Prepare LUNs&lt;br /&gt;
** Use ZFS zvols as storage backend&lt;br /&gt;
** Find out which backend type is the best (file, block, SCSI passthrough)&lt;br /&gt;
** Create a separate dataset for easier zfs setting propagation&lt;br /&gt;
** Leave a README file in the dataset directory for other people to know that it shouldn't be deleted&lt;br /&gt;
** Optimize dataset settings for iSCSI targets&lt;br /&gt;
*** Compression lz4&lt;br /&gt;
*** Larger &amp;lt;code&amp;gt;recordsize&amp;lt;/code&amp;gt; (1M or something)&lt;br /&gt;
** Name the zvols &amp;lt;code&amp;gt;zvol-MM-m&amp;lt;/code&amp;gt; where MM is the chassis number from 01 to 20 and m the node number from 1 to 4&lt;br /&gt;
* Remove the stupid /swap.img file. This is a general point affecting all computers&lt;br /&gt;
* Adjust puppet&lt;br /&gt;
** &amp;lt;s&amp;gt;Make no scratch available&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Make /tmp and friends a tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** &amp;lt;s&amp;gt;Restrict sizes of all tmpfs&amp;lt;/s&amp;gt;&lt;br /&gt;
** Remove unnecessary user stuff? (e.g. x2go, firefox, etc.)&lt;br /&gt;
&lt;br /&gt;
=== Hardware setup ===&lt;br /&gt;
* Install nodes in a rack&lt;br /&gt;
* Buy and set up switches&lt;br /&gt;
* Lots of cabling&lt;br /&gt;
* Power requirements?&lt;br /&gt;
** Approx. 250 W per node, 1 kW per chassis&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== The Boot Process ==&lt;br /&gt;
&lt;br /&gt;
=== General layout ===&lt;br /&gt;
The boot process works like this:&lt;br /&gt;
# UEFI stage&lt;br /&gt;
## The UEFI starts up&lt;br /&gt;
## Establishes a connection to the network&lt;br /&gt;
## Runs a DHCP client&lt;br /&gt;
## Gets an IP as well as the iSCSI LUN information directly from our DHCP server&lt;br /&gt;
## Accesses the GPT on the LUN and loads grub&lt;br /&gt;
# grub stage&lt;br /&gt;
## Grub starts&lt;br /&gt;
## Loads the kernel and initrd from the LUN&lt;br /&gt;
## Jumps into the kernel&lt;br /&gt;
# ramdisk stage&lt;br /&gt;
## The kernel runs&lt;br /&gt;
## Mounts the initrd&lt;br /&gt;
## Inside the initrd our custom iSCSI hook is called&lt;br /&gt;
### Loads the iSCSI kernel module&lt;br /&gt;
### Loads the iSCSI iBFT kernel module&lt;br /&gt;
### Runs &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; which makes the iSCSI LUN available as a block device&lt;br /&gt;
# Normal boot continues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Further explanations ===&lt;br /&gt;
==== UEFI DHCP ====&lt;br /&gt;
The UEFI can talk to the DHCP server and get an IP. We use DHCP option 17 (root-path) to supply the iSCSI information to the nodes.&lt;br /&gt;
&lt;br /&gt;
==== grub ====&lt;br /&gt;
As of now we don't know exactly why grub works. It is installed on the EFI partition of the LUN, just like on a normal computer. The UEFI uses the information on this EFI partition to find and run grub, and grub then sees the partitions of the installed OS. Either the UEFI exposes the LUN to grub as a block device, so that grub can read these partitions and load the kernel and initrd, or grub itself speaks iSCSI using the iBFT. Either way, it loads the kernel and initrd, which is the important part.&lt;br /&gt;
&lt;br /&gt;
==== iBFT ====&lt;br /&gt;
This is really cool. When the UEFI launches and gets iSCSI information from the DHCP server, it stores this information in the iSCSI Boot Firmware Table (iBFT), which the firmware passes on to the operating system. Linux can use this information to connect to the same LUN used for the boot process and set up the block device before trying to mount the root partition. The necessary kernel module is called &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt;. After the module is loaded, it's only a matter of calling the &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; program to set up the block device. All this needs to happen before root is mounted, so modifications to the initrd are necessary.&lt;br /&gt;
&lt;br /&gt;
==== initrd modifications ====&lt;br /&gt;
During boot, once the kernel is running and has mounted the initrd, the &amp;lt;code&amp;gt;iscsi_ibft&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;iscsi_tcp&amp;lt;/code&amp;gt; modules need to be loaded, after which the block device can be created. To do this, a hook needs to be installed in the initramfs. On Ubuntu this is done by putting files into the subdirectories of &amp;lt;code&amp;gt;/etc/initramfs-tools/scripts&amp;lt;/code&amp;gt;. The hook for the iSCSI setup needs to run after the network is available, which means it has to be placed in the &amp;lt;code&amp;gt;local-top&amp;lt;/code&amp;gt; directory and made executable. Its content is:&lt;br /&gt;
&amp;lt;source&amp;gt;&lt;br /&gt;
#!/bin/sh&lt;br /&gt;
# iSCSI init script&lt;br /&gt;
PREREQ=&amp;quot;&amp;quot;&lt;br /&gt;
prereqs()&lt;br /&gt;
{&lt;br /&gt;
     echo &amp;quot;$PREREQ&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
case $1 in&lt;br /&gt;
prereqs)&lt;br /&gt;
     prereqs&lt;br /&gt;
     exit 0&lt;br /&gt;
     ;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
. /scripts/functions&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Begin iSCSI init&amp;quot;&lt;br /&gt;
&lt;br /&gt;
modprobe iscsi_tcp&lt;br /&gt;
modprobe iscsi_ibft&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Network configuration based on iBFT&amp;quot;&lt;br /&gt;
&lt;br /&gt;
iscsistart -N || panic &amp;quot;Could not initialize iSCSI&amp;quot;&lt;br /&gt;
&lt;br /&gt;
log_begin_msg &amp;quot;Waiting to finish iscsistart&amp;quot;&lt;br /&gt;
until iscsistart -b ; do&lt;br /&gt;
    sleep 1&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
Once the script is in place and executable, the initial ramdisk(s) need to be regenerated by calling &amp;lt;code&amp;gt;update-initramfs -u&amp;lt;/code&amp;gt;. When the script runs during the boot process, it creates the block device, so the kernel can mount the root filesystem and continue booting as normal.&lt;br /&gt;
&lt;br /&gt;
=== Installing ===&lt;br /&gt;
Unfortunately we can't completely rely on the normal installation process, because by default Ubuntu doesn't use the iBFT to set up an iSCSI LUN before the installer searches for block devices to install on. We were able to get around this by switching to a TTY, running &amp;lt;code&amp;gt;iscsistart&amp;lt;/code&amp;gt; and relaunching the installer, which then identified the block device and proceeded as normal. Since the whole setup runs off LUNs on the network, we can potentially skip the installation entirely by preparing enough LUNs with the appropriate data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== iSCSI targets ===&lt;br /&gt;
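The folder &amp;lt;code&amp;gt;iscsi/&amp;lt;/code&amp;gt; in the home directory of the ubuntuadmin user contains the scripts for the iSCSI targets. &amp;lt;code&amp;gt;generate_zfs_images.sh&amp;lt;/code&amp;gt; generates the images in the ZFS dataset &amp;lt;code&amp;gt;pool/iscsi&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;create_iscsi_targets.sh&amp;lt;/code&amp;gt; then creates the iSCSI targets.&lt;br /&gt;
&amp;lt;code&amp;gt;destroy_all_zfs_images.sh&amp;lt;/code&amp;gt; reverts everything.&lt;br /&gt;
The root filesystem size for each node is 80.8 GB.&lt;br /&gt;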
&lt;br /&gt;
== UEFI settings ==&lt;br /&gt;
Settings applied after resetting the UEFI to default settings:&lt;br /&gt;
* ...&lt;/div&gt;</summary>
		<author><name>Reinmann</name></author>
	</entry>
</feed>