<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://tech.uvoo.io/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Busk</id>
	<title>UVOO Tech Wiki - User contributions
 [en]</title>
	<link rel="self" type="application/atom+xml" href="https://tech.uvoo.io/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Busk"/>
	<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php/Special:Contributions/Busk"/>
	<updated>2026-06-19T22:09:54Z</updated>
	<subtitle>User contributions
</subtitle>
	<generator>MediaWiki 1.35.2</generator>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5709</id>
		<title>Zfs pool</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5709"/>
		<updated>2026-06-15T21:46:57Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
```&lt;br /&gt;
sudo zpool create -o ashift=12 tank-nvme mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204029J&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204035E&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
&lt;br /&gt;
sudo zpool create -o ashift=12 tank-hdd mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
sudo zfs create -o atime=off -o compression=lz4 -o xattr=sa -o acltype=posixacl tank-nvme/lxd&lt;br /&gt;
&lt;br /&gt;
## ashift&lt;br /&gt;
&lt;br /&gt;
`ashift` stands for **alignment shift**. It dictates the minimum block size ZFS will use when formatting and writing data to the physical storage devices in your pool.&lt;br /&gt;
&lt;br /&gt;
The number you provide is an exponent of 2:&lt;br /&gt;
&lt;br /&gt;
* `ashift=9` means $2^9=512$ bytes.&lt;br /&gt;
* `ashift=12` means $2^{12}=4096$ bytes (4K).&lt;br /&gt;
* `ashift=13` means $2^{13}=8192$ bytes (8K).&lt;br /&gt;
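&lt;br /&gt;
The mapping in the list above can be checked with a few lines of POSIX shell. This is only an illustrative sketch; `ashift_to_bytes` is a hypothetical helper name, not part of ZFS.&lt;br /&gt;
&lt;br /&gt;
```shell
# ashift_to_bytes N: print 2^N, the sector size in bytes implied by ashift=N.
# Hypothetical helper for illustration; it does not query ZFS in any way.
ashift_to_bytes() {
  size=1
  i=0
  while [ "$i" -lt "$1" ]; do
    size=$(( size * 2 ))
    i=$(( i + 1 ))
  done
  echo "$size"
}

# Print the three values discussed above.
for a in 9 12 13; do
  printf 'ashift=%s means %s bytes\n' "$a" "$(ashift_to_bytes "$a")"
done
```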
&lt;br /&gt;
Here is why forcing `ashift=12` is critical for modern storage.&lt;br /&gt;
&lt;br /&gt;
### **The 512-byte Lie (Emulation)**&lt;br /&gt;
&lt;br /&gt;
Historically, hard drives used physical sectors that were exactly 512 bytes in size. However, almost all modern hard drives and SSDs use **Advanced Format**, meaning their physical layout is built on 4096-byte (4K) sectors. This larger size allows for higher storage density and better error correction.&lt;br /&gt;
&lt;br /&gt;
To avoid breaking older operating systems and legacy hardware controllers, many modern 4K drives &amp;quot;lie&amp;quot; to the host system. They use a firmware feature called **512e (512-byte emulation)** to report themselves as having old-school 512-byte sectors, even though their physical architecture is 4K.&lt;br /&gt;
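&lt;br /&gt;
On Linux, the sizes a drive reports can be read from sysfs. The sketch below assumes the usual `logical_block_size` and `physical_block_size` files under a block device's `queue` directory; `report_sector_sizes` is a hypothetical helper name. Note that some drives report 512 for both values, so matching sizes do not prove the media is natively 512-byte.&lt;br /&gt;
&lt;br /&gt;
```shell
# report_sector_sizes QUEUE_DIR: print the logical and physical sector sizes
# found in a block device's sysfs queue directory (hypothetical helper).
report_sector_sizes() {
  queue_dir=$1
  logical=$(cat "$queue_dir/logical_block_size")
  physical=$(cat "$queue_dir/physical_block_size")
  if [ "$logical" -ne "$physical" ]; then
    echo "logical=$logical physical=$physical (emulated, e.g. 512e)"
  else
    echo "logical=$logical physical=$physical (reported sizes match)"
  fi
}

# Example call (the device name is an assumption, adjust to your system):
# report_sector_sizes /sys/block/sda/queue
```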
&lt;br /&gt;
### **The Read-Modify-Write Penalty**&lt;br /&gt;
&lt;br /&gt;
If you create a ZFS pool without specifying the `ashift` value, ZFS will often interrogate the drive, hear the 512-byte lie, and set `ashift=9`. This creates a severe misalignment between ZFS's logical blocks and the drive's physical sectors.&lt;br /&gt;
&lt;br /&gt;
If ZFS attempts to write a 512-byte block to a physical 4K sector, the storage drive is forced to execute a **Read-Modify-Write** operation:&lt;br /&gt;
&lt;br /&gt;
1. **Read:** The drive reads the entire 4K physical sector into its internal memory.&lt;br /&gt;
2. **Modify:** The drive inserts the 512 bytes ZFS sent into the 4K block.&lt;br /&gt;
3. **Write:** The drive writes the whole 4K sector back to the disk.&lt;br /&gt;
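&lt;br /&gt;
As a rough worked example of the three steps above: for a single isolated 512-byte write, the drive rewrites a full 4096-byte sector, so it writes eight times the bytes ZFS asked for. The sketch assumes the worst case with no caching or write coalescing in the drive; `rmw_amplification` is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```shell
# rmw_amplification PHYSICAL LOGICAL: bytes rewritten by the drive per byte
# requested, for one isolated LOGICAL-sized write into a PHYSICAL-sized sector.
# Worst-case model only (assumes no drive-side caching or coalescing).
rmw_amplification() {
  echo $(( $1 / $2 ))
}

# 4096-byte physical sector, 512-byte write from the host.
rmw_amplification 4096 512
```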
&lt;br /&gt;
This overhead sharply reduces write performance (especially for random I/O) and causes &amp;quot;write amplification,&amp;quot; which wears through the endurance rating of NVMe and SATA SSDs sooner than necessary.&lt;br /&gt;
&lt;br /&gt;
### **The Solution**&lt;br /&gt;
&lt;br /&gt;
By explicitly appending `-o ashift=12` to your `zpool create` command, you force ZFS to align all of its data payloads to exact 4K boundaries. This ensures ZFS writes map perfectly to the physical hardware beneath it, entirely bypassing the emulation penalty and ensuring optimal throughput and drive lifespan.&lt;br /&gt;
&lt;br /&gt;
&amp;gt; **Note:** The `ashift` value is permanently baked into a top-level virtual device (vdev) at the moment of creation. If you create a pool with the wrong `ashift`, it cannot be changed later; you have to destroy the pool, wipe the drives, and start over.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5708</id>
		<title>Zfs pool</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5708"/>
		<updated>2026-06-15T20:56:16Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
```&lt;br /&gt;
sudo zpool create -o ashift=12 tank-nvme mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204029J&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204035E&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
&lt;br /&gt;
sudo zpool create -o ashift=12 tank-hdd mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## ashift&lt;br /&gt;
&lt;br /&gt;
`ashift` stands for **alignment shift**. It dictates the minimum block size ZFS will use when formatting and writing data to the physical storage devices in your pool.&lt;br /&gt;
&lt;br /&gt;
The number you provide is an exponent of 2:&lt;br /&gt;
&lt;br /&gt;
* `ashift=9` means $2^9=512$ bytes.&lt;br /&gt;
* `ashift=12` means $2^{12}=4096$ bytes (4K).&lt;br /&gt;
* `ashift=13` means $2^{13}=8192$ bytes (8K).&lt;br /&gt;
&lt;br /&gt;
Here is why forcing `ashift=12` is critical for modern storage.&lt;br /&gt;
&lt;br /&gt;
### **The 512-byte Lie (Emulation)**&lt;br /&gt;
&lt;br /&gt;
Historically, hard drives used physical sectors that were exactly 512 bytes in size. However, almost all modern hard drives and SSDs use **Advanced Format**, meaning their physical layout is built on 4096-byte (4K) sectors. This larger size allows for higher storage density and better error correction.&lt;br /&gt;
&lt;br /&gt;
To avoid breaking older operating systems and legacy hardware controllers, many modern 4K drives &amp;quot;lie&amp;quot; to the host system. They use a firmware feature called **512e (512-byte emulation)** to report themselves as having old-school 512-byte sectors, even though their physical architecture is 4K.&lt;br /&gt;
&lt;br /&gt;
### **The Read-Modify-Write Penalty**&lt;br /&gt;
&lt;br /&gt;
If you create a ZFS pool without specifying the `ashift` value, ZFS will often interrogate the drive, hear the 512-byte lie, and set `ashift=9`. This creates a severe misalignment between ZFS's logical blocks and the drive's physical sectors.&lt;br /&gt;
&lt;br /&gt;
If ZFS attempts to write a 512-byte block to a physical 4K sector, the storage drive is forced to execute a **Read-Modify-Write** operation:&lt;br /&gt;
&lt;br /&gt;
1. **Read:** The drive reads the entire 4K physical sector into its internal memory.&lt;br /&gt;
2. **Modify:** The drive inserts the 512 bytes ZFS sent into the 4K block.&lt;br /&gt;
3. **Write:** The drive writes the whole 4K sector back to the disk.&lt;br /&gt;
&lt;br /&gt;
This overhead sharply reduces write performance (especially for random I/O) and causes &amp;quot;write amplification,&amp;quot; which wears through the endurance rating of NVMe and SATA SSDs sooner than necessary.&lt;br /&gt;
&lt;br /&gt;
### **The Solution**&lt;br /&gt;
&lt;br /&gt;
By explicitly appending `-o ashift=12` to your `zpool create` command, you force ZFS to align all of its data payloads to exact 4K boundaries. This ensures ZFS writes map perfectly to the physical hardware beneath it, entirely bypassing the emulation penalty and ensuring optimal throughput and drive lifespan.&lt;br /&gt;
&lt;br /&gt;
&amp;gt; **Note:** The `ashift` value is permanently baked into a top-level virtual device (vdev) at the moment of creation. If you create a pool with the wrong `ashift`, it cannot be changed later; you have to destroy the pool, wipe the drives, and start over.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5707</id>
		<title>Zfs pool</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5707"/>
		<updated>2026-06-15T20:52:51Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
```&lt;br /&gt;
sudo zpool create -o ashift=12 tank-nvme mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204029J&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204035E&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
&lt;br /&gt;
sudo zpool create -o ashift=12 tank-hdd mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## ashift&lt;br /&gt;
&lt;br /&gt;
`ashift` stands for **alignment shift**. It dictates the minimum block size ZFS will use when formatting and writing data to the physical storage devices in your pool.&lt;br /&gt;
&lt;br /&gt;
The number you provide is an exponent of 2:&lt;br /&gt;
&lt;br /&gt;
* `ashift=9` means $2^9=512$ bytes.&lt;br /&gt;
* `ashift=12` means $2^{12}=4096$ bytes (4K).&lt;br /&gt;
* `ashift=13` means $2^{13}=8192$ bytes (8K).&lt;br /&gt;
&lt;br /&gt;
Here is why forcing `ashift=12` is critical for modern storage.&lt;br /&gt;
&lt;br /&gt;
### **The 512-byte Lie (Emulation)**&lt;br /&gt;
&lt;br /&gt;
Historically, hard drives used physical sectors that were exactly 512 bytes in size. However, almost all modern hard drives and SSDs use **Advanced Format**, meaning their physical layout is built on 4096-byte (4K) sectors. This larger size allows for higher storage density and better error correction.&lt;br /&gt;
&lt;br /&gt;
To avoid breaking older operating systems and legacy hardware controllers, many modern 4K drives &amp;quot;lie&amp;quot; to the host system. They use a firmware feature called **512e (512-byte emulation)** to report themselves as having old-school 512-byte sectors, even though their physical architecture is 4K.&lt;br /&gt;
&lt;br /&gt;
### **The Read-Modify-Write Penalty**&lt;br /&gt;
&lt;br /&gt;
If you create a ZFS pool without specifying the `ashift` value, ZFS will often interrogate the drive, hear the 512-byte lie, and set `ashift=9`. This creates a severe misalignment between ZFS's logical blocks and the drive's physical sectors.&lt;br /&gt;
&lt;br /&gt;
If ZFS attempts to write a 512-byte block to a physical 4K sector, the storage drive is forced to execute a **Read-Modify-Write** operation:&lt;br /&gt;
&lt;br /&gt;
1. **Read:** The drive reads the entire 4K physical sector into its internal memory.&lt;br /&gt;
2. **Modify:** The drive inserts the 512 bytes ZFS sent into the 4K block.&lt;br /&gt;
3. **Write:** The drive writes the whole 4K sector back to the disk.&lt;br /&gt;
&lt;br /&gt;
This massive overhead completely destroys write performance (especially random I/O) and causes severe &amp;quot;write amplification,&amp;quot; which prematurely burns through the endurance limits of NVMe and SSD drives.&lt;br /&gt;
&lt;br /&gt;
### **The Solution**&lt;br /&gt;
&lt;br /&gt;
By explicitly appending `-o ashift=12` to your `zpool create` command, you force ZFS to align all of its data payloads to exact 4K boundaries. This ensures ZFS writes map perfectly to the physical hardware beneath it, entirely bypassing the emulation penalty and ensuring optimal throughput and drive lifespan.&lt;br /&gt;
&lt;br /&gt;
&amp;gt; **Note:** The `ashift` value is permanently baked into a top-level virtual device (vdev) at the moment of creation. If you create a pool with the wrong `ashift`, it cannot be changed later; you have to destroy the pool, wipe the drives, and start over.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5706</id>
		<title>Zfs pool</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5706"/>
		<updated>2026-06-15T20:47:33Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
```&lt;br /&gt;
sudo zpool create -o ashift=12 tank-nvme mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204029J&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204035E&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
&lt;br /&gt;
sudo zpool create -o ashift=12 tank-hdd mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5705</id>
		<title>Zfs pool</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_pool&amp;diff=5705"/>
		<updated>2026-06-15T20:47:12Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot; ``` sudo zpool create -o ashift=12 tank1 mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878  sudo wipefs -a /dev/disk/by-id/nvme-Samsu...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
```&lt;br /&gt;
sudo zpool create -o ashift=12 tank1 mirror /dev/disk/by-id/nvme-eui.0025385281b1b872 /dev/disk/by-id/nvme-eui.0025385281b1b878&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204029J&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/nvme-Samsung_SSD_960_EVO_1TB_S3X3NF0K204035E&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T&lt;br /&gt;
sudo wipefs -a /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
&lt;br /&gt;
sudo zpool create -o ashift=12 mypool mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RMS /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RPX mirror /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0339T /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN07RFB&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Sandbox3&amp;diff=5704</id>
		<title>Sandbox3</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Sandbox3&amp;diff=5704"/>
		<updated>2026-06-07T00:35:27Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;``` #!/bin/bash export REPO=&amp;quot;your-owner/your-repo&amp;quot; export BRANCH=&amp;quot;main&amp;quot; export RULESET_NAME=&amp;quot;Migrated Protection - $BRANCH&amp;quot;  # 1. Fetch Classic Branch Protection Rules echo &amp;quot;F...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
export REPO=&amp;quot;your-owner/your-repo&amp;quot;&lt;br /&gt;
export BRANCH=&amp;quot;main&amp;quot;&lt;br /&gt;
export RULESET_NAME=&amp;quot;Migrated Protection - $BRANCH&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# 1. Fetch Classic Branch Protection Rules&lt;br /&gt;
echo &amp;quot;Fetching classic branch protection for $BRANCH...&amp;quot;&lt;br /&gt;
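# gh api may print an error body on HTTP errors (e.g. 404), so check its exit status as well&lt;br /&gt;
if ! CLASSIC_RULES=$(gh api &amp;quot;repos/$REPO/branches/$BRANCH/protection&amp;quot; -X GET 2&amp;gt;/dev/null) || [ -z &amp;quot;$CLASSIC_RULES&amp;quot; ]; then&lt;br /&gt;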
    echo &amp;quot;No classic rules found or error accessing repo.&amp;quot;&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# 2. Extract and Map Protections&lt;br /&gt;
REQ_PR=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq -e '.required_pull_request_reviews' &amp;gt;/dev/null &amp;amp;&amp;amp; echo &amp;quot;true&amp;quot; || echo &amp;quot;false&amp;quot;)&lt;br /&gt;
REQ_APPROVALS=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_pull_request_reviews.required_approving_review_count // 1')&lt;br /&gt;
DISMISS_STALE=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_pull_request_reviews.dismiss_stale_reviews // false')&lt;br /&gt;
REQ_CODE_OWNERS=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_pull_request_reviews.require_code_owner_reviews // false')&lt;br /&gt;
&lt;br /&gt;
REQ_CHECKS=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq -e '.required_status_checks' &amp;gt;/dev/null &amp;amp;&amp;amp; echo &amp;quot;true&amp;quot; || echo &amp;quot;false&amp;quot;)&lt;br /&gt;
REQ_STRICT=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_status_checks.strict // false')&lt;br /&gt;
CONTEXTS=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_status_checks.contexts // []')&lt;br /&gt;
&lt;br /&gt;
ENFORCE_ADMINS=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.enforce_admins.enabled // false')&lt;br /&gt;
REQ_LINEAR=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_linear_history.enabled // false')&lt;br /&gt;
REQ_SIGNATURES=$(echo &amp;quot;$CLASSIC_RULES&amp;quot; | jq '.required_signatures.enabled // false')&lt;br /&gt;
&lt;br /&gt;
# 3. Construct Ruleset Payload&lt;br /&gt;
PAYLOAD=$(jq -n \&lt;br /&gt;
  --arg name &amp;quot;$RULESET_NAME&amp;quot; \&lt;br /&gt;
  --arg target &amp;quot;$BRANCH&amp;quot; \&lt;br /&gt;
  --argjson req_pr &amp;quot;$REQ_PR&amp;quot; \&lt;br /&gt;
  --argjson approvals &amp;quot;$REQ_APPROVALS&amp;quot; \&lt;br /&gt;
  --argjson dismiss_stale &amp;quot;$DISMISS_STALE&amp;quot; \&lt;br /&gt;
  --argjson code_owners &amp;quot;$REQ_CODE_OWNERS&amp;quot; \&lt;br /&gt;
  --argjson req_checks &amp;quot;$REQ_CHECKS&amp;quot; \&lt;br /&gt;
  --argjson strict &amp;quot;$REQ_STRICT&amp;quot; \&lt;br /&gt;
  --argjson contexts &amp;quot;$CONTEXTS&amp;quot; \&lt;br /&gt;
  --argjson linear &amp;quot;$REQ_LINEAR&amp;quot; \&lt;br /&gt;
  --argjson signatures &amp;quot;$REQ_SIGNATURES&amp;quot; \&lt;br /&gt;
  '&lt;br /&gt;
  {&lt;br /&gt;
    &amp;quot;name&amp;quot;: $name,&lt;br /&gt;
    &amp;quot;target&amp;quot;: &amp;quot;branch&amp;quot;,&lt;br /&gt;
    &amp;quot;enforcement&amp;quot;: &amp;quot;active&amp;quot;,&lt;br /&gt;
    &amp;quot;conditions&amp;quot;: {&lt;br /&gt;
      &amp;quot;ref_name&amp;quot;: {&lt;br /&gt;
        &amp;quot;include&amp;quot;: [&amp;quot;refs/heads/&amp;quot; + $target],&lt;br /&gt;
        &amp;quot;exclude&amp;quot;: []&lt;br /&gt;
      }&lt;br /&gt;
    },&lt;br /&gt;
    &amp;quot;rules&amp;quot;: [&lt;br /&gt;
      { &amp;quot;type&amp;quot;: &amp;quot;deletion&amp;quot; },&lt;br /&gt;
      { &amp;quot;type&amp;quot;: &amp;quot;non_fast_forward&amp;quot; }&lt;br /&gt;
    ]&lt;br /&gt;
  }&lt;br /&gt;
  | if $req_pr then .rules += [{&lt;br /&gt;
      &amp;quot;type&amp;quot;: &amp;quot;pull_request&amp;quot;,&lt;br /&gt;
      &amp;quot;parameters&amp;quot;: {&lt;br /&gt;
        &amp;quot;required_approving_review_count&amp;quot;: $approvals,&lt;br /&gt;
        &amp;quot;dismiss_stale_reviews_on_push&amp;quot;: $dismiss_stale,&lt;br /&gt;
        &amp;quot;require_code_owner_review&amp;quot;: $code_owners,&lt;br /&gt;
        &amp;quot;require_last_push_approval&amp;quot;: false&lt;br /&gt;
      }&lt;br /&gt;
    }] else . end&lt;br /&gt;
  | if $req_checks then .rules += [{&lt;br /&gt;
      &amp;quot;type&amp;quot;: &amp;quot;required_status_checks&amp;quot;,&lt;br /&gt;
      &amp;quot;parameters&amp;quot;: {&lt;br /&gt;
        &amp;quot;strict_required_status_checks_policy&amp;quot;: $strict,&lt;br /&gt;
        &amp;quot;required_status_checks&amp;quot;: ($contexts | map({ context: ., integration_id: 0 }))&lt;br /&gt;
      }&lt;br /&gt;
    }] else . end&lt;br /&gt;
  | if $linear then .rules += [{ &amp;quot;type&amp;quot;: &amp;quot;required_linear_history&amp;quot; }] else . end&lt;br /&gt;
  | if $signatures then .rules += [{ &amp;quot;type&amp;quot;: &amp;quot;required_signatures&amp;quot; }] else . end&lt;br /&gt;
')&lt;br /&gt;
&lt;br /&gt;
# 4. Create the New Ruleset&lt;br /&gt;
echo &amp;quot;Creating new Branch Ruleset...&amp;quot;&lt;br /&gt;
CREATE_RES=$(gh api repos/$REPO/rulesets -X POST --input - &amp;lt;&amp;lt;&amp;lt; &amp;quot;$PAYLOAD&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
if [ $? -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;Ruleset successfully created!&amp;quot;&lt;br /&gt;
    &lt;br /&gt;
    # 5. Optional: Delete the classic rule if the ruleset succeeded&lt;br /&gt;
    echo &amp;quot;To complete the migration, manually delete the classic rule or uncomment the line below:&amp;quot;&lt;br /&gt;
    echo &amp;quot;# gh api repos/$REPO/branches/$BRANCH/protection -X DELETE&amp;quot;&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Failed to create ruleset.&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Sed&amp;diff=5703</id>
		<title>Sed</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Sed&amp;diff=5703"/>
		<updated>2026-06-03T20:07:09Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;https://opensource.com/article/20/12/sed&lt;br /&gt;
&lt;br /&gt;
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -exec sed -i 's/old_text/new_text/g' {} +&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -exec sed -i '' 's/old_text/new_text/g' {} +&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -print0 | xargs -0 sed -i 's/old_text/new_text/g'&lt;br /&gt;
&lt;br /&gt;
## Linux&lt;br /&gt;
&lt;br /&gt;
grep -rlZ --exclude-dir='.git' 'oldtext' . | xargs -0 sed -i 's/oldtext/newtext/g'&lt;br /&gt;
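&lt;br /&gt;
The Linux one-liner above can be wrapped in a small function so the directory and both strings become arguments. This is a sketch assuming GNU grep, sed, and xargs; `replace_in_tree` is a hypothetical name, and the search and replacement strings are assumed to be plain words without `/` or regex metacharacters.&lt;br /&gt;
&lt;br /&gt;
```shell
# replace_in_tree DIR OLD NEW: replace OLD with NEW in every file under DIR
# that contains OLD, skipping .git directories (GNU grep/sed/xargs assumed).
# OLD and NEW must not contain "/" or regex metacharacters in this sketch.
replace_in_tree() {
  grep -rlZ --exclude-dir='.git' -- "$2" "$1" | xargs -0 -r sed -i "s/$2/$3/g"
}

# Example call (arguments are placeholders):
# replace_in_tree . oldtext newtext
```
&lt;br /&gt;
The `-r` flag keeps xargs from running sed at all when grep finds no matching files.&lt;br /&gt;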
&lt;br /&gt;
&lt;br /&gt;
## macOS/FreeBSD&lt;br /&gt;
grep -rl --null --exclude-dir='.git' 'oldtext' . | xargs -0 sed -i '' 's/oldtext/newtext/g'&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Sed&amp;diff=5702</id>
		<title>Sed</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Sed&amp;diff=5702"/>
		<updated>2026-06-03T20:05:29Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;https://opensource.com/article/20/12/sed&lt;br /&gt;
&lt;br /&gt;
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -exec sed -i 's/old_text/new_text/g' {} +&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -exec sed -i '' 's/old_text/new_text/g' {} +&lt;br /&gt;
&lt;br /&gt;
find . -type f -name &amp;quot;*.txt&amp;quot; -print0 | xargs -0 sed -i 's/old_text/new_text/g'&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Hugo&amp;diff=5701</id>
		<title>Hugo</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Hugo&amp;diff=5701"/>
		<updated>2026-06-01T15:07:51Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;``` nohup hugo server --source website_hugo --disableFastRender &amp;gt; hugo-server.log 2&amp;gt;&amp;amp;1 &amp;amp; ```&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
nohup hugo server --source website_hugo --disableFastRender &amp;gt; hugo-server.log 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Rsync&amp;diff=5700</id>
		<title>Rsync</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Rsync&amp;diff=5700"/>
		<updated>2026-05-31T22:35:42Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Backup Command&lt;br /&gt;
&lt;br /&gt;
rsync -axSH --exclude='.locks/' --dry-run --progress ~/.cache/huggingface/hub/ /ai/hf_cache/hub/&lt;br /&gt;
&lt;br /&gt;
rate limit 10000 Kbytes per second on zfs&lt;br /&gt;
```&lt;br /&gt;
sudo  zfs create -o mountpoint=/bkp zfspv-pool/bkp&lt;br /&gt;
sudo adduser bkp&lt;br /&gt;
sudo chown bkp:bkp /bkp&lt;br /&gt;
rsync -avz --recursive --bwlimit=10000 --exclude &amp;quot;hourly.&amp;quot; --relative -e &amp;quot;ssh -p 22 -l bkp&amp;quot; --progress /var/lib/influxdb 10.x.x.x:/bkp/myinfluxbkp&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Backup example&lt;br /&gt;
```&lt;br /&gt;
rsync -avzSuc --recursive --relative --delete -e &amp;quot;ssh -p 22 -i /home/user/.ssh/id_ed25519&amp;quot; --progress --files-from=files.dat / user@10.x.x.x:/home/busk/rsync --include-from=includes.dat --exclude-from=excludes.dat&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use LXD containers and lxc snapshots to manage state. The rdiff-backup command-line tool is another option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
sudo rsync --dry-run -rv -e &amp;quot;ssh -p 22 -i /home/myuser/.ssh/id_ed25519&amp;quot; --rsync-path=&amp;quot;sudo rsync&amp;quot; myuser@10.x.x.x:/bkp/foo /docker/&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Rsync&amp;diff=5699</id>
		<title>Rsync</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Rsync&amp;diff=5699"/>
		<updated>2026-05-31T22:33:41Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Backup Command&lt;br /&gt;
&lt;br /&gt;
rsync -axSH --dry-run --progress ~/.cache/huggingface/hub/ /ai/hf_cache/hub/&lt;br /&gt;
&lt;br /&gt;
rate limit 10000 Kbytes per second on zfs&lt;br /&gt;
```&lt;br /&gt;
sudo  zfs create -o mountpoint=/bkp zfspv-pool/bkp&lt;br /&gt;
sudo adduser bkp&lt;br /&gt;
sudo chown bkp:bkp /bkp&lt;br /&gt;
rsync -avz --recursive --bwlimit=10000 --exclude &amp;quot;hourly.&amp;quot; --relative -e &amp;quot;ssh -p 22 -l bkp&amp;quot; --progress /var/lib/influxdb 10.x.x.x:/bkp/myinfluxbkp&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Backup example&lt;br /&gt;
```&lt;br /&gt;
rsync -avzSuc --recursive --relative --delete -e &amp;quot;ssh -p 22 -i /home/user/.ssh/id_ed25519&amp;quot; --progress --files-from=files.dat / user@10.x.x.x:/home/busk/rsync --include-from=includes.dat --exclude-from=excludes.dat&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use LXD containers and lxc snapshots to manage state. The rdiff-backup command-line tool is another option.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
sudo rsync --dry-run -rv -e &amp;quot;ssh -p 22 -i /home/myuser/.ssh/id_ed25519&amp;quot; --rsync-path=&amp;quot;sudo rsync&amp;quot; myuser@10.x.x.x:/bkp/foo /docker/&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Amdgpu_rocm&amp;diff=5698</id>
		<title>Amdgpu rocm</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Amdgpu_rocm&amp;diff=5698"/>
		<updated>2026-05-26T02:27:30Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Make sure keyring dir exists&lt;br /&gt;
sudo mkdir --parents --mode=0755 /etc/apt/keyrings&lt;br /&gt;
&lt;br /&gt;
# Fetch the AMD signing key and store it in the keyring dir&lt;br /&gt;
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | \&lt;br /&gt;
  gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg &amp;gt; /dev/null&lt;br /&gt;
&lt;br /&gt;
# Add the missing AMDGPU kernel-driver repo for Ubuntu 24.04 / noble&lt;br /&gt;
sudo tee /etc/apt/sources.list.d/amdgpu.list &amp;gt; /dev/null &amp;lt;&amp;lt;'EOF'&lt;br /&gt;
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/30.30.3/ubuntu noble main&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
sudo apt update&lt;br /&gt;
&lt;br /&gt;
# Confirm apt now sees it&lt;br /&gt;
apt-cache policy amdgpu-dkms&lt;br /&gt;
&lt;br /&gt;
# Install prerequisites + driver&lt;br /&gt;
sudo apt install &amp;quot;linux-headers-$(uname -r)&amp;quot; &amp;quot;linux-modules-extra-$(uname -r)&amp;quot;&lt;br /&gt;
sudo apt install amdgpu-dkms&lt;br /&gt;
&lt;br /&gt;
# Add yourself for ROCm access, then reboot&lt;br /&gt;
sudo usermod -a -G render,video &amp;quot;$LOGNAME&amp;quot;&lt;br /&gt;
sudo reboot&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Test&lt;br /&gt;
```&lt;br /&gt;
dkms status | grep -i amdgpu&lt;br /&gt;
lsmod | grep amdgpu&lt;br /&gt;
ls /dev/kfd /dev/dri/render*&lt;br /&gt;
rocminfo | head&lt;br /&gt;
amd-smi list&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Amdgpu_rocm&amp;diff=5697</id>
		<title>Amdgpu rocm</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Amdgpu_rocm&amp;diff=5697"/>
		<updated>2026-05-26T02:23:14Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;# Make sure keyring dir exists sudo mkdir --parents --mode=0755 /etc/apt/keyrings  # Refresh AMD key, same as you already did wget https://repo.radeon.com/rocm/rocm.gpg.key -O...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Make sure keyring dir exists&lt;br /&gt;
sudo mkdir --parents --mode=0755 /etc/apt/keyrings&lt;br /&gt;
&lt;br /&gt;
# Fetch the AMD signing key and store it in the keyring dir&lt;br /&gt;
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | \&lt;br /&gt;
  gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg &amp;gt; /dev/null&lt;br /&gt;
&lt;br /&gt;
# Add the missing AMDGPU kernel-driver repo for Ubuntu 24.04 / noble&lt;br /&gt;
sudo tee /etc/apt/sources.list.d/amdgpu.list &amp;gt; /dev/null &amp;lt;&amp;lt;'EOF'&lt;br /&gt;
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/30.30.3/ubuntu noble main&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
sudo apt update&lt;br /&gt;
&lt;br /&gt;
# Confirm apt now sees it&lt;br /&gt;
apt-cache policy amdgpu-dkms&lt;br /&gt;
&lt;br /&gt;
# Install prerequisites + driver&lt;br /&gt;
sudo apt install &amp;quot;linux-headers-$(uname -r)&amp;quot; &amp;quot;linux-modules-extra-$(uname -r)&amp;quot;&lt;br /&gt;
sudo apt install amdgpu-dkms&lt;br /&gt;
&lt;br /&gt;
# Add yourself for ROCm access, then reboot&lt;br /&gt;
sudo usermod -a -G render,video &amp;quot;$LOGNAME&amp;quot;&lt;br /&gt;
sudo reboot&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Rocm&amp;diff=5696</id>
		<title>Rocm</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Rocm&amp;diff=5696"/>
		<updated>2026-05-26T02:04:13Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-methods/package-manager/package-manager-ubuntu.html&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-methods/package-manager/package-manager-ubuntu.html&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_optimization&amp;diff=5695</id>
		<title>Zfs optimization</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_optimization&amp;diff=5695"/>
		<updated>2026-05-25T15:56:52Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;# ZFS Optimization Get &amp;amp; Set  ## Get current settings for pool ``` #!/bin/bash  if [ -z &amp;quot;$1&amp;quot; ]; then     echo &amp;quot;Usage: $0 &amp;lt;pool_name&amp;gt;&amp;quot;     exit 1 fi  POOL=&amp;quot;$1&amp;quot;  if ! zpool list...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# ZFS Optimization Get &amp;amp; Set&lt;br /&gt;
&lt;br /&gt;
## Get current settings for pool&lt;br /&gt;
```&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
if [ -z &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;pool_name&amp;gt;&amp;quot;&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
POOL=&amp;quot;$1&amp;quot;&lt;br /&gt;
&lt;br /&gt;
if ! zpool list &amp;quot;$POOL&amp;quot; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1; then&lt;br /&gt;
    echo &amp;quot;Error: Pool '$POOL' does not exist.&amp;quot;&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;=== ZFS Pool Alignment ===&amp;quot;&lt;br /&gt;
zpool get ashift &amp;quot;$POOL&amp;quot;&lt;br /&gt;
echo &amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;=== Dataset Optimizations ===&amp;quot;&lt;br /&gt;
zfs get recordsize,atime,compression &amp;quot;$POOL&amp;quot;&lt;br /&gt;
echo &amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;=== TRIM Status ===&amp;quot;&lt;br /&gt;
zpool get autotrim &amp;quot;$POOL&amp;quot;&lt;br /&gt;
echo &amp;quot;&amp;quot;&lt;br /&gt;
zpool status -t &amp;quot;$POOL&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Set optimization&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
if [ -z &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;pool_name&amp;gt;&amp;quot;&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
POOL=&amp;quot;$1&amp;quot;&lt;br /&gt;
&lt;br /&gt;
if ! zpool list &amp;quot;$POOL&amp;quot; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1; then&lt;br /&gt;
    echo &amp;quot;Error: Pool '$POOL' does not exist.&amp;quot;&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
zfs set recordsize=1M &amp;quot;$POOL&amp;quot;&lt;br /&gt;
zfs set atime=off &amp;quot;$POOL&amp;quot;&lt;br /&gt;
zpool set autotrim=on &amp;quot;$POOL&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Optimization settings applied to pool '$POOL'.&amp;quot;&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_use_local_openai_api&amp;diff=5694</id>
		<title>Codex use local openai api</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_use_local_openai_api&amp;diff=5694"/>
		<updated>2026-05-22T15:57:28Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;To point the **Codex CLI** to your local `llama-server`, you primarily need to override the environment variables that the tool uses to locate the OpenAI API.  Since the Codex...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To point the **Codex CLI** to your local `llama-server`, you primarily need to override the environment variables that the tool uses to locate the OpenAI API.&lt;br /&gt;
&lt;br /&gt;
Since the Codex CLI (and most OpenAI-compatible tools) follows standard OpenAI SDK conventions, you can redirect its traffic by pointing `OPENAI_BASE_URL` to your local endpoint.&lt;br /&gt;
&lt;br /&gt;
### 1. Set the Environment Variables&lt;br /&gt;
&lt;br /&gt;
You need to set these in your shell configuration (e.g., `~/.zshrc`, `~/.bashrc`) or for the specific session where you run the command.&lt;br /&gt;
&lt;br /&gt;
* **`OPENAI_BASE_URL`**: Set this to your `llama-server` address, ensuring you include the `/v1` path.&lt;br /&gt;
* **`OPENAI_API_KEY`**: Even if your `llama-server` doesn't require authentication, the CLI usually expects this variable to exist. You can set it to any dummy string (e.g., `sk-not-needed`).&lt;br /&gt;
&lt;br /&gt;
**Example for your shell configuration:**&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
export OPENAI_BASE_URL=&amp;quot;http://localhost:8080/v1&amp;quot;&lt;br /&gt;
export OPENAI_API_KEY=&amp;quot;sk-not-needed&amp;quot;&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
*After saving this, run `source ~/.zshrc` (or your relevant shell config file) to apply the changes.*&lt;br /&gt;
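&lt;br /&gt;
Before launching the CLI, it can help to sanity-check the two variables. The following is a minimal sketch and not part of the Codex CLI: the function name `check_codex_env` is hypothetical, and it only tests the two conditions described above (a base URL ending in `/v1` and a non-empty key).&lt;br /&gt;
&lt;br /&gt;
```bash
#!/usr/bin/env bash
# Hypothetical helper: sanity-check the variables described above.
# Prints one line per problem and returns non-zero if any check fails.
check_codex_env() {
  local base_url="${OPENAI_BASE_URL:-}"
  local api_key="${OPENAI_API_KEY:-}"
  local status=0

  # The base URL should end in /v1 (a single trailing slash is tolerated).
  case "${base_url%/}" in
    */v1) ;;
    *) echo "OPENAI_BASE_URL should end with /v1 (got: '${base_url}')"; status=1 ;;
  esac

  # The key may be a dummy value, but it must not be empty.
  if [ -z "$api_key" ]; then
    echo "OPENAI_API_KEY is not set"
    status=1
  fi

  return "$status"
}
```
&lt;br /&gt;
Calling `check_codex_env` in the same shell prints nothing and returns 0 when both variables look right.&lt;br /&gt;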
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 2. Configure via `config.toml` (Optional/Permanent)&lt;br /&gt;
&lt;br /&gt;
If you prefer a more permanent configuration—or if you need to use a specific model name that `llama-server` might not report by default—you can define a provider profile in the Codex CLI configuration file, typically located at `~/.codex/config.toml`.&lt;br /&gt;
&lt;br /&gt;
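Add a custom provider section like this, and select it with the top-level `model_provider` key (otherwise the profile is defined but not used):&lt;br /&gt;
&lt;br /&gt;
```toml&lt;br /&gt;
# In ~/.codex/config.toml&lt;br /&gt;
&lt;br /&gt;
# Select the provider defined below. Top-level keys must come before the table.&lt;br /&gt;
model_provider = &amp;quot;local&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[model_providers.local]&lt;br /&gt;
name = &amp;quot;Llama Server&amp;quot;&lt;br /&gt;
base_url = &amp;quot;http://localhost:8080/v1&amp;quot;&lt;br /&gt;
env_key = &amp;quot;OPENAI_API_KEY&amp;quot;&lt;br /&gt;
```&lt;br /&gt;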
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 3. Verify the Connection&lt;br /&gt;
&lt;br /&gt;
Once the environment variables are set, test the connection by running a simple command:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
codex &amp;quot;What is the capital of Utah?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Important Tips for Local Models&lt;br /&gt;
&lt;br /&gt;
* **Model Name:** If the CLI complains that the &amp;quot;model is not found,&amp;quot; you may need to explicitly specify the model name that `llama-server` is reporting. You can check what models are available by hitting the `/v1/models` endpoint directly:&lt;br /&gt;
`curl http://localhost:8080/v1/models`&lt;br /&gt;
* **Compatibility:** Some &amp;quot;agentic&amp;quot; features of high-end CLI tools rely on OpenAI-specific function calling or vision capabilities. Depending on the model you are running in `llama.cpp` (e.g., a standard Llama 3 vs. a specialized coding model), some advanced agentic behaviors might have varying success compared to using a proprietary model like `gpt-4o`.&lt;br /&gt;
* **SSL/HTTPS:** Ensure your `llama-server` is running on `http` (or that you handle any certificate issues if you've enabled HTTPS), as CLI tools often fail with self-signed local certificates.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5693</id>
		<title>Zfs nvme log and cache</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5693"/>
		<updated>2026-05-22T15:42:26Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo nvme list&lt;br /&gt;
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev&lt;br /&gt;
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------&lt;br /&gt;
/dev/nvme0n1          /dev/ng0n1            CVFT439300062P0EGN   INTEL SSDPEDMD020T4                      0x1          2.00  TB /   2.00  TB    512   B +  0 B   8DV101H0&lt;br /&gt;
&lt;br /&gt;
# 1. Create a new GPT partition table&lt;br /&gt;
sudo parted /dev/nvme0n1 mklabel gpt&lt;br /&gt;
&lt;br /&gt;
# 2. Create the first 1TB partition (for SLOG)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 0% 1TB&lt;br /&gt;
&lt;br /&gt;
# 3. Create the second 1TB partition (for L2ARC)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 1TB 100%&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
lsblk /dev/nvme0n1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Add to log&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; log /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
# Add to cache&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; cache /dev/nvme0n1p2&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Removal &amp;amp; Size Updates&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To change the size, you must remove the devices from the pool, repartition the drive, and add the new partitions back.&lt;br /&gt;
&lt;br /&gt;
### The Removal Process&lt;br /&gt;
&lt;br /&gt;
When you want to change the size, follow this sequence:&lt;br /&gt;
&lt;br /&gt;
1. **Remove the log (SLOG) device:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool remove &amp;lt;pool_name&amp;gt; /dev/nvme0n1p1&lt;br /&gt;
```&lt;br /&gt;
*Note: ZFS will gracefully flush any pending synchronous writes from the SLOG to the main pool before removing the device.*&lt;br /&gt;
&lt;br /&gt;
2. **Remove the cache (L2ARC) device:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool remove &amp;lt;pool_name&amp;gt; /dev/nvme0n1p2&lt;br /&gt;
```&lt;br /&gt;
*Note: This is instantaneous as it simply drops the cache.*&lt;br /&gt;
&lt;br /&gt;
3. **Repartition the drive:**&lt;br /&gt;
Use `parted` or `fdisk` to delete the old partitions and create the new ones to your desired sizes.&lt;br /&gt;
&lt;br /&gt;
4. **Add them back to the pool:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; log /dev/nvme0n1p1&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; cache /dev/nvme0n1p2&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Architect's Tip: Use LVM for Flexibility&lt;br /&gt;
If you anticipate wanting to resize these partitions later without wiping and recreating them, consider using **LVM (Logical Volume Manager)** as an intermediary layer.&lt;br /&gt;
&lt;br /&gt;
1.  Initialize the physical drive as an LVM Physical Volume (PV).&lt;br /&gt;
2.  Create a Volume Group (VG).&lt;br /&gt;
3.  Create two Logical Volumes (LV): one for SLOG, one for L2ARC.&lt;br /&gt;
4.  Point ZFS to the device-mapper paths (e.g., `/dev/mapper/vgname-lvname`).&lt;br /&gt;
&lt;br /&gt;
**Why do this?**&lt;br /&gt;
*   **Elasticity:** You can resize the logical volumes without repartitioning the drive. Shrinking a volume that ZFS is actively using is unsafe, so remove it from the pool before shrinking it.&lt;br /&gt;
*   **Abstraction:** You can move your ZFS log/cache to a different physical disk later by using `pvmove` without ever having to `zpool remove` or re-add the devices to ZFS.&lt;br /&gt;
&lt;br /&gt;
**Important Warning:** If you go the LVM route, ensure you are not creating a circular dependency where your root filesystem (and thus LVM) depends on a ZFS pool that is trying to access an LVM volume. This is a common pitfall.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5692</id>
		<title>Zfs nvme log and cache</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5692"/>
		<updated>2026-05-22T15:42:11Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo nvme list&lt;br /&gt;
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev&lt;br /&gt;
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------&lt;br /&gt;
/dev/nvme0n1          /dev/ng0n1            CVFT439300062P0EGN   INTEL SSDPEDMD020T4                      0x1          2.00  TB /   2.00  TB    512   B +  0 B   8DV101H0&lt;br /&gt;
&lt;br /&gt;
# 1. Create a new GPT partition table&lt;br /&gt;
sudo parted /dev/nvme0n1 mklabel gpt&lt;br /&gt;
&lt;br /&gt;
# 2. Create the first 1TB partition (for SLOG)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 0% 1TB&lt;br /&gt;
&lt;br /&gt;
# 3. Create the second 1TB partition (for L2ARC)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 1TB 100%&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
lsblk /dev/nvme0n1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Add to log&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; log /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
# Add to cache&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; cache /dev/nvme0n1p2&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Removal &amp;amp; Size Updates&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To change the size, you must remove the devices from the pool, repartition the drive, and add the new partitions back.&lt;br /&gt;
&lt;br /&gt;
### The Removal Process&lt;br /&gt;
&lt;br /&gt;
When you want to change the size, follow this sequence:&lt;br /&gt;
&lt;br /&gt;
1. **Remove the log (SLOG) device:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool remove &amp;lt;pool_name&amp;gt; /dev/nvme0n1p1&lt;br /&gt;
```&lt;br /&gt;
*Note: ZFS will gracefully flush any pending synchronous writes from the SLOG to the main pool before removing the device.*&lt;br /&gt;
&lt;br /&gt;
2. **Remove the cache (L2ARC) device:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool remove &amp;lt;pool_name&amp;gt; /dev/nvme0n1p2&lt;br /&gt;
```&lt;br /&gt;
*Note: This is instantaneous as it simply drops the cache.*&lt;br /&gt;
&lt;br /&gt;
3. **Repartition the drive:**&lt;br /&gt;
Use `parted` or `fdisk` to delete the old partitions and create the new ones to your desired sizes.&lt;br /&gt;
&lt;br /&gt;
4. **Add them back to the pool:**&lt;br /&gt;
```bash&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; log /dev/nvme0n1p1&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; cache /dev/nvme0n1p2&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Architect's Tip: Use LVM for Flexibility&lt;br /&gt;
If you anticipate wanting to resize these partitions later without wiping and recreating them, consider using **LVM (Logical Volume Manager)** as an intermediary layer.&lt;br /&gt;
&lt;br /&gt;
1.  Initialize the physical drive as an LVM Physical Volume (PV).&lt;br /&gt;
2.  Create a Volume Group (VG).&lt;br /&gt;
3.  Create two Logical Volumes (LV): one for SLOG, one for L2ARC.&lt;br /&gt;
4.  Point ZFS to the device-mapper paths (e.g., `/dev/mapper/vgname-lvname`).&lt;br /&gt;
&lt;br /&gt;
**Why do this?**&lt;br /&gt;
*   **Elasticity:** You can resize the logical volumes without repartitioning the drive. Shrinking a volume that ZFS is actively using is unsafe, so remove it from the pool before shrinking it.&lt;br /&gt;
*   **Abstraction:** You can move your ZFS log/cache to a different physical disk later by using `pvmove` without ever having to `zpool remove` or re-add the devices to ZFS.&lt;br /&gt;
&lt;br /&gt;
**Important Warning:** If you go the LVM route, ensure you are not creating a circular dependency where your root filesystem (and thus LVM) depends on a ZFS pool that is trying to access an LVM volume. This is a common pitfall.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5691</id>
		<title>Zfs nvme log and cache</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_nvme_log_and_cache&amp;diff=5691"/>
		<updated>2026-05-22T15:36:45Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;``` sudo nvme list Node                  Generic               SN                   Model                                    Namespace  Usage                      Format...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo nvme list&lt;br /&gt;
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev&lt;br /&gt;
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------&lt;br /&gt;
/dev/nvme0n1          /dev/ng0n1            CVFT439300062P0EGN   INTEL SSDPEDMD020T4                      0x1          2.00  TB /   2.00  TB    512   B +  0 B   8DV101H0&lt;br /&gt;
&lt;br /&gt;
# 1. Create a new GPT partition table&lt;br /&gt;
sudo parted /dev/nvme0n1 mklabel gpt&lt;br /&gt;
&lt;br /&gt;
# 2. Create the first 1TB partition (for SLOG)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 0% 1TB&lt;br /&gt;
&lt;br /&gt;
# 3. Create the second 1TB partition (for L2ARC)&lt;br /&gt;
sudo parted -a optimal /dev/nvme0n1 mkpart primary 1TB 100%&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
lsblk /dev/nvme0n1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Add to log&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; log /dev/nvme0n1p1&lt;br /&gt;
&lt;br /&gt;
# Add to cache&lt;br /&gt;
sudo zpool add &amp;lt;pool_name&amp;gt; cache /dev/nvme0n1p2&lt;br /&gt;
&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Llama_optimization&amp;diff=5690</id>
		<title>Llama optimization</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Llama_optimization&amp;diff=5690"/>
		<updated>2026-05-13T17:32:49Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Yes, that is a good **stable baseline**:  ```bash ./build/bin/llama-server \   -hf ggml-org/gpt-oss-20b-GGUF \   -ngl 999 \   -c 2048 \   -np 1 \   -t 8 \   --no-warmup \   --...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Yes, that is a good **stable baseline**:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 2048 \&lt;br /&gt;
  -np 1 \&lt;br /&gt;
  -t 8 \&lt;br /&gt;
  --no-warmup \&lt;br /&gt;
  --cache-ram 0 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For your **Threadripper + 128 GB RAM + Tesla P40**, I’d optimize for **stability first**, not max speed. Your previous log showed GPT-OSS 20B fully loaded, with the P40 detected as compute capability 6.1 and about 22.9 GiB VRAM available, so the hardware is basically working. &lt;br /&gt;
&lt;br /&gt;
## My recommended P40-stable launcher&lt;br /&gt;
&lt;br /&gt;
Use this instead:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cd ~/llama.cpp&lt;br /&gt;
&lt;br /&gt;
GGML_CUDA_DISABLE_GRAPHS=1 ./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 2048 \&lt;br /&gt;
  -np 1 \&lt;br /&gt;
  -t 12 \&lt;br /&gt;
  -b 512 \&lt;br /&gt;
  -ub 256 \&lt;br /&gt;
  --no-warmup \&lt;br /&gt;
  --cache-ram 0 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Why:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
GGML_CUDA_DISABLE_GRAPHS=1  Avoids CUDA graph weirdness on older GPUs.&lt;br /&gt;
-c 2048                     Lower context = lower KV/cache pressure.&lt;br /&gt;
-np 1                       One request/slot at a time.&lt;br /&gt;
-t 12                       Enough CPU threads without pegging Threadripper.&lt;br /&gt;
-b 512 -ub 256              Smaller batches; slower maybe, but less crashy.&lt;br /&gt;
--cache-ram 0               Avoids prompt cache save/load path that appeared near your crash.&lt;br /&gt;
--no-warmup                 Avoids extra startup GPU work.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
CUDA graphs can improve performance, but they are also an area where llama.cpp has had CUDA-specific edge cases and workarounds; disabling them is a reasonable stability choice on Pascal/P40. NVIDIA’s CUDA-graphs writeup explains why they are used for performance, while llama.cpp issue/discussion threads show cases where disabling CUDA graphs is used as a workaround. ([NVIDIA Developer][1])&lt;br /&gt;
&lt;br /&gt;
## Auto-detect launcher script&lt;br /&gt;
&lt;br /&gt;
Create this:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cat &amp;gt; ~/run-llama-gpt-oss.sh &amp;lt;&amp;lt;'EOF'&lt;br /&gt;
#!/usr/bin/env bash&lt;br /&gt;
set -euo pipefail&lt;br /&gt;
&lt;br /&gt;
MODEL_REPO=&amp;quot;${MODEL_REPO:-ggml-org/gpt-oss-20b-GGUF}&amp;quot;&lt;br /&gt;
HOST=&amp;quot;${HOST:-0.0.0.0}&amp;quot;&lt;br /&gt;
PORT=&amp;quot;${PORT:-8080}&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Detect CPU threads.&lt;br /&gt;
CPU_THREADS=&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Cap the thread count for responsiveness: 12 on 32+ threads, 8 on 16+, otherwise half (minimum 4).&lt;br /&gt;
if (( CPU_THREADS &amp;gt;= 32 )); then&lt;br /&gt;
  THREADS=12&lt;br /&gt;
elif (( CPU_THREADS &amp;gt;= 16 )); then&lt;br /&gt;
  THREADS=8&lt;br /&gt;
else&lt;br /&gt;
  THREADS=$(( CPU_THREADS / 2 ))&lt;br /&gt;
  (( THREADS &amp;lt; 4 )) &amp;amp;&amp;amp; THREADS=4&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# Detect system RAM in GiB.&lt;br /&gt;
RAM_GB=&amp;quot;$(awk '/MemTotal/ { printf &amp;quot;%d&amp;quot;, $2/1024/1024 }' /proc/meminfo)&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Detect NVIDIA VRAM in MiB.&lt;br /&gt;
if command -v nvidia-smi &amp;gt;/dev/null 2&amp;gt;&amp;amp;1; then&lt;br /&gt;
  VRAM_MIB=&amp;quot;$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1 | tr -d ' ')&amp;quot;&lt;br /&gt;
  GPU_NAME=&amp;quot;$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n1)&amp;quot;&lt;br /&gt;
else&lt;br /&gt;
  VRAM_MIB=0&lt;br /&gt;
  GPU_NAME=&amp;quot;none&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# Conservative defaults for older GPUs.&lt;br /&gt;
CTX=2048&lt;br /&gt;
NP=1&lt;br /&gt;
BATCH=512&lt;br /&gt;
UBATCH=256&lt;br /&gt;
NGL=999&lt;br /&gt;
CACHE_RAM=0&lt;br /&gt;
NO_WARMUP=&amp;quot;--no-warmup&amp;quot;&lt;br /&gt;
CUDA_ENV=&amp;quot;GGML_CUDA_DISABLE_GRAPHS=1&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# If VRAM is very small, reduce context/batch.&lt;br /&gt;
if (( VRAM_MIB &amp;gt; 0 &amp;amp;&amp;amp; VRAM_MIB &amp;lt; 16000 )); then&lt;br /&gt;
  CTX=1024&lt;br /&gt;
  BATCH=256&lt;br /&gt;
  UBATCH=128&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# If VRAM is &amp;gt;= 32GB, allow larger context.&lt;br /&gt;
if (( VRAM_MIB &amp;gt;= 32000 )); then&lt;br /&gt;
  CTX=4096&lt;br /&gt;
  BATCH=1024&lt;br /&gt;
  UBATCH=512&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
# Detect Pascal/P40-ish GPU and keep safer settings.&lt;br /&gt;
if echo &amp;quot;$GPU_NAME&amp;quot; | grep -Eiq 'P40|P100|Pascal|GTX 10'; then&lt;br /&gt;
  CTX=2048&lt;br /&gt;
  NP=1&lt;br /&gt;
  BATCH=512&lt;br /&gt;
  UBATCH=256&lt;br /&gt;
  CACHE_RAM=0&lt;br /&gt;
  CUDA_ENV=&amp;quot;GGML_CUDA_DISABLE_GRAPHS=1&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Detected:&amp;quot;&lt;br /&gt;
echo &amp;quot;  CPU threads: $CPU_THREADS&amp;quot;&lt;br /&gt;
echo &amp;quot;  RAM:         ${RAM_GB} GiB&amp;quot;&lt;br /&gt;
echo &amp;quot;  GPU:         $GPU_NAME&amp;quot;&lt;br /&gt;
echo &amp;quot;  VRAM:        ${VRAM_MIB} MiB&amp;quot;&lt;br /&gt;
echo&lt;br /&gt;
echo &amp;quot;Launching llama-server:&amp;quot;&lt;br /&gt;
echo &amp;quot;  model repo:  $MODEL_REPO&amp;quot;&lt;br /&gt;
echo &amp;quot;  ctx:         $CTX&amp;quot;&lt;br /&gt;
echo &amp;quot;  parallel:    $NP&amp;quot;&lt;br /&gt;
echo &amp;quot;  threads:     $THREADS&amp;quot;&lt;br /&gt;
echo &amp;quot;  batch:       $BATCH&amp;quot;&lt;br /&gt;
echo &amp;quot;  ubatch:      $UBATCH&amp;quot;&lt;br /&gt;
echo &amp;quot;  cache-ram:   $CACHE_RAM&amp;quot;&lt;br /&gt;
echo&lt;br /&gt;
&lt;br /&gt;
cd &amp;quot;$HOME/llama.cpp&amp;quot;&lt;br /&gt;
&lt;br /&gt;
exec env $CUDA_ENV ./build/bin/llama-server \&lt;br /&gt;
  -hf &amp;quot;$MODEL_REPO&amp;quot; \&lt;br /&gt;
  -ngl &amp;quot;$NGL&amp;quot; \&lt;br /&gt;
  -c &amp;quot;$CTX&amp;quot; \&lt;br /&gt;
  -np &amp;quot;$NP&amp;quot; \&lt;br /&gt;
  -t &amp;quot;$THREADS&amp;quot; \&lt;br /&gt;
  -b &amp;quot;$BATCH&amp;quot; \&lt;br /&gt;
  -ub &amp;quot;$UBATCH&amp;quot; \&lt;br /&gt;
  $NO_WARMUP \&lt;br /&gt;
  --cache-ram &amp;quot;$CACHE_RAM&amp;quot; \&lt;br /&gt;
  --host &amp;quot;$HOST&amp;quot; \&lt;br /&gt;
  --port &amp;quot;$PORT&amp;quot;&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
chmod +x ~/run-llama-gpt-oss.sh&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Run it:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
~/run-llama-gpt-oss.sh&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Test command&lt;br /&gt;
&lt;br /&gt;
In another terminal:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
curl -N http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d '{&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
    &amp;quot;stream&amp;quot;: true,&lt;br /&gt;
    &amp;quot;messages&amp;quot;: [&lt;br /&gt;
      {&lt;br /&gt;
        &amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;,&lt;br /&gt;
        &amp;quot;content&amp;quot;: &amp;quot;Output only valid Python code. No markdown. Create a small Dog class with name, breed, age, tricks, add_trick(), and __str__().&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    ],&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.2,&lt;br /&gt;
    &amp;quot;max_tokens&amp;quot;: 250&lt;br /&gt;
  }'&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## About auto-detecting `max_tokens`&lt;br /&gt;
&lt;br /&gt;
`max_tokens` is **per request**, not really a server setting. The server can control context size with `-c`, but each API request should still set `max_tokens`.&lt;br /&gt;
&lt;br /&gt;
Simple rule:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Small answer:      max_tokens 50-150&lt;br /&gt;
Small code:        max_tokens 200-500&lt;br /&gt;
Medium code:       max_tokens 800-1500&lt;br /&gt;
Long file/design:  max_tokens 2000+&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40, I’d keep most tests at:&lt;br /&gt;
&lt;br /&gt;
```json&lt;br /&gt;
&amp;quot;max_tokens&amp;quot;: 250&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then increase only when needed.&lt;br /&gt;
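&lt;br /&gt;
The rule of thumb above can be captured in a small helper. This is only a sketch: the function names `pick_max_tokens` and `clamp_max_tokens` and the category labels are hypothetical, the values are the upper ends of the ranges in the table (2000 for the open-ended last row), and the prompt token count is an estimate you supply yourself. The clamp assumes that prompt and output together must fit in the context set with `-c`.&lt;br /&gt;
&lt;br /&gt;
```bash
#!/usr/bin/env bash
# Hypothetical helper: map a rough task size to a max_tokens value.
pick_max_tokens() {
  case "$1" in
    small-answer) echo 150 ;;
    small-code)   echo 500 ;;
    medium-code)  echo 1500 ;;
    long)         echo 2000 ;;
    *)            echo 250 ;;   # default used for most P40 tests
  esac
}

# Clamp a requested max_tokens so prompt + output fits in the context (-c).
# Arguments: context size, estimated prompt tokens, requested max_tokens.
clamp_max_tokens() {
  local ctx="$1" prompt_tokens="$2" requested="$3"
  local room=$(( ctx - prompt_tokens ))
  if [ "$room" -lt 0 ]; then room=0; fi
  if [ "$requested" -gt "$room" ]; then
    echo "$room"
  else
    echo "$requested"
  fi
}
```
&lt;br /&gt;
For example, `clamp_max_tokens 2048 300 2000` prints `1748`, since only 1748 tokens of a 2048-token context remain after a 300-token prompt.&lt;br /&gt;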
&lt;br /&gt;
## Add a simple curl wrapper&lt;br /&gt;
&lt;br /&gt;
Create this:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cat &amp;gt; ~/ask-llama.sh &amp;lt;&amp;lt;'EOF'&lt;br /&gt;
#!/usr/bin/env bash&lt;br /&gt;
set -euo pipefail&lt;br /&gt;
&lt;br /&gt;
PROMPT=&amp;quot;${*:-Say hello.}&amp;quot;&lt;br /&gt;
MAX_TOKENS=&amp;quot;${MAX_TOKENS:-300}&amp;quot;&lt;br /&gt;
&lt;br /&gt;
curl -s http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d &amp;quot;$(jq -n \&lt;br /&gt;
    --arg prompt &amp;quot;$PROMPT&amp;quot; \&lt;br /&gt;
    --argjson max_tokens &amp;quot;$MAX_TOKENS&amp;quot; \&lt;br /&gt;
    '{&lt;br /&gt;
      model: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
      messages: [&lt;br /&gt;
        {&lt;br /&gt;
          role: &amp;quot;user&amp;quot;,&lt;br /&gt;
          content: $prompt&lt;br /&gt;
        }&lt;br /&gt;
      ],&lt;br /&gt;
      temperature: 0.2,&lt;br /&gt;
      max_tokens: $max_tokens&lt;br /&gt;
    }')&amp;quot; | jq -r '.choices[0].message.content'&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
chmod +x ~/ask-llama.sh&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use it:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
~/ask-llama.sh &amp;quot;Output only valid Python code. Create a hello world script.&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
With more output:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
MAX_TOKENS=800 ~/ask-llama.sh &amp;quot;Output only valid Python code. Create a Dog class with name, breed, age, tricks, add_trick(), and __str__().&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## My final recommendation&lt;br /&gt;
&lt;br /&gt;
For your P40, use:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
GGML_CUDA_DISABLE_GRAPHS=1&lt;br /&gt;
-c 2048&lt;br /&gt;
-np 1&lt;br /&gt;
-t 12&lt;br /&gt;
-b 512&lt;br /&gt;
-ub 256&lt;br /&gt;
--cache-ram 0&lt;br /&gt;
--no-warmup&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
After it is stable for a while, try performance increases one at a time:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
1. Increase -t from 12 to 16&lt;br /&gt;
2. Increase -c from 2048 to 4096&lt;br /&gt;
3. Increase -b 512 to 1024&lt;br /&gt;
4. Re-enable CUDA graphs only if everything else is stable&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Do not tune everything at once, because then you will not know what caused the next crash.&lt;br /&gt;
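&lt;br /&gt;
One way to keep the changes isolated is to generate the settings for each step from a single baseline. The sketch below is illustrative only: the function name `tuning_flags` is hypothetical, it prints settings rather than launching `llama-server`, and it treats "re-enable CUDA graphs" as dropping the `GGML_CUDA_DISABLE_GRAPHS=1` variable.&lt;br /&gt;
&lt;br /&gt;
```bash
#!/usr/bin/env bash
# Hypothetical helper: print the settings for tuning step N (0 = baseline).
# Steps are cumulative, so consecutive steps differ in exactly one setting.
tuning_flags() {
  local step="$1"
  local threads=12 ctx=2048 batch=512
  local graphs_env="GGML_CUDA_DISABLE_GRAPHS=1 "
  if [ "$step" -ge 1 ]; then threads=16; fi
  if [ "$step" -ge 2 ]; then ctx=4096; fi
  if [ "$step" -ge 3 ]; then batch=1024; fi
  if [ "$step" -ge 4 ]; then graphs_env=""; fi   # CUDA graphs back on
  echo "${graphs_env}-t ${threads} -c ${ctx} -b ${batch} -ub 256"
}

# Show every step so the single change per step is easy to see.
for step in 0 1 2 3 4; do
  echo "step ${step}: $(tuning_flags "$step")"
done
```
&lt;br /&gt;
Run each step long enough to trust it before moving to the next; if a crash appears, the last changed setting is the first suspect.&lt;br /&gt;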
&lt;br /&gt;
[1]: https://developer.nvidia.com/blog/optimizing-llama-cpp-ai-inference-with-cuda-graphs/ &amp;quot;Optimizing llama.cpp AI Inference with CUDA Graphs&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Llama_1&amp;diff=5689</id>
		<title>Llama 1</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Llama_1&amp;diff=5689"/>
		<updated>2026-05-13T16:32:10Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;`model.gguf` was just a placeholder. You need to download an actual `.gguf` file, such as:  ```text gpt-oss-20b-mxfp4.gguf ```  For GPT-OSS 20B, the Hugging Face repo is:  ```...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;`model.gguf` was just a placeholder. You need to download an actual `.gguf` file, such as:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
gpt-oss-20b-mxfp4.gguf&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For GPT-OSS 20B, the Hugging Face repo is:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
ggml-org/gpt-oss-20b-GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The model page shows `gpt-oss-20b-mxfp4.gguf` as the file used by `llama.cpp` / `llama-cpp-python`, and it also supports direct `llama-server -hf ggml-org/gpt-oss-20b-GGUF`. ([Hugging Face][1])&lt;br /&gt;
&lt;br /&gt;
## Easiest: let `llama.cpp` download it&lt;br /&gt;
&lt;br /&gt;
From your `llama.cpp` directory:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cd ~/llama.cpp&lt;br /&gt;
&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 4096 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The official quick start for this repo is basically:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
llama-server -hf ggml-org/gpt-oss-20b-GGUF -c 0 --jinja&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
([Hugging Face][2])&lt;br /&gt;
&lt;br /&gt;
For your P40, I would start with:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-c 4096&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
rather than `-c 0`, which takes the context size from the model instead of capping it.&lt;br /&gt;
&lt;br /&gt;
## Manual download to a real file&lt;br /&gt;
&lt;br /&gt;
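Install the Hugging Face CLI. On recent Ubuntu releases the system `pip` refuses `--user` installs outside a virtual environment (PEP 668), so `pipx` is the simpler route:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt install -y pipx&lt;br /&gt;
pipx install huggingface_hub&lt;br /&gt;
```&lt;br /&gt;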
&lt;br /&gt;
Make sure your user-local Python bin path is active:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
export PATH=&amp;quot;$HOME/.local/bin:$PATH&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Create a model directory:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
mkdir -p ~/models/gpt-oss-20b&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Download the GGUF file:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
huggingface-cli download ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  gpt-oss-20b-mxfp4.gguf \&lt;br /&gt;
  --local-dir ~/models/gpt-oss-20b \&lt;br /&gt;
  --local-dir-use-symlinks False&lt;br /&gt;
```&lt;br /&gt;
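&lt;br /&gt;
A quick sanity check after the download, as a sketch that assumes the paths used above. GGUF files start with the four ASCII bytes `GGUF`, so the size listing helps catch a truncated download and the magic bytes catch a saved error page:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Show the file size, then print the first four bytes (should read GGUF)&lt;br /&gt;
ls -lh ~/models/gpt-oss-20b/gpt-oss-20b-mxfp4.gguf&lt;br /&gt;
head -c 4 ~/models/gpt-oss-20b/gpt-oss-20b-mxfp4.gguf; echo&lt;br /&gt;
```&lt;br /&gt;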
&lt;br /&gt;
Then run it:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cd ~/llama.cpp&lt;br /&gt;
&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -m ~/models/gpt-oss-20b/gpt-oss-20b-mxfp4.gguf \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 4096 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Test it&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
curl http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d '{&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
    &amp;quot;messages&amp;quot;: [&lt;br /&gt;
      {&lt;br /&gt;
        &amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;,&lt;br /&gt;
        &amp;quot;content&amp;quot;: &amp;quot;Write a minimal Go HTTP health check server.&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    ],&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.2&lt;br /&gt;
  }'&lt;br /&gt;
```&lt;br /&gt;
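&lt;br /&gt;
If the chat request hangs or is refused, it can help to confirm that the server has finished loading the model. A minimal sketch, assuming the same host and port as above and a `llama-server` build that exposes the `/health` and `/v1/models` endpoints (recent builds do; treat it as an assumption for older ones):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Readiness probe: prints the HTTP status code, 200 once the model is loaded&lt;br /&gt;
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080/health&lt;br /&gt;
&lt;br /&gt;
# List the model the server reports&lt;br /&gt;
curl -s http://127.0.0.1:8080/v1/models&lt;br /&gt;
```&lt;br /&gt;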
&lt;br /&gt;
## For Qwen later&lt;br /&gt;
&lt;br /&gt;
Same idea, but choose a Qwen GGUF repo instead. For example, Qwen’s docs show running Qwen models through `llama.cpp` using GGUF files. ([Hugging Face][3])&lt;br /&gt;
&lt;br /&gt;
For now, get GPT-OSS 20B working first with:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-hf ggml-org/gpt-oss-20b-GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
or with the downloaded file:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-m ~/models/gpt-oss-20b/gpt-oss-20b-mxfp4.gguf&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
[1]: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF &amp;quot;ggml-org/gpt-oss-20b-GGUF&amp;quot;&lt;br /&gt;
[2]: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/resolve/main/README.md?download=true &amp;quot;README.md&amp;quot;&lt;br /&gt;
[3]: https://huggingface.co/docs/inference-endpoints/engines/llama_cpp &amp;quot;llama.cpp&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5688</id>
		<title>Tesla P40 llama.cpp</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5688"/>
		<updated>2026-05-13T16:25:46Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y nvidia-cuda-toolkit&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If the CUDA build fails because CMake cannot find `nvcc`, the cause is simple: **your NVIDIA driver is installed, but the CUDA Toolkit compiler is not.**&lt;br /&gt;
&lt;br /&gt;
`nvidia-smi` working means the **driver/runtime** is working.&lt;br /&gt;
Building `llama.cpp` with CUDA requires **`nvcc`**, which comes from the **CUDA Toolkit**, not from `nvidia-utils`. NVIDIA’s CUDA Linux install guide separates the driver from CUDA development tools/toolkit. ([NVIDIA Docs][1])&lt;br /&gt;
&lt;br /&gt;
Check:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
which nvcc&lt;br /&gt;
nvcc --version&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Right now those probably fail.&lt;br /&gt;
&lt;br /&gt;
## Best fix on Ubuntu 24.04&lt;br /&gt;
&lt;br /&gt;
Since you are using `580-server` for the Tesla P40, I would install the Ubuntu CUDA toolkit package first:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y nvidia-cuda-toolkit&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then verify:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
which nvcc&lt;br /&gt;
nvcc --version&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then rebuild `llama.cpp` cleanly:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
cd ~/llama.cpp&lt;br /&gt;
&lt;br /&gt;
rm -rf build&lt;br /&gt;
&lt;br /&gt;
cmake -B build \&lt;br /&gt;
  -DGGML_CUDA=ON \&lt;br /&gt;
  -DCMAKE_BUILD_TYPE=Release \&lt;br /&gt;
  -DCMAKE_CUDA_ARCHITECTURES=61&lt;br /&gt;
&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The important part for Tesla P40 is:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-DCMAKE_CUDA_ARCHITECTURES=61&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
P40 is Pascal, compute capability **6.1**. Setting that explicitly avoids CMake trying to guess or building unnecessary architectures.&lt;br /&gt;
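&lt;br /&gt;
To confirm the value on the machine itself, the driver can report it. A sketch, assuming an `nvidia-smi` recent enough to support the `compute_cap` query field; drop the dot from the reported number to get the CMake value:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Reports GPU name and compute capability, e.g. 6.1 for a Tesla P40&lt;br /&gt;
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader&lt;br /&gt;
```&lt;br /&gt;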
&lt;br /&gt;
## If `nvidia-cuda-toolkit` is too old or missing&lt;br /&gt;
&lt;br /&gt;
Use NVIDIA’s CUDA repo/toolkit packages instead. For a P40, stay on CUDA 12.x: CUDA 13 removed Pascal (compute capability 6.1) as a compile target, so it cannot build `llama.cpp` for this card.&lt;br /&gt;
&lt;br /&gt;
Example pattern:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt install -y cuda-toolkit-12-8&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Some Ubuntu 24.04 CUDA install docs show installing `cuda-toolkit-12-8` from NVIDIA’s repo after adding the CUDA repo/keyring. ([docs.leaf.cloud][2])&lt;br /&gt;
&lt;br /&gt;
But try this first:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt install -y nvidia-cuda-toolkit&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Do not reinstall the NVIDIA driver&lt;br /&gt;
&lt;br /&gt;
Do **not** run a broad CUDA installer that replaces your working `580-server` driver with 590/595. As far as I know, the 580 branch is the last one that supports Pascal cards.&lt;br /&gt;
&lt;br /&gt;
You want:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Driver:       nvidia-driver-580-server&lt;br /&gt;
CUDA toolkit: nvcc/compiler only&lt;br /&gt;
llama.cpp:    built with GGML_CUDA=ON&lt;br /&gt;
Arch:         61 for Tesla P40&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
After build, test:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-cli --list-devices&lt;br /&gt;
./build/bin/llama-bench --help&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then run a small model or server with:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -m /path/to/model.gguf \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 4096 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Your driver is fine now; you just need `nvcc`.&lt;br /&gt;
&lt;br /&gt;
[1]: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/ &amp;quot;CUDA Installation Guide for Linux&amp;quot;&lt;br /&gt;
[2]: https://docs.leaf.cloud/en/latest/data-science/installing-gpu-drivers/ &amp;quot;Installing NVIDIA Drivers and CUDA on Ubuntu 24.04&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Yes — for a **Tesla P40**, I would use **`llama.cpp` first** for GPT-OSS, Qwen, DeepSeek, Llama, Gemma, etc.&lt;br /&gt;
&lt;br /&gt;
Your P40 is a great cheap 24 GB VRAM card, but it is **Pascal**, so it lacks Tensor Cores and newer datatype acceleration. That makes `llama.cpp` a better fit than heavier stacks like vLLM/TGI for most local single-GPU use.&lt;br /&gt;
&lt;br /&gt;
## Recommendation for your P40&lt;br /&gt;
&lt;br /&gt;
Use this stack:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Best low-friction stack:&lt;br /&gt;
llama.cpp + GGUF models + CUDA build&lt;br /&gt;
&lt;br /&gt;
Optional convenience layer:&lt;br /&gt;
Ollama or LM Studio&lt;br /&gt;
&lt;br /&gt;
Avoid as first choice on P40:&lt;br /&gt;
vLLM, TensorRT-LLM, exllamav2 for GPT-OSS&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
`llama.cpp` has active support for GPT-OSS and GGUF models, and there are ready GGUF builds for `gpt-oss-20b`. Hugging Face’s GGUF page for `ggml-org/gpt-oss-20b-GGUF` shows direct `llama.cpp`, `llama-cpp-python`, Ollama, LM Studio, and Jan usage paths. ([Hugging Face][1])&lt;br /&gt;
&lt;br /&gt;
For Qwen, the official Qwen docs also document running Qwen3/Qwen3MoE with `llama.cpp` and GGUF, with support starting from specific llama.cpp builds. ([Qwen][2])&lt;br /&gt;
&lt;br /&gt;
## What models fit your P40?&lt;br /&gt;
&lt;br /&gt;
With 24 GB VRAM:&lt;br /&gt;
&lt;br /&gt;
| Model type                       | Recommendation                              |&lt;br /&gt;
| -------------------------------- | ------------------------------------------- |&lt;br /&gt;
| GPT-OSS 20B                      | Good target                                 |&lt;br /&gt;
| GPT-OSS 120B                     | Not realistic on one P40                    |&lt;br /&gt;
| Qwen 7B / 14B / 30B-A3B MoE      | Good targets depending on quant             |&lt;br /&gt;
| DeepSeek Coder / Qwen Coder GGUF | Good for coding                             |&lt;br /&gt;
| Llama 3.x 8B / 70B quantized     | 8B easy; 70B only with partial CPU offload  |&lt;br /&gt;
| Mixtral / MoE GGUF               | Can work, speed varies                      |&lt;br /&gt;
&lt;br /&gt;
For your use — Go, SQL, Python, React, infra code — I would start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
1. Qwen coder GGUF model&lt;br /&gt;
2. GPT-OSS 20B GGUF&lt;br /&gt;
3. DeepSeek coder/distill GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40 specifically, **GGUF quantized models** are the practical path.&lt;br /&gt;
&lt;br /&gt;
## Build `llama.cpp` with CUDA&lt;br /&gt;
&lt;br /&gt;
On Ubuntu:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Check that the build sees the GPU (the P40 should be listed as a CUDA device):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-cli --list-devices&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Run GPT-OSS 20B&lt;br /&gt;
&lt;br /&gt;
Example using llama.cpp’s Hugging Face download support:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then test OpenAI-compatible API:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
curl http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d '{&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
    &amp;quot;messages&amp;quot;: [&lt;br /&gt;
      {&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Write a small Go HTTP server with health check.&amp;quot;}&lt;br /&gt;
    ],&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.2&lt;br /&gt;
  }'&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If VRAM gets tight, reduce context:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-c 4096&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
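Or reduce GPU layers to a value below the model’s layer count (`gpt-oss-20b` has 24 layers, so `-ngl 60` would still offload everything):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-ngl 20&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
On a 24 GB P40, though, most 20B-class quantized GGUFs should fit fully on the GPU at these context sizes.&lt;br /&gt;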
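&lt;br /&gt;
To see how close a given `-c` / `-ngl` combination comes to the 24 GB limit, VRAM use can be watched while the server loads. A sketch using standard `nvidia-smi` query fields; the 2-second interval is an arbitrary choice:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Print used and total VRAM every 2 seconds; stop with Ctrl+C&lt;br /&gt;
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 2&lt;br /&gt;
```&lt;br /&gt;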
&lt;br /&gt;
## Run a Qwen coder model&lt;br /&gt;
&lt;br /&gt;
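For coding, Qwen is probably where I would spend most of my time. Check the chosen quant’s file size against the 24 GB of VRAM first; if it is larger, lower `-ngl` so part of the model stays on the CPU. Example:&lt;br /&gt;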
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf unsloth/Qwen3-Coder-Next-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The Unsloth Qwen GGUF page notes recent llama.cpp fixes for Qwen output/tool-calling behavior, so use a fresh llama.cpp build rather than an old distro package. ([Hugging Face][3])&lt;br /&gt;
&lt;br /&gt;
## Should you use Ollama instead?&lt;br /&gt;
&lt;br /&gt;
Ollama is easier, but `llama.cpp` gives you more direct control.&lt;br /&gt;
&lt;br /&gt;
Use **Ollama** if you want simple:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
ollama run hf.co/ggml-org/gpt-oss-20b-GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use **llama.cpp directly** if you care about:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
GPU layer tuning&lt;br /&gt;
context size tuning&lt;br /&gt;
exact GGUF file selection&lt;br /&gt;
server flags&lt;br /&gt;
benchmarking&lt;br /&gt;
performance debugging&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Given how you work, I’d use **llama.cpp directly** first.&lt;br /&gt;
&lt;br /&gt;
## Should you use vLLM?&lt;br /&gt;
&lt;br /&gt;
For your **P40**, probably no — not as your first runtime.&lt;br /&gt;
&lt;br /&gt;
vLLM is excellent for newer GPUs and serving many users, but its documented GPU requirement is compute capability 7.0 or higher, and it benefits heavily from Tensor Cores / modern attention kernels. On a Pascal P40 (compute capability 6.1), `llama.cpp` is usually simpler and more forgiving.&lt;br /&gt;
&lt;br /&gt;
I would use vLLM when you have something like:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
RTX 3090&lt;br /&gt;
RTX 4090&lt;br /&gt;
RTX PRO 4500 Blackwell&lt;br /&gt;
A10/A40/A100/H100/L40S&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
llama.cpp &amp;gt; Ollama &amp;gt; exllamav2/vLLM experiments&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## My practical recommendation&lt;br /&gt;
&lt;br /&gt;
Start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Runtime: llama.cpp&lt;br /&gt;
Driver: 580-server&lt;br /&gt;
Model format: GGUF&lt;br /&gt;
First model: gpt-oss-20b-GGUF&lt;br /&gt;
Coding model: Qwen coder GGUF&lt;br /&gt;
Context: 4096 or 8192 first&lt;br /&gt;
GPU layers: -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then benchmark:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-bench -m /path/to/model.gguf -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For a single Tesla P40, **don’t chase the newest serving framework**. Use stable `580-server`, fresh `llama.cpp`, and GGUF models. That will give you the fewest headaches and the best compatibility.&lt;br /&gt;
&lt;br /&gt;
[1]: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF &amp;quot;ggml-org/gpt-oss-20b-GGUF&amp;quot;&lt;br /&gt;
[2]: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html &amp;quot;llama.cpp - Qwen - Read the Docs&amp;quot;&lt;br /&gt;
[3]: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF &amp;quot;unsloth/Qwen3-Coder-Next-GGUF&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5687</id>
		<title>Tesla P40 llama.cpp</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5687"/>
		<updated>2026-05-13T16:21:34Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y nvidia-cuda-toolkit&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Yes — for a **Tesla P40**, I would use **`llama.cpp` first** for GPT-OSS, Qwen, DeepSeek, Llama, Gemma, etc.&lt;br /&gt;
&lt;br /&gt;
Your P40 is a great cheap 24 GB VRAM card, but it is **Pascal**, so it lacks Tensor Cores and newer datatype acceleration. That makes `llama.cpp` a better fit than heavier stacks like vLLM/TGI for most local single-GPU use.&lt;br /&gt;
&lt;br /&gt;
## Recommendation for your P40&lt;br /&gt;
&lt;br /&gt;
Use this stack:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Best low-friction stack:&lt;br /&gt;
llama.cpp + GGUF models + CUDA build&lt;br /&gt;
&lt;br /&gt;
Optional convenience layer:&lt;br /&gt;
Ollama or LM Studio&lt;br /&gt;
&lt;br /&gt;
Avoid as first choice on P40:&lt;br /&gt;
vLLM, TensorRT-LLM, exllamav2 for GPT-OSS&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
`llama.cpp` has active support for GPT-OSS and GGUF models, and there are ready GGUF builds for `gpt-oss-20b`. Hugging Face’s GGUF page for `ggml-org/gpt-oss-20b-GGUF` shows direct `llama.cpp`, `llama-cpp-python`, Ollama, LM Studio, and Jan usage paths. ([Hugging Face][1])&lt;br /&gt;
&lt;br /&gt;
For Qwen, the official Qwen docs also document running Qwen3/Qwen3MoE with `llama.cpp` and GGUF, with support starting from specific llama.cpp builds. ([Qwen][2])&lt;br /&gt;
&lt;br /&gt;
## What models fit your P40?&lt;br /&gt;
&lt;br /&gt;
With 24 GB VRAM:&lt;br /&gt;
&lt;br /&gt;
| Model type                       | Recommendation                              |&lt;br /&gt;
| -------------------------------- | ------------------------------------------- |&lt;br /&gt;
| GPT-OSS 20B                      | Good target                                 |&lt;br /&gt;
| GPT-OSS 120B                     | Not realistic on one P40                    |&lt;br /&gt;
| Qwen 7B / 14B / 30B-A3B MoE      | Good targets depending on quant             |&lt;br /&gt;
| DeepSeek Coder / Qwen Coder GGUF | Good for coding                             |&lt;br /&gt;
| Llama 3.x 8B / 70B quantized     | 8B easy; 70B only with partial CPU offload  |&lt;br /&gt;
| Mixtral / MoE GGUF               | Can work, speed varies                      |&lt;br /&gt;
&lt;br /&gt;
For your use — Go, SQL, Python, React, infra code — I would start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
1. Qwen coder GGUF model&lt;br /&gt;
2. GPT-OSS 20B GGUF&lt;br /&gt;
3. DeepSeek coder/distill GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40 specifically, **GGUF quantized models** are the practical path.&lt;br /&gt;
&lt;br /&gt;
## Build `llama.cpp` with CUDA&lt;br /&gt;
&lt;br /&gt;
On Ubuntu:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Check that the build sees the GPU (the P40 should be listed as a CUDA device):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-cli --list-devices&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Run GPT-OSS 20B&lt;br /&gt;
&lt;br /&gt;
Example using llama.cpp’s Hugging Face download support:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then test OpenAI-compatible API:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
curl http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d '{&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
    &amp;quot;messages&amp;quot;: [&lt;br /&gt;
      {&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Write a small Go HTTP server with health check.&amp;quot;}&lt;br /&gt;
    ],&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.2&lt;br /&gt;
  }'&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If VRAM gets tight, reduce context:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-c 4096&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
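Or reduce GPU layers to a value below the model’s layer count (`gpt-oss-20b` has 24 layers, so `-ngl 60` would still offload everything):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-ngl 20&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
On a 24 GB P40, though, most 20B-class quantized GGUFs should fit fully on the GPU at these context sizes.&lt;br /&gt;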
&lt;br /&gt;
## Run a Qwen coder model&lt;br /&gt;
&lt;br /&gt;
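For coding, Qwen is probably where I would spend most of my time. Check the chosen quant’s file size against the 24 GB of VRAM first; if it is larger, lower `-ngl` so part of the model stays on the CPU. Example:&lt;br /&gt;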
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf unsloth/Qwen3-Coder-Next-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The Unsloth Qwen GGUF page notes recent llama.cpp fixes for Qwen output/tool-calling behavior, so use a fresh llama.cpp build rather than an old distro package. ([Hugging Face][3])&lt;br /&gt;
&lt;br /&gt;
## Should you use Ollama instead?&lt;br /&gt;
&lt;br /&gt;
Ollama is easier, but `llama.cpp` gives you more direct control.&lt;br /&gt;
&lt;br /&gt;
Use **Ollama** if you want simple:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
ollama run hf.co/ggml-org/gpt-oss-20b-GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use **llama.cpp directly** if you care about:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
GPU layer tuning&lt;br /&gt;
context size tuning&lt;br /&gt;
exact GGUF file selection&lt;br /&gt;
server flags&lt;br /&gt;
benchmarking&lt;br /&gt;
performance debugging&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Given how you work, I’d use **llama.cpp directly** first.&lt;br /&gt;
&lt;br /&gt;
## Should you use vLLM?&lt;br /&gt;
&lt;br /&gt;
For your **P40**, probably no — not as your first runtime.&lt;br /&gt;
&lt;br /&gt;
vLLM is excellent for newer GPUs and serving many users, but its documented GPU requirement is compute capability 7.0 or higher, and it benefits heavily from Tensor Cores / modern attention kernels. On a Pascal P40 (compute capability 6.1), `llama.cpp` is usually simpler and more forgiving.&lt;br /&gt;
&lt;br /&gt;
I would use vLLM when you have something like:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
RTX 3090&lt;br /&gt;
RTX 4090&lt;br /&gt;
RTX PRO 4500 Blackwell&lt;br /&gt;
A10/A40/A100/H100/L40S&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
llama.cpp &amp;gt; Ollama &amp;gt; exllamav2/vLLM experiments&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## My practical recommendation&lt;br /&gt;
&lt;br /&gt;
Start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Runtime: llama.cpp&lt;br /&gt;
Driver: 580-server&lt;br /&gt;
Model format: GGUF&lt;br /&gt;
First model: gpt-oss-20b-GGUF&lt;br /&gt;
Coding model: Qwen coder GGUF&lt;br /&gt;
Context: 4096 or 8192 first&lt;br /&gt;
GPU layers: -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then benchmark:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-bench -m /path/to/model.gguf -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For a single Tesla P40, **don’t chase the newest serving framework**. Use stable `580-server`, fresh `llama.cpp`, and GGUF models. That will give you the fewest headaches and the best compatibility.&lt;br /&gt;
&lt;br /&gt;
[1]: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF &amp;quot;ggml-org/gpt-oss-20b-GGUF&amp;quot;&lt;br /&gt;
[2]: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html &amp;quot;llama.cpp - Qwen - Read the Docs&amp;quot;&lt;br /&gt;
[3]: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF &amp;quot;unsloth/Qwen3-Coder-Next-GGUF&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5686</id>
		<title>Tesla P40 llama.cpp</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Tesla_P40_llama.cpp&amp;diff=5686"/>
		<updated>2026-05-13T16:19:32Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;``` sudo apt update sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev  git clone https://github.com/ggml-org/llama.cpp cd llama.cpp  cmake -B build -DGGM...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Yes — for a **Tesla P40**, I would use **`llama.cpp` first** for GPT-OSS, Qwen, DeepSeek, Llama, Gemma, etc.&lt;br /&gt;
&lt;br /&gt;
Your P40 is a great cheap 24 GB VRAM card, but it is **Pascal**, so it lacks Tensor Cores and newer datatype acceleration. That makes `llama.cpp` a better fit than heavier stacks like vLLM/TGI for most local single-GPU use.&lt;br /&gt;
&lt;br /&gt;
## Recommendation for your P40&lt;br /&gt;
&lt;br /&gt;
Use this stack:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Best low-friction stack:&lt;br /&gt;
llama.cpp + GGUF models + CUDA build&lt;br /&gt;
&lt;br /&gt;
Optional convenience layer:&lt;br /&gt;
Ollama or LM Studio&lt;br /&gt;
&lt;br /&gt;
Avoid as first choice on P40:&lt;br /&gt;
vLLM, TensorRT-LLM, exllamav2 for GPT-OSS&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
`llama.cpp` has active support for GPT-OSS and GGUF models, and there are ready GGUF builds for `gpt-oss-20b`. Hugging Face’s GGUF page for `ggml-org/gpt-oss-20b-GGUF` shows direct `llama.cpp`, `llama-cpp-python`, Ollama, LM Studio, and Jan usage paths. ([Hugging Face][1])&lt;br /&gt;
&lt;br /&gt;
For Qwen, the official Qwen docs also document running Qwen3/Qwen3MoE with `llama.cpp` and GGUF, with support starting from specific llama.cpp builds. ([Qwen][2])&lt;br /&gt;
&lt;br /&gt;
## What models fit your P40?&lt;br /&gt;
&lt;br /&gt;
With 24 GB VRAM:&lt;br /&gt;
&lt;br /&gt;
| Model type                       | Recommendation                              |&lt;br /&gt;
| -------------------------------- | ------------------------------------------- |&lt;br /&gt;
| GPT-OSS 20B                      | Good target                                 |&lt;br /&gt;
| GPT-OSS 120B                     | Not realistic on one P40                    |&lt;br /&gt;
| Qwen 7B / 14B / 30B-A3B MoE      | Good targets depending on quant             |&lt;br /&gt;
| DeepSeek Coder / Qwen Coder GGUF | Good for coding                             |&lt;br /&gt;
| Llama 3.x 8B / 70B quantized     | 8B easy; 70B only with partial CPU offload  |&lt;br /&gt;
| Mixtral / MoE GGUF               | Can work, speed varies                      |&lt;br /&gt;
&lt;br /&gt;
For your use — Go, SQL, Python, React, infra code — I would start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
1. Qwen coder GGUF model&lt;br /&gt;
2. GPT-OSS 20B GGUF&lt;br /&gt;
3. DeepSeek coder/distill GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40 specifically, **GGUF quantized models** are the practical path.&lt;br /&gt;
&lt;br /&gt;
## Build `llama.cpp` with CUDA&lt;br /&gt;
&lt;br /&gt;
On Ubuntu:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install -y git cmake build-essential curl libcurl4-openssl-dev&lt;br /&gt;
&lt;br /&gt;
git clone https://github.com/ggml-org/llama.cpp&lt;br /&gt;
cd llama.cpp&lt;br /&gt;
&lt;br /&gt;
cmake -B build -DGGML_CUDA=ON&lt;br /&gt;
cmake --build build --config Release -j&amp;quot;$(nproc)&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Check that the build sees the GPU (the P40 should be listed as a CUDA device):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-cli --list-devices&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## Run GPT-OSS 20B&lt;br /&gt;
&lt;br /&gt;
Example using llama.cpp’s Hugging Face download support:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf ggml-org/gpt-oss-20b-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then test OpenAI-compatible API:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
curl http://127.0.0.1:8080/v1/chat/completions \&lt;br /&gt;
  -H 'Content-Type: application/json' \&lt;br /&gt;
  -d '{&lt;br /&gt;
    &amp;quot;model&amp;quot;: &amp;quot;gpt-oss-20b&amp;quot;,&lt;br /&gt;
    &amp;quot;messages&amp;quot;: [&lt;br /&gt;
      {&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Write a small Go HTTP server with health check.&amp;quot;}&lt;br /&gt;
    ],&lt;br /&gt;
    &amp;quot;temperature&amp;quot;: 0.2&lt;br /&gt;
  }'&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If VRAM gets tight, reduce context:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-c 4096&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
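Or reduce GPU layers to a value below the model’s layer count (`gpt-oss-20b` has 24 layers, so `-ngl 60` would still offload everything):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
-ngl 20&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
On a 24 GB P40, though, most 20B-class quantized GGUFs should fit fully on the GPU at these context sizes.&lt;br /&gt;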
&lt;br /&gt;
## Run a Qwen coder model&lt;br /&gt;
&lt;br /&gt;
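For coding, Qwen is probably where I would spend most of my time. Check the chosen quant’s file size against the 24 GB of VRAM first; if it is larger, lower `-ngl` so part of the model stays on the CPU. Example:&lt;br /&gt;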
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-server \&lt;br /&gt;
  -hf unsloth/Qwen3-Coder-Next-GGUF \&lt;br /&gt;
  -ngl 999 \&lt;br /&gt;
  -c 8192 \&lt;br /&gt;
  --host 0.0.0.0 \&lt;br /&gt;
  --port 8080&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
The Unsloth Qwen GGUF page notes recent llama.cpp fixes for Qwen output/tool-calling behavior, so use a fresh llama.cpp build rather than an old distro package. ([Hugging Face][3])&lt;br /&gt;
&lt;br /&gt;
## Should you use Ollama instead?&lt;br /&gt;
&lt;br /&gt;
Ollama is easier, but `llama.cpp` gives you more direct control.&lt;br /&gt;
&lt;br /&gt;
Use **Ollama** if you want simple:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
ollama run hf.co/ggml-org/gpt-oss-20b-GGUF&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Use **llama.cpp directly** if you care about:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
GPU layer tuning&lt;br /&gt;
context size tuning&lt;br /&gt;
exact GGUF file selection&lt;br /&gt;
server flags&lt;br /&gt;
benchmarking&lt;br /&gt;
performance debugging&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Given how you work, I’d use **llama.cpp directly** first.&lt;br /&gt;
&lt;br /&gt;
## Should you use vLLM?&lt;br /&gt;
&lt;br /&gt;
For your **P40**, probably no — not as your first runtime.&lt;br /&gt;
&lt;br /&gt;
vLLM is excellent for newer GPUs and serving many users, but its documented GPU requirement is compute capability 7.0 or higher, and it benefits heavily from Tensor Cores / modern attention kernels. On a Pascal P40 (compute capability 6.1), `llama.cpp` is usually simpler and more forgiving.&lt;br /&gt;
&lt;br /&gt;
I would use vLLM when you have something like:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
RTX 3090&lt;br /&gt;
RTX 4090&lt;br /&gt;
RTX PRO 4500 Blackwell&lt;br /&gt;
A10/A40/A100/H100/L40S&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For the P40:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
llama.cpp &amp;gt; Ollama &amp;gt; exllamav2/vLLM experiments&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
## My practical recommendation&lt;br /&gt;
&lt;br /&gt;
Start with:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Runtime: llama.cpp&lt;br /&gt;
Driver: 580-server&lt;br /&gt;
Model format: GGUF&lt;br /&gt;
First model: gpt-oss-20b-GGUF&lt;br /&gt;
Coding model: Qwen coder GGUF&lt;br /&gt;
Context: 4096 or 8192 first&lt;br /&gt;
GPU layers: -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then benchmark:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
./build/bin/llama-bench -m /path/to/model.gguf -ngl 999&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For a single Tesla P40, **don’t chase the newest serving framework**. Use stable `580-server`, fresh `llama.cpp`, and GGUF models. That will give you the fewest headaches and the best compatibility.&lt;br /&gt;
&lt;br /&gt;
[1]: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF &amp;quot;ggml-org/gpt-oss-20b-GGUF&amp;quot;&lt;br /&gt;
[2]: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html &amp;quot;llama.cpp - Qwen - Read the Docs&amp;quot;&lt;br /&gt;
[3]: https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF &amp;quot;unsloth/Qwen3-Coder-Next-GGUF&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Tesla_P40&amp;diff=5685</id>
		<title>Tesla P40</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Tesla_P40&amp;diff=5685"/>
		<updated>2026-05-13T16:16:18Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot; You must reboot after update so in kernel   ```bash nvidia-srv/580.126.20, 6.8.0-111-generic, x86_64: installed ```  I would **stay on `580-server`** for the Tesla P40.  Do *...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
You must reboot after a driver update so the new module is loaded into the running kernel. `dkms status` should then report it as installed:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
nvidia-srv/580.126.20, 6.8.0-111-generic, x86_64: installed&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
I would **stay on `580-server`** for the Tesla P40.&lt;br /&gt;
&lt;br /&gt;
Do **not** move to `590` or `595` for this card unless you are testing on a disposable install. The P40 is Pascal, and the newer 590+ branch is where Pascal support becomes problematic/legacy. NVIDIA’s 580 data-center release notes still list Tesla P40 support, while 595 is a newer data-center branch focused on newer supported platforms and has newer compatibility requirements like DCGM 4.3.x+ for DCGM users. ([NVIDIA Docs][1])&lt;br /&gt;
&lt;br /&gt;
Your best option:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt install nvidia-driver-580-server nvidia-utils-580-server&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then pin/hold it so Ubuntu does not “helpfully” move you to 590/595 later:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt-mark hold nvidia-driver-580-server nvidia-utils-580-server&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Also check whether you have mixed normal and server 580 packages. You currently show:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
nvidia-utils-580/noble-updates 580.142...&lt;br /&gt;
nvidia-utils-580-server ... 580.126... [installed]&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
That is okay as long as only the `*-server` package is installed. Verify:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
dpkg -l | grep -E 'nvidia-driver|nvidia-utils|libnvidia|nvidia-dkms' | awk '{print $2, $3}'&lt;br /&gt;
```&lt;br /&gt;
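&lt;br /&gt;
The same check can be scripted so that mixed packages stand out. This is a rough sketch, not an official tool: `find_non_server_580` is a hypothetical helper that reads `name version` lines and prints any 580 package that is not a `-server` one, so empty output means nothing is mixed.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: read "name version" lines on stdin and print the
# names of 580-branch packages that are NOT the -server flavour.
find_non_server_580() {
  awk '$1 ~ /-580/ { if ($1 !~ /-580-server/) print $1 }'
}
```
&lt;br /&gt;
Pipe the output of the `dpkg -l ... | awk '{print $2, $3}'` command above into it.&lt;br /&gt;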
&lt;br /&gt;
For a Tesla P40 compute box, I would prefer this branch order:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Best:   580-server&lt;br /&gt;
Okay:   535-server / 570-server if 580 gives issues&lt;br /&gt;
Avoid:  590 / 595 for P40&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
For GPT-OSS/local LLM use, newer `590`/`595` is unlikely to give you anything meaningful on a P40. The bigger limitations come from the P40’s Pascal architecture: no Tensor Cores, no BF16, no FP8/MXFP4 acceleration. Stability matters more than having the newest driver here.&lt;br /&gt;
&lt;br /&gt;
[1]: https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-580-126-20/index.html &amp;quot;Version 580.126.20(Linux) :: NVIDIA Data Center GPU ...&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5684</id>
		<title>Codex cli</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5684"/>
		<updated>2026-05-07T23:08:50Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This AppArmor profile is a **targeted exception**. Instead of turning off security restrictions for the entire operating system, it grants a specific permission to only one program: `bubblewrap` (`bwrap`).&lt;br /&gt;
&lt;br /&gt;
### Breakdown of the Code&lt;br /&gt;
&lt;br /&gt;
* **`abi &amp;lt;abi/4.0&amp;gt;,`**: This tells AppArmor to use the feature set and syntax rules introduced in version 4.0.&lt;br /&gt;
* **`include &amp;lt;tunables/global&amp;gt;`**: This imports standard variables, such as common paths for user home directories or system libraries, so the profile understands where files are located.&lt;br /&gt;
* **`/usr/bin/bwrap`**: This line specifies that the following rules apply **only** to the `bubblewrap` executable.&lt;br /&gt;
* **`flags=(unconfined)`**: This is the most important part. It tells AppArmor **not** to restrict what files or network resources `bwrap` can access. It allows the program to run with its normal system permissions.&lt;br /&gt;
* **`userns,`**: This explicitly allows `bwrap` to create **unprivileged user namespaces**. This is the specific &amp;quot;key&amp;quot; required to fix the error you encountered.&lt;br /&gt;
* **`include if exists &amp;lt;local/usr.bin.bwrap&amp;gt;`**: This is a standard placeholder that allows you to add custom local rules without modifying the main profile.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### What does it do?&lt;br /&gt;
&lt;br /&gt;
By default, Ubuntu 24.04 and later block unconfined applications from creating unprivileged user namespaces, because these are frequently used in attacks to escalate privileges. However, `bubblewrap` **needs** these namespaces to create the sandbox that protects your system from the code Codex runs.&lt;br /&gt;
&lt;br /&gt;
This profile tells the operating system: *&amp;quot;I know unprivileged namespaces are generally risky, but I trust `/usr/bin/bwrap` specifically to use them.&amp;quot;*&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### The Security Risks&lt;br /&gt;
&lt;br /&gt;
#### 1. The &amp;quot;Hole&amp;quot; in the Wall&lt;br /&gt;
&lt;br /&gt;
By using `flags=(unconfined)`, you are essentially telling AppArmor to &amp;quot;stand down&amp;quot; regarding `bwrap`. If there were a security vulnerability inside the `bubblewrap` code itself, AppArmor would not be there to block it from accessing your private files or sensitive system areas.&lt;br /&gt;
&lt;br /&gt;
#### 2. Trusted Binaries&lt;br /&gt;
&lt;br /&gt;
The risk is considered low because `bubblewrap` is a small, heavily audited, and &amp;quot;security-first&amp;quot; piece of software. Its entire purpose is to provide isolation. Most Linux security experts consider it safer to allow `bwrap` to function (so it can sandbox other dangerous code) than to leave it broken.&lt;br /&gt;
&lt;br /&gt;
#### 3. Targeted vs. Global&lt;br /&gt;
&lt;br /&gt;
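This is significantly **safer** than the alternative of opening user namespaces globally, for example with `sudo sysctl -w kernel.unprivileged_userns_clone=1` (on Ubuntu 24.04 the AppArmor-level global switch is `kernel.apparmor_restrict_unprivileged_userns=0`).&lt;br /&gt;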
&lt;br /&gt;
* **The Global Method:** Allows *every* program on your computer (including a malicious script or a compromised web browser) to create namespaces.&lt;br /&gt;
* **The AppArmor Method:** Limits that power strictly to the `bwrap` utility.&lt;br /&gt;
&lt;br /&gt;
### Summary&lt;br /&gt;
&lt;br /&gt;
This configuration is the &amp;quot;middle ground&amp;quot; of security. It restores functionality to your development tools while keeping the rest of the operating system locked down. The risk is that you are placing absolute trust in the integrity of the `bwrap` developers.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# More&lt;br /&gt;
&lt;br /&gt;
This message appears because modern Linux distributions (especially **Ubuntu 23.10+ and 24.04**) have introduced security restrictions on **unprivileged user namespaces**, which `bubblewrap` (the engine behind Codex's sandbox) requires to function.&lt;br /&gt;
&lt;br /&gt;
### Fix for Ubuntu 24.04+ (AppArmor)&lt;br /&gt;
&lt;br /&gt;
Ubuntu now requires an AppArmor profile to allow unprivileged applications to create namespaces. You can create a profile specifically for `bubblewrap`:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo nano /etc/apparmor.d/usr.bin.bwrap&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Paste the following into the file:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
abi &amp;lt;abi/4.0&amp;gt;,&lt;br /&gt;
include &amp;lt;tunables/global&amp;gt;&lt;br /&gt;
&lt;br /&gt;
/usr/bin/bwrap flags=(unconfined) {&lt;br /&gt;
  userns,&lt;br /&gt;
  include if exists &amp;lt;local/usr.bin.bwrap&amp;gt;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Reload AppArmor:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo systemctl reload apparmor&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Alternative: Enable via Kernel Parameter&lt;br /&gt;
&lt;br /&gt;
If the AppArmor fix does not apply or you are on a different distribution (like Debian), you can enable unprivileged namespaces directly via `sysctl`.&lt;br /&gt;
&lt;br /&gt;
**Temporary (until reboot):**&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo sysctl -w kernel.unprivileged_userns_clone=1&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**Permanent:**&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
echo &amp;quot;kernel.unprivileged_userns_clone=1&amp;quot; | sudo tee /etc/sysctl.d/99-userns.conf&lt;br /&gt;
sudo sysctl -p /etc/sysctl.d/99-userns.conf&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
*Note: On some systems, you may also need to ensure `user.max_user_namespaces` is set to a high value:*&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo sysctl -w user.max_user_namespaces=15000&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Verification&lt;br /&gt;
&lt;br /&gt;
Once applied, you can verify that namespaces are working by running:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
unshare -U --map-root-user true &amp;amp;&amp;amp; echo &amp;quot;Success&amp;quot;&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If it prints &amp;quot;Success&amp;quot;, Codex should now be able to initialize its sandbox without the warning.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5683</id>
		<title>Codex cli</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5683"/>
		<updated>2026-05-07T23:08:05Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This AppArmor profile is a **targeted exception**. Instead of turning off security restrictions for the entire operating system, it grants a specific permission to only one program: `bubblewrap` (`bwrap`).&lt;br /&gt;
&lt;br /&gt;
### Breakdown of the Code&lt;br /&gt;
&lt;br /&gt;
* **`abi &amp;lt;abi/4.0&amp;gt;,`**: This tells AppArmor to use the feature set and syntax rules introduced in version 4.0.&lt;br /&gt;
* **`include &amp;lt;tunables/global&amp;gt;`**: This imports standard variables, such as common paths for user home directories or system libraries, so the profile understands where files are located.&lt;br /&gt;
* **`/usr/bin/bwrap`**: This line specifies that the following rules apply **only** to the `bubblewrap` executable.&lt;br /&gt;
* **`flags=(unconfined)`**: This is the most important part. It tells AppArmor **not** to restrict what files or network resources `bwrap` can access. It allows the program to run with its normal system permissions.&lt;br /&gt;
* **`userns,`**: This explicitly allows `bwrap` to create **unprivileged user namespaces**. This is the specific &amp;quot;key&amp;quot; required to fix the error you encountered.&lt;br /&gt;
* **`include if exists &amp;lt;local/usr.bin.bwrap&amp;gt;`**: This is a standard placeholder that allows you to add custom local rules without modifying the main profile.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### What does it do?&lt;br /&gt;
&lt;br /&gt;
By default, Ubuntu 24.04 and later block unconfined applications from creating unprivileged user namespaces, because these are frequently used in attacks to escalate privileges. However, `bubblewrap` **needs** these namespaces to create the sandbox that protects your system from the code Codex runs.&lt;br /&gt;
&lt;br /&gt;
This profile tells the operating system: *&amp;quot;I know unprivileged namespaces are generally risky, but I trust `/usr/bin/bwrap` specifically to use them.&amp;quot;*&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### The Security Risks&lt;br /&gt;
&lt;br /&gt;
#### 1. The &amp;quot;Hole&amp;quot; in the Wall&lt;br /&gt;
&lt;br /&gt;
By using `flags=(unconfined)`, you are essentially telling AppArmor to &amp;quot;stand down&amp;quot; regarding `bwrap`. If there were a security vulnerability inside the `bubblewrap` code itself, AppArmor would not be there to block it from accessing your private files or sensitive system areas.&lt;br /&gt;
&lt;br /&gt;
#### 2. Trusted Binaries&lt;br /&gt;
&lt;br /&gt;
The risk is considered low because `bubblewrap` is a small, heavily audited, and &amp;quot;security-first&amp;quot; piece of software. Its entire purpose is to provide isolation. Most Linux security experts consider it safer to allow `bwrap` to function (so it can sandbox other dangerous code) than to leave it broken.&lt;br /&gt;
&lt;br /&gt;
#### 3. Targeted vs. Global&lt;br /&gt;
&lt;br /&gt;
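This is significantly **safer** than the alternative of opening user namespaces globally, for example with `sudo sysctl -w kernel.unprivileged_userns_clone=1` (on Ubuntu 24.04 the AppArmor-level global switch is `kernel.apparmor_restrict_unprivileged_userns=0`).&lt;br /&gt;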
&lt;br /&gt;
* **The Global Method:** Allows *every* program on your computer (including a malicious script or a compromised web browser) to create namespaces.&lt;br /&gt;
* **The AppArmor Method:** Limits that power strictly to the `bwrap` utility.&lt;br /&gt;
&lt;br /&gt;
### Summary&lt;br /&gt;
&lt;br /&gt;
This configuration is the &amp;quot;middle ground&amp;quot; of security. It restores functionality to your development tools while keeping the rest of the operating system locked down. The risk is that you are placing absolute trust in the integrity of the `bwrap` developers.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5682</id>
		<title>Codex cli</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_cli&amp;diff=5682"/>
		<updated>2026-05-07T23:07:48Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;2. Trusted Binaries The risk is considered low because bubblewrap is a small, heavily audited, and &amp;quot;security-first&amp;quot; piece of software. Its entire purpose is to provide isolati...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;2. Trusted Binaries&lt;br /&gt;
The risk is considered low because bubblewrap is a small, heavily audited, and &amp;quot;security-first&amp;quot; piece of software. Its entire purpose is to provide isolation. Most Linux security experts consider it safer to allow bwrap to function (so it can sandbox other dangerous code) than to leave it broken.&lt;br /&gt;
&lt;br /&gt;
3. Targeted vs. Global&lt;br /&gt;
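This is significantly safer than the alternative of opening user namespaces globally, for example with sudo sysctl -w kernel.unprivileged_userns_clone=1 (on Ubuntu 24.04 the AppArmor-level global switch is kernel.apparmor_restrict_unprivileged_userns=0).&lt;br /&gt;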
&lt;br /&gt;
The Global Method: Allows every program on your computer (including a malicious script or a compromised web browser) to create namespaces.&lt;br /&gt;
&lt;br /&gt;
The AppArmor Method: Limits that power strictly to the bwrap utility.&lt;br /&gt;
&lt;br /&gt;
Summary&lt;br /&gt;
This configuration is the &amp;quot;middle ground&amp;quot; of security. It restores functionality to your development tools while keeping the rest of the operating system locked down. The risk is that you are placing absolute trust in the integrity of the bwrap developers.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Context_window&amp;diff=5681</id>
		<title>Context window</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Context_window&amp;diff=5681"/>
		<updated>2026-05-03T16:04:09Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Mostly, but not exactly.  Context window means the model’s total working space for one request/conversation turn:  context window = input tokens + output tokens  So if a mod...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Mostly, but not exactly.&lt;br /&gt;
&lt;br /&gt;
Context window means the model’s total working space for one request/conversation turn:&lt;br /&gt;
&lt;br /&gt;
context window = input tokens + output tokens&lt;br /&gt;
&lt;br /&gt;
So if a model has a 128K context window, that does not mean you can always send 128K input and still get a big answer. The output also has to fit.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&lt;br /&gt;
Model context window: 128K tokens&lt;br /&gt;
Your input:           100K tokens&lt;br /&gt;
Room left:             28K tokens for output&lt;br /&gt;
&lt;br /&gt;
But there may also be a separate max output token cap:&lt;br /&gt;
&lt;br /&gt;
Context window:      128K&lt;br /&gt;
Input tokens:        100K&lt;br /&gt;
Remaining room:       28K&lt;br /&gt;
Max output cap:       16K&lt;br /&gt;
&lt;br /&gt;
Actual max output:    16K&lt;br /&gt;
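&lt;br /&gt;
The rule above boils down to taking the smaller of the remaining room and the output cap. Here is a minimal sketch of that arithmetic, using plain token counts (128K written as 128000 for simplicity). `max_output_tokens` is a hypothetical helper, not part of any tool.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: largest output that fits, given a context window,
# an input size, and a separate output cap (all in tokens).
max_output_tokens() {
  local window=$1 input=$2 cap=$3
  local room=$(( window - input ))
  # The input alone fills or exceeds the window: nothing is left for output.
  if [ "$room" -le 0 ]; then
    echo 0
    return
  fi
  # The smaller of the remaining room and the output cap wins.
  if [ "$room" -lt "$cap" ]; then
    echo "$room"
  else
    echo "$cap"
  fi
}

max_output_tokens 128000 100000 16000   # prints 16000 (the cap is the limit)
max_output_tokens 128000 120000 16000   # prints 8000 (the room is the limit)
```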
&lt;br /&gt;
So the terms mean:&lt;br /&gt;
&lt;br /&gt;
| Term | Simple meaning |&lt;br /&gt;
| --- | --- |&lt;br /&gt;
| Input tokens | What you send in: prompt, code, files, logs, chat history |&lt;br /&gt;
| Output tokens | What the model writes back |&lt;br /&gt;
| Context | The total text the model can “see” while answering |&lt;br /&gt;
| Context window | The maximum token capacity for input + output |&lt;br /&gt;
| Token limit | Usually a general term; could mean context limit or output limit |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So the context window is not synonymous with input size. It is the total capacity that input and output share.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Input_tokens&amp;diff=5680</id>
		<title>Input tokens</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Input_tokens&amp;diff=5680"/>
		<updated>2026-05-03T16:01:36Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Yes. Input tokens are everything the model receives for that request, including:&lt;br /&gt;
&lt;br /&gt;
your question&lt;br /&gt;
system/developer instructions&lt;br /&gt;
chat history included in the context&lt;br /&gt;
file contents pasted or attached&lt;br /&gt;
repo/folder code the tool chooses to include&lt;br /&gt;
error logs&lt;br /&gt;
terminal output&lt;br /&gt;
tool results&lt;br /&gt;
&lt;br /&gt;
So if you say:&lt;br /&gt;
&lt;br /&gt;
Look at my whole repo and find the bug&lt;br /&gt;
&lt;br /&gt;
the input may include a lot of files, summaries, search results, and code chunks.&lt;br /&gt;
&lt;br /&gt;
If you say:&lt;br /&gt;
&lt;br /&gt;
Only look at backend/auth/middleware.go and backend/db/users.sql&lt;br /&gt;
&lt;br /&gt;
then the input is much smaller and usually more accurate.&lt;br /&gt;
&lt;br /&gt;
Best way to reduce tokens&lt;br /&gt;
&lt;br /&gt;
Be specific about scope:&lt;br /&gt;
&lt;br /&gt;
Look only at this file.&lt;br /&gt;
&lt;br /&gt;
or:&lt;br /&gt;
&lt;br /&gt;
Look only in cmd/server and internal/auth.&lt;br /&gt;
&lt;br /&gt;
or:&lt;br /&gt;
&lt;br /&gt;
Do not scan node_modules, dist, vendor, .git, build, coverage, generated files, or lock files.&lt;br /&gt;
&lt;br /&gt;
For repo work, this is usually best:&lt;br /&gt;
&lt;br /&gt;
Here is the error. First inspect only the files likely involved. Ask before expanding scope.&lt;br /&gt;
&lt;br /&gt;
Even better, give:&lt;br /&gt;
&lt;br /&gt;
the exact error&lt;br /&gt;
the command you ran&lt;br /&gt;
the file/function involved&lt;br /&gt;
the expected behavior&lt;br /&gt;
the actual behavior&lt;br /&gt;
&lt;br /&gt;
Simple rule&lt;br /&gt;
&lt;br /&gt;
Bigger scope = more input tokens = more cost/slower/more noise&lt;br /&gt;
Smaller scope = fewer input tokens = faster/cheaper/often better&lt;br /&gt;
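&lt;br /&gt;
To get a feel for how much a given scope costs before handing it to a tool, its size can be measured roughly. This is only a sketch: `estimate_tokens` is a hypothetical helper, and the figure of about 4 bytes per token is a rule-of-thumb assumption, not a real tokenizer.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: rough token estimate for a file or directory.
# ASSUMPTION: ~4 bytes per token; real tokenizers will give other numbers.
estimate_tokens() {
  local bytes
  # Sum the bytes of all regular files, skipping commonly excluded trees.
  bytes=$(find "$1" -type f \
    ! -path '*/node_modules/*' ! -path '*/.git/*' \
    ! -path '*/dist/*' ! -path '*/vendor/*' \
    -exec cat {} + | wc -c)
  echo $(( bytes / 4 ))
}
```
&lt;br /&gt;
Comparing `estimate_tokens .` with `estimate_tokens internal/auth` gives a quick sense of how much a narrower scope saves.&lt;br /&gt;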
&lt;br /&gt;
So yes: asking it to inspect a specific file, directory, function, or error path is the right way to keep tokens down.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Input_tokens&amp;diff=5679</id>
		<title>Input tokens</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Input_tokens&amp;diff=5679"/>
		<updated>2026-05-03T16:00:10Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;your question system/developer instructions chat history included in the context file contents pasted or attached repo/folder code the tool chooses to include error logs termi...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;your question&lt;br /&gt;
system/developer instructions&lt;br /&gt;
chat history included in the context&lt;br /&gt;
file contents pasted or attached&lt;br /&gt;
repo/folder code the tool chooses to include&lt;br /&gt;
error logs&lt;br /&gt;
terminal output&lt;br /&gt;
tool results&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_Usage_notes&amp;diff=5678</id>
		<title>Codex Usage notes</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_Usage_notes&amp;diff=5678"/>
		<updated>2026-04-24T15:52:00Z</updated>

		<summary type="html">&lt;p&gt;Busk: Busk moved page Usage notes to Codex Usage notes without leaving a redirect
&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are compact prompt templates that usually keep Codex usage down while still getting good output.&lt;br /&gt;
&lt;br /&gt;
**The pattern that saves the most usage**&lt;br /&gt;
Give it:&lt;br /&gt;
&lt;br /&gt;
1. one narrow goal,&lt;br /&gt;
2. one bounded file or function,&lt;br /&gt;
3. one explicit output format,&lt;br /&gt;
4. one stop condition.&lt;br /&gt;
&lt;br /&gt;
That lines up with OpenAI’s current guidance that Codex usage depends mainly on task size, complexity, context held, and where tasks are executed. OpenAI also notes that Codex pricing for many plans is now token-based, which makes oversized prompts and large context especially relevant. ([OpenAI Help Center][1])&lt;br /&gt;
&lt;br /&gt;
### Low-usage templates&lt;br /&gt;
&lt;br /&gt;
**1) Single-function bug fix**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect `main.go`, function `handleUpload`.&lt;br /&gt;
Find the bug causing the failure.&lt;br /&gt;
Return:&lt;br /&gt;
1. root cause in 3 bullets max&lt;br /&gt;
2. minimal unified diff&lt;br /&gt;
Do not scan other files.&lt;br /&gt;
Do not refactor unrelated code.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**2) Small targeted refactor**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only modify `internal/auth/middleware.go`.&lt;br /&gt;
Goal: reduce duplication in token parsing without changing behavior.&lt;br /&gt;
Constraints:&lt;br /&gt;
- keep exported APIs unchanged&lt;br /&gt;
- no new dependencies&lt;br /&gt;
Return only a unified diff.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**3) Error log triage**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Analyze only this error and the code I pasted below.&lt;br /&gt;
Do not assume repo-wide context.&lt;br /&gt;
&lt;br /&gt;
Error:&lt;br /&gt;
&amp;lt;paste error&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Code:&lt;br /&gt;
&amp;lt;paste small relevant snippet&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Return:&lt;br /&gt;
- most likely cause&lt;br /&gt;
- second most likely cause&lt;br /&gt;
- exact patch to try first&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**4) Focused code review**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review only `storage.go` for:&lt;br /&gt;
- race conditions&lt;br /&gt;
- nil dereferences&lt;br /&gt;
- leaked resources&lt;br /&gt;
&lt;br /&gt;
Do not suggest style changes.&lt;br /&gt;
Rank findings by severity.&lt;br /&gt;
Limit to top 5 issues.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**5) Test generation without repo crawl**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Write table-driven tests for `ParseConfig` in `config.go`.&lt;br /&gt;
Assume no other files unless referenced here.&lt;br /&gt;
Return a complete `_test.go` file only.&lt;br /&gt;
Keep cases minimal but high value.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**6) Safe optimization pass**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Inspect only this function for performance issues:&lt;br /&gt;
&amp;lt;paste function&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Constraints:&lt;br /&gt;
- preserve behavior&lt;br /&gt;
- prefer simpler code over clever code&lt;br /&gt;
- no concurrency changes&lt;br /&gt;
Return:&lt;br /&gt;
1. brief explanation&lt;br /&gt;
2. revised function only&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**7) CLI command help**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Create a Cobra subcommand named `serve-certs`.&lt;br /&gt;
Only produce:&lt;br /&gt;
- command struct/function&lt;br /&gt;
- flags&lt;br /&gt;
- RunE body stub&lt;br /&gt;
&lt;br /&gt;
Do not implement unrelated package wiring.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**8) SQL / migration help**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review this migration only.&lt;br /&gt;
Check for:&lt;br /&gt;
- invalid PostgreSQL syntax&lt;br /&gt;
- unsafe defaults&lt;br /&gt;
- ordering issues&lt;br /&gt;
- rollback concerns&lt;br /&gt;
&lt;br /&gt;
Return only concrete problems and corrected SQL.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**9) “Do not roam” repo instruction**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Work only in these files:&lt;br /&gt;
- cmd/app/main.go&lt;br /&gt;
- internal/config/config.go&lt;br /&gt;
&lt;br /&gt;
Ignore the rest of the repository unless I explicitly add files later.&lt;br /&gt;
If you think another file is needed, name it but do not open it.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**10) Patch-first mode**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
I want the smallest fix that works.&lt;br /&gt;
Do not redesign.&lt;br /&gt;
Do not rename symbols.&lt;br /&gt;
Do not move files.&lt;br /&gt;
Return only the patch.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Good add-ons that reduce waste&lt;br /&gt;
&lt;br /&gt;
Use these as suffixes when needed:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Keep the answer under 200 lines.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Stop after the first good fix.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Ask for no follow-up unless a missing type/signature blocks the patch.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Do not include explanation unless the patch is non-obvious.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Prefer editing existing code over introducing abstractions.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Best workflow for lowest usage&lt;br /&gt;
&lt;br /&gt;
For routine work, this tends to be the most efficient:&lt;br /&gt;
&lt;br /&gt;
* Start with **local Codex in CLI or VS Code**.&lt;br /&gt;
* Use **GPT-5-Codex-Mini** when it is good enough, since OpenAI says it is a smaller, more cost-effective option that can provide up to 4x more usage in the subscription. ([OpenAI Help Center][2])&lt;br /&gt;
* Limit work to **one file / one function / one bug**.&lt;br /&gt;
* Ask for a **diff only** whenever possible.&lt;br /&gt;
* Start a **fresh session** once context gets bloated.&lt;br /&gt;
* Use **cloud / long-running tasks only for multi-file or project-scale work**, because OpenAI says larger codebases, long-running tasks, and sessions that require more held context use significantly more. ([OpenAI Help Center][1])&lt;br /&gt;
&lt;br /&gt;
### What to avoid&lt;br /&gt;
&lt;br /&gt;
These tend to increase usage fast:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review my whole repo and improve everything.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Find all bugs, refactor, add tests, and optimize performance.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Here are 5,000 lines of logs, what happened?&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Keep trying different fixes until all tests pass.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
OpenAI’s current notes also indicate Plus is tuned more for steady day-to-day use, while Pro is aimed at longer, higher-intensity Codex sessions. ([OpenAI Help Center][3])&lt;br /&gt;
&lt;br /&gt;
Here’s a reusable “best default” template:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect &amp;lt;file_or_function&amp;gt;.&lt;br /&gt;
Goal: &amp;lt;single goal&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Constraints:&lt;br /&gt;
- no unrelated refactors&lt;br /&gt;
- no new dependencies&lt;br /&gt;
- keep public behavior unchanged&lt;br /&gt;
- do not inspect other files&lt;br /&gt;
&lt;br /&gt;
Return:&lt;br /&gt;
1. brief root cause&lt;br /&gt;
2. minimal unified diff&lt;br /&gt;
3. one sentence on risk&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
And here’s the lowest-usage version:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect &amp;lt;file&amp;gt;.&lt;br /&gt;
Fix &amp;lt;specific bug&amp;gt;.&lt;br /&gt;
Return only a minimal unified diff.&lt;br /&gt;
Do not explain.&lt;br /&gt;
Do not inspect other files.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
[1]: https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan &amp;quot;Using Codex with your ChatGPT plan&amp;quot;&lt;br /&gt;
[2]: https://help.openai.com/en/articles/9624314-model-release-notes &amp;quot;Model Release Notes | OpenAI Help Center&amp;quot;&lt;br /&gt;
[3]: https://help.openai.com/en/articles/6825453-chatgpt-release-notes &amp;quot;ChatGPT — Release Notes&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_Usage_notes&amp;diff=5677</id>
		<title>Codex Usage notes</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_Usage_notes&amp;diff=5677"/>
		<updated>2026-04-23T16:58:34Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Here are compact prompt templates that usually keep Codex usage down while still getting good output.  **The pattern that saves the most usage** Give it:  1. one narrow goal,...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are compact prompt templates that usually keep Codex usage down while still getting good output.&lt;br /&gt;
&lt;br /&gt;
**The pattern that saves the most usage**&lt;br /&gt;
Give it:&lt;br /&gt;
&lt;br /&gt;
1. one narrow goal,&lt;br /&gt;
2. one bounded file or function,&lt;br /&gt;
3. one explicit output format,&lt;br /&gt;
4. one stop condition.&lt;br /&gt;
&lt;br /&gt;
That lines up with OpenAI’s current guidance that Codex usage depends mainly on task size, complexity, context held, and where tasks are executed. OpenAI also notes that Codex pricing for many plans is now token-based, which makes oversized prompts and large context especially relevant. ([OpenAI Help Center][1])&lt;br /&gt;
&lt;br /&gt;
### Low-usage templates&lt;br /&gt;
&lt;br /&gt;
**1) Single-function bug fix**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect `main.go`, function `handleUpload`.&lt;br /&gt;
Find the bug causing the failure.&lt;br /&gt;
Return:&lt;br /&gt;
1. root cause in 3 bullets max&lt;br /&gt;
2. minimal unified diff&lt;br /&gt;
Do not scan other files.&lt;br /&gt;
Do not refactor unrelated code.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**2) Small targeted refactor**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only modify `internal/auth/middleware.go`.&lt;br /&gt;
Goal: reduce duplication in token parsing without changing behavior.&lt;br /&gt;
Constraints:&lt;br /&gt;
- keep exported APIs unchanged&lt;br /&gt;
- no new dependencies&lt;br /&gt;
Return only a unified diff.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**3) Error log triage**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Analyze only this error and the code I pasted below.&lt;br /&gt;
Do not assume repo-wide context.&lt;br /&gt;
&lt;br /&gt;
Error:&lt;br /&gt;
&amp;lt;paste error&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Code:&lt;br /&gt;
&amp;lt;paste small relevant snippet&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Return:&lt;br /&gt;
- most likely cause&lt;br /&gt;
- second most likely cause&lt;br /&gt;
- exact patch to try first&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**4) Focused code review**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review only `storage.go` for:&lt;br /&gt;
- race conditions&lt;br /&gt;
- nil dereferences&lt;br /&gt;
- leaked resources&lt;br /&gt;
&lt;br /&gt;
Do not suggest style changes.&lt;br /&gt;
Rank findings by severity.&lt;br /&gt;
Limit to top 5 issues.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**5) Test generation without repo crawl**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Write table-driven tests for `ParseConfig` in `config.go`.&lt;br /&gt;
Assume no other files unless referenced here.&lt;br /&gt;
Return a complete `_test.go` file only.&lt;br /&gt;
Keep cases minimal but high value.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**6) Safe optimization pass**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Inspect only this function for performance issues:&lt;br /&gt;
&amp;lt;paste function&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Constraints:&lt;br /&gt;
- preserve behavior&lt;br /&gt;
- prefer simpler code over clever code&lt;br /&gt;
- no concurrency changes&lt;br /&gt;
Return:&lt;br /&gt;
1. brief explanation&lt;br /&gt;
2. revised function only&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**7) CLI command help**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Create a Cobra subcommand named `serve-certs`.&lt;br /&gt;
Only produce:&lt;br /&gt;
- command struct/function&lt;br /&gt;
- flags&lt;br /&gt;
- RunE body stub&lt;br /&gt;
&lt;br /&gt;
Do not implement unrelated package wiring.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**8) SQL / migration help**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review this migration only.&lt;br /&gt;
Check for:&lt;br /&gt;
- invalid PostgreSQL syntax&lt;br /&gt;
- unsafe defaults&lt;br /&gt;
- ordering issues&lt;br /&gt;
- rollback concerns&lt;br /&gt;
&lt;br /&gt;
Return only concrete problems and corrected SQL.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**9) “Do not roam” repo instruction**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Work only in these files:&lt;br /&gt;
- cmd/app/main.go&lt;br /&gt;
- internal/config/config.go&lt;br /&gt;
&lt;br /&gt;
Ignore the rest of the repository unless I explicitly add files later.&lt;br /&gt;
If you think another file is needed, name it but do not open it.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**10) Patch-first mode**&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
I want the smallest fix that works.&lt;br /&gt;
Do not redesign.&lt;br /&gt;
Do not rename symbols.&lt;br /&gt;
Do not move files.&lt;br /&gt;
Return only the patch.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Good add-ons that reduce waste&lt;br /&gt;
&lt;br /&gt;
Use these as suffixes when needed:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Keep the answer under 200 lines.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Stop after the first good fix.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Ask for no follow-up unless a missing type/signature blocks the patch.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Do not include explanation unless the patch is non-obvious.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Prefer editing existing code over introducing abstractions.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Best workflow for lowest usage&lt;br /&gt;
&lt;br /&gt;
For routine work, this tends to be the most efficient:&lt;br /&gt;
&lt;br /&gt;
* Start with **local Codex in CLI or VS Code**.&lt;br /&gt;
* Use **GPT-5-Codex-Mini** when it is good enough, since OpenAI says it is a smaller, more cost-effective option that can provide up to 4x more usage in the subscription. ([OpenAI Help Center][2])&lt;br /&gt;
* Limit work to **one file / one function / one bug**.&lt;br /&gt;
* Ask for a **diff only** whenever possible.&lt;br /&gt;
* Start a **fresh session** once context gets bloated.&lt;br /&gt;
* Use **cloud / long-running tasks only for multi-file or project-scale work**, because OpenAI says larger codebases, long-running tasks, and sessions that require more held context use significantly more. ([OpenAI Help Center][1])&lt;br /&gt;
&lt;br /&gt;
### What to avoid&lt;br /&gt;
&lt;br /&gt;
These tend to increase usage fast:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Review my whole repo and improve everything.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Find all bugs, refactor, add tests, and optimize performance.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Here are 5,000 lines of logs, what happened?&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Keep trying different fixes until all tests pass.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
OpenAI’s current notes also indicate Plus is tuned more for steady day-to-day use, while Pro is aimed at longer, higher-intensity Codex sessions. ([OpenAI Help Center][3])&lt;br /&gt;
&lt;br /&gt;
Here’s a reusable “best default” template:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect &amp;lt;file_or_function&amp;gt;.&lt;br /&gt;
Goal: &amp;lt;single goal&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Constraints:&lt;br /&gt;
- no unrelated refactors&lt;br /&gt;
- no new dependencies&lt;br /&gt;
- keep public behavior unchanged&lt;br /&gt;
- do not inspect other files&lt;br /&gt;
&lt;br /&gt;
Return:&lt;br /&gt;
1. brief root cause&lt;br /&gt;
2. minimal unified diff&lt;br /&gt;
3. one sentence on risk&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
And here’s the lowest-usage version:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Only inspect &amp;lt;file&amp;gt;.&lt;br /&gt;
Fix &amp;lt;specific bug&amp;gt;.&lt;br /&gt;
Return only a minimal unified diff.&lt;br /&gt;
Do not explain.&lt;br /&gt;
Do not inspect other files.&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
[1]: https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan?utm_source=chatgpt.com &amp;quot;Using Codex with your ChatGPT plan&amp;quot;&lt;br /&gt;
[2]: https://help.openai.com/en/articles/9624314-model-release-notes?utm_source=chatgpt.com &amp;quot;Model Release Notes | OpenAI Help Center&amp;quot;&lt;br /&gt;
[3]: https://help.openai.com/en/articles/6825453-chatgpt-release-notes?utm_source=chatgpt.com &amp;quot;ChatGPT — Release Notes&amp;quot;&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Codex_gemini_install_cli&amp;diff=5676</id>
		<title>Codex gemini install cli</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Codex_gemini_install_cli&amp;diff=5676"/>
		<updated>2026-04-23T02:06:05Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;# Debian/Ubuntu  ``` sudo apt-get remove -y nodejs npm curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash - sudo apt-get install -y nodejs sudo npm install -g np...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Debian/Ubuntu&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
sudo apt-get remove -y nodejs npm&lt;br /&gt;
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -&lt;br /&gt;
sudo apt-get install -y nodejs&lt;br /&gt;
sudo npm install -g npm@latest&lt;br /&gt;
sudo npm install -g @openai/codex&lt;br /&gt;
sudo npm install -g @google/gemini-cli&lt;br /&gt;
codex --version&lt;br /&gt;
gemini --version&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Resize_lxd&amp;diff=5675</id>
		<title>Resize lxd</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Resize_lxd&amp;diff=5675"/>
		<updated>2026-04-17T23:42:38Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;# 1. Resize via LXD ONLY (don’t touch zfs manually) lxc config device set docker0 root size=300GB  # 2. Restart VM (fast + clean) lxc restart docker0  # 3. Inside VM growpar...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# 1. Resize via LXD ONLY (don’t touch zfs manually)&lt;br /&gt;
lxc config device set docker0 root size=300GB&lt;br /&gt;
&lt;br /&gt;
# 2. Restart VM (fast + clean)&lt;br /&gt;
lxc restart docker0&lt;br /&gt;
&lt;br /&gt;
# 3. Inside VM&lt;br /&gt;
growpart /dev/sda 1&lt;br /&gt;
resize2fs /dev/sda1&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Dokur_windows&amp;diff=5674</id>
		<title>Dokur windows</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Dokur_windows&amp;diff=5674"/>
		<updated>2026-04-17T22:45:12Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
lscpu | grep -E &amp;quot;Virtualization|VT-x|AMD-V&amp;quot;&lt;br /&gt;
&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils cpu-checker&lt;br /&gt;
&lt;br /&gt;
kvm-ok&lt;br /&gt;
&lt;br /&gt;
sudo usermod -aG kvm $USER&lt;br /&gt;
sudo usermod -aG libvirt $USER&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
services:&lt;br /&gt;
  windows:&lt;br /&gt;
    image: dockur/windows&lt;br /&gt;
    container_name: windows&lt;br /&gt;
    devices:&lt;br /&gt;
      - /dev/kvm&lt;br /&gt;
    cap_add:&lt;br /&gt;
      - NET_ADMIN&lt;br /&gt;
    ports:&lt;br /&gt;
      - 8006:8006&lt;br /&gt;
      - 3389:3389/tcp&lt;br /&gt;
      - 3389:3389/udp&lt;br /&gt;
    stop_grace_period: 2m&lt;br /&gt;
    restart: on-failure&lt;br /&gt;
    environment:&lt;br /&gt;
      VERSION: &amp;quot;win11&amp;quot;&lt;br /&gt;
      RAM_SIZE: &amp;quot;8G&amp;quot;&lt;br /&gt;
      CPU_CORES: &amp;quot;4&amp;quot;&lt;br /&gt;
      DISK_SIZE: &amp;quot;64G&amp;quot;&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /opt/windows:/storage&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
sudo mkdir -p /opt/windows&lt;br /&gt;
docker compose up -d&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Dokur_windows&amp;diff=5673</id>
		<title>Dokur windows</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Dokur_windows&amp;diff=5673"/>
		<updated>2026-04-17T22:43:14Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;``` lscpu | grep -E &amp;quot;Virtualization|VT-x|AMD-V&amp;quot;  sudo apt update sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils cpu-checker  kvm-ok  sudo usermod...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
lscpu | grep -E &amp;quot;Virtualization|VT-x|AMD-V&amp;quot;&lt;br /&gt;
&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils cpu-checker&lt;br /&gt;
&lt;br /&gt;
kvm-ok&lt;br /&gt;
&lt;br /&gt;
sudo usermod -aG kvm $USER&lt;br /&gt;
sudo usermod -aG libvirt $USER&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Frr_commercial&amp;diff=5672</id>
		<title>Frr commercial</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Frr_commercial&amp;diff=5672"/>
		<updated>2026-04-11T17:47:35Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;SONiC: The King of the Cloud Data Center&lt;br /&gt;
Originally created by Microsoft to run Azure, SONiC (Software for Open Networking in the Cloud) is the undisputed heavyweight champion of the modern data center.&lt;br /&gt;
&lt;br /&gt;
Architecture: SONiC is built around containers and a centralized Redis database. Each major component (BGP, LLDP, SNMP) runs in its own Docker container. If the BGP container crashes, the switch keeps forwarding packets because the routes are already programmed into the switching ASIC, and the state kept in Redis lets the restarted container resynchronize.&lt;br /&gt;
&lt;br /&gt;
The &amp;quot;SAI&amp;quot; Advantage: SONiC's superpower is the Switch Abstraction Interface (SAI). SAI is an API that allows SONiC to talk to almost any vendor's switching silicon (Broadcom, Mellanox, Cisco, etc.) without changing the core OS.&lt;br /&gt;
&lt;br /&gt;
Best For: Massive Scale-Out Leaf-Spine networks, Kubernetes environments, and cloud providers. If you are building a data center fabric to support thousands of servers and need deep telemetry and automation, SONiC is the industry standard.&lt;br /&gt;
&lt;br /&gt;
DANOS: The Carrier Edge Workhorse&lt;br /&gt;
Created by AT&amp;amp;T (based on their acquisition of Vyatta), DANOS (Disaggregated Network Operating System) was built specifically for the telecom edge.&lt;br /&gt;
&lt;br /&gt;
Architecture: Unlike SONiC’s database-centric model, DANOS is built around high-performance packet processing in software, heavily utilizing DPDK (Data Plane Development Kit). It is designed to handle complex routing features that simple data center switches usually struggle with.&lt;br /&gt;
&lt;br /&gt;
Telecom Features: Data centers usually just need simple IP routing. Telecoms need deep, complex protocols: MPLS, L2VPN/L3VPN, Carrier-Grade NAT, Hierarchical QoS (throttling specific types of traffic), and cell-tower timing protocols (PTP). DANOS excels here.&lt;br /&gt;
&lt;br /&gt;
Best For: Cell tower aggregation routers, Broadband Network Gateways (the router your home ISP uses to authenticate your modem), and provider edge routers.&lt;br /&gt;
&lt;br /&gt;
# XDP&lt;br /&gt;
&lt;br /&gt;
There is no direct XDP equivalent to DANOS—meaning there isn't a single, monolithic &amp;quot;install this ISO and get a Cisco-like CLI&amp;quot; project that runs pure XDP under the hood. &lt;br /&gt;
&lt;br /&gt;
Because XDP (eXpress Data Path) and eBPF are essentially ways to run small, verifier-checked programs (usually written in restricted C) inside the Linux kernel, with XDP hooking in at the network driver level, the ecosystem is built more like a toolkit than a finished consumer appliance. &lt;br /&gt;
&lt;br /&gt;
However, there are major open-source projects using XDP to build insanely fast routers and load balancers. Here are the biggest ones you should know about, especially given your work with Kubernetes.&lt;br /&gt;
&lt;br /&gt;
### 1. Cilium (The K8s Heavyweight)&lt;br /&gt;
If you are looking for an open-source, production-ready XDP router, **Cilium** is the undisputed king right now. &lt;br /&gt;
&lt;br /&gt;
While it is primarily known as a Kubernetes CNI (Container Network Interface), it is fundamentally an eBPF/XDP-based distributed router and firewall. &lt;br /&gt;
* **How it works:** Cilium can replace `kube-proxy` entirely. It attaches eBPF programs to your host's network interfaces, and with XDP acceleration enabled, a packet destined for a Kubernetes service is load-balanced at the driver level before the standard Linux networking stack ever processes it.&lt;br /&gt;
* **BGP Integration:** Cilium now has native BGP support built right in. It can establish peering sessions (just like MetalLB and FRR do) to advertise K8s services, but it handles the actual packet forwarding via eBPF/XDP for massive performance gains. &lt;br /&gt;
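&lt;br /&gt;
As a rough sketch of the settings involved, the Helm command below turns on kube-proxy replacement, XDP acceleration for the load balancer, and the BGP control plane. The value names come from recent Cilium Helm charts and are an assumption here; they have changed between releases, so check them against the chart version you actually deploy. The release name and namespace are placeholders.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Assumes the cilium Helm repository has already been added as cilium&lt;br /&gt;
# kubeProxyReplacement=true: older charts used the value strict instead&lt;br /&gt;
# loadBalancer.acceleration=native: needs a NIC driver with native XDP support&lt;br /&gt;
helm upgrade --install cilium cilium/cilium \&lt;br /&gt;
  --namespace kube-system \&lt;br /&gt;
  --set kubeProxyReplacement=true \&lt;br /&gt;
  --set loadBalancer.acceleration=native \&lt;br /&gt;
  --set bgpControlPlane.enabled=true&lt;br /&gt;
```&lt;br /&gt;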
&lt;br /&gt;
### 2. Katran (Meta's Layer 4 Load Balancer)&lt;br /&gt;
Open-sourced by Meta (Facebook), **Katran** is not a full-featured BGP router, but it is the most famous XDP project in existence. &lt;br /&gt;
* Meta uses Katran to load-balance traffic to their data centers. It uses XDP to process millions of packets per second on standard Linux servers, relying on consistent hashing so that load balancer instances do not need to share connection state.&lt;br /&gt;
* If you wanted to build a high-speed VIP (Virtual IP) director without Kubernetes, Katran is the reference architecture.&lt;br /&gt;
&lt;br /&gt;
### 3. Polycube (The Network Function Framework)&lt;br /&gt;
Backed by the Linux Foundation, **Polycube** is probably the closest thing to a &amp;quot;general purpose&amp;quot; XDP network OS. &lt;br /&gt;
* It is a framework that provides ready-to-use network functions (like routers, firewalls, NAT, and bridges) built entirely on eBPF and XDP.&lt;br /&gt;
* You can run a daemon on a Linux server and use the Polycube CLI (or REST API) to spin up a virtual XDP-based router, attach physical interfaces to it, and define routing tables. &lt;br /&gt;
&lt;br /&gt;
### How people actually build XDP Routers (FRR + XDP)&lt;br /&gt;
If a company wants to build an edge router using XDP today, they don't usually look for a pre-built project. They build a hybrid, using the exact tools you are already using.&lt;br /&gt;
&lt;br /&gt;
They split the router into two planes:&lt;br /&gt;
1.  **The Control Plane (FRR):** They run FRR to handle the complex BGP math. &lt;br /&gt;
2.  **The Data Plane (XDP):** They write a small XDP C program attached to the network card. Its only job is to look at a packet's destination IP, look up the next hop (egress port and MAC address) in an eBPF map (a high-speed in-kernel key-value table), and send the packet out the right port.&lt;br /&gt;
&lt;br /&gt;
**The Glue:** FRR has a feature called the **FPM (Forwarding Plane Manager)**. Instead of FRR pushing routes into the standard Linux routing table via Zebra, you configure Zebra to stream those routes to a custom daemon via FPM. That daemon translates the FRR routes and injects them directly into the XDP program's eBPF Map. &lt;br /&gt;
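&lt;br /&gt;
A minimal sketch of the FRR side of that glue, assuming a Debian/Ubuntu FRR package, the `dplane_fpm_nl` zebra module, and a custom listener on 127.0.0.1:2620 (the address is a placeholder for wherever your daemon runs). The daemon that translates routes into eBPF map entries is not shown and would be your own code.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# In /etc/frr/daemons, load the FPM dataplane module into zebra, for example:&lt;br /&gt;
#   zebra_options=&amp;quot;  -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl&amp;quot;&lt;br /&gt;
sudo systemctl restart frr&lt;br /&gt;
&lt;br /&gt;
# Point zebra at the custom FPM listener (hypothetical address and port)&lt;br /&gt;
sudo vtysh -c 'configure terminal' -c 'fpm address 127.0.0.1 port 2620'&lt;br /&gt;
```&lt;br /&gt;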
&lt;br /&gt;
### The Bottom Line&lt;br /&gt;
If you want to play with XDP routing in a lab or a cluster, drop MetalLB and standard `kube-proxy`, and deploy **Cilium** with kube-proxy replacement and XDP acceleration enabled. It will give you a hands-on look at how much faster an eBPF dataplane can be than standard Linux kernel forwarding.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Frr_commercial&amp;diff=5671</id>
		<title>Frr commercial</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Frr_commercial&amp;diff=5671"/>
		<updated>2026-04-11T17:27:50Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;SONiC: The King of the Cloud Data Center Originally created by Microsoft to run Azure, SONiC (Software for Open Networking in the Cloud) is the undisputed heavyweight champion...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;SONiC: The King of the Cloud Data Center&lt;br /&gt;
Originally created by Microsoft to run Azure, SONiC (Software for Open Networking in the Cloud) is the undisputed heavyweight champion of the modern data center.&lt;br /&gt;
&lt;br /&gt;
Architecture: SONiC is built around containers and a centralized Redis database. Each major component (BGP, LLDP, SNMP) runs in its own Docker container. If the BGP container crashes, the switch keeps forwarding packets because the routes are already programmed into the switching ASIC, and the state kept in Redis lets the restarted container resynchronize.&lt;br /&gt;
&lt;br /&gt;
The &amp;quot;SAI&amp;quot; Advantage: SONiC's superpower is the Switch Abstraction Interface (SAI). SAI is an API that allows SONiC to talk to almost any vendor's switching silicon (Broadcom, Mellanox, Cisco, etc.) without changing the core OS.&lt;br /&gt;
&lt;br /&gt;
Best For: Massive Scale-Out Leaf-Spine networks, Kubernetes environments, and cloud providers. If you are building a data center fabric to support thousands of servers and need deep telemetry and automation, SONiC is the industry standard.&lt;br /&gt;
&lt;br /&gt;
DANOS: The Carrier Edge Workhorse&lt;br /&gt;
Created by AT&amp;amp;T (based on their acquisition of Vyatta), DANOS (Disaggregated Network Operating System) was built specifically for the telecom edge.&lt;br /&gt;
&lt;br /&gt;
Architecture: Unlike SONiC’s database-centric model, DANOS is built around high-performance packet processing in software, heavily utilizing DPDK (Data Plane Development Kit). It is designed to handle complex routing features that simple data center switches usually struggle with.&lt;br /&gt;
&lt;br /&gt;
Telecom Features: Data centers usually just need simple IP routing. Telecoms need deep, complex protocols: MPLS, L2VPN/L3VPN, Carrier-Grade NAT, Hierarchical QoS (throttling specific types of traffic), and cell-tower timing protocols (PTP). DANOS excels here.&lt;br /&gt;
&lt;br /&gt;
Best For: Cell tower aggregation routers, Broadband Network Gateways (the router your home ISP uses to authenticate your modem), and provider edge routers.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Git_signing&amp;diff=5670</id>
		<title>Git signing</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Git_signing&amp;diff=5670"/>
		<updated>2026-03-31T17:44:52Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;```&lt;br /&gt;
git config --global gpg.program gpg&lt;br /&gt;
git config --global gpg.format openpgp&lt;br /&gt;
gpg --list-secret-keys --keyid-format LONG&lt;br /&gt;
git config --global user.signingkey 708D1D7948FXXXXX&lt;br /&gt;
export GPG_TTY=$(tty)&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Git_signing&amp;diff=5669</id>
		<title>Git signing</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Git_signing&amp;diff=5669"/>
		<updated>2026-03-31T17:44:35Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;git config --global gpg.program gpg git config --global gpg.format openpgp gpg --list-secret-keys --keyid-format LONG git config --global user.signingkey 708D1D7948FXXXXX expo...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;git config --global gpg.program gpg&lt;br /&gt;
git config --global gpg.format openpgp&lt;br /&gt;
gpg --list-secret-keys --keyid-format LONG&lt;br /&gt;
git config --global user.signingkey 708D1D7948FXXXXX&lt;br /&gt;
export GPG_TTY=$(tty)&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_vs_hardware_raid_controller&amp;diff=5668</id>
		<title>Zfs vs hardware raid controller</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_vs_hardware_raid_controller&amp;diff=5668"/>
		<updated>2026-03-28T20:16:37Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;That’s actually a **very solid default rule**, and it lines up almost perfectly with how ZFS behaves in the real world 👍  Let me sharpen it a bit specifically for ZFS + y...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;That’s actually a **very solid default rule**, and it lines up almost perfectly with how ZFS behaves in the real world 👍&lt;br /&gt;
&lt;br /&gt;
Let me sharpen it a bit specifically for ZFS + your kind of workloads:&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🧠 Your Rule (Refined for ZFS)&lt;br /&gt;
&lt;br /&gt;
&amp;gt; **“Use mirrors unless workload is mostly sequential + read-heavy + static.”**&lt;br /&gt;
&lt;br /&gt;
That’s basically the *right mental model*.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🔥 Why RAID10 (mirrors) is usually the right choice&lt;br /&gt;
&lt;br /&gt;
For your environment (LXD, Postgres, APIs, etc.):&lt;br /&gt;
&lt;br /&gt;
### You are mostly doing:&lt;br /&gt;
&lt;br /&gt;
* random reads/writes&lt;br /&gt;
* small block IO&lt;br /&gt;
* sync writes (fsync, WAL, etc.)&lt;br /&gt;
* metadata-heavy operations&lt;br /&gt;
&lt;br /&gt;
👉 ZFS mirrors excel at:&lt;br /&gt;
&lt;br /&gt;
* parallel IOPS (each vdev = queue)&lt;br /&gt;
* low latency under load&lt;br /&gt;
* fast resilvers (huge deal)&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚠️ Where RAIDZ2 actually wins&lt;br /&gt;
&lt;br /&gt;
Your “static / read-heavy” instinct is correct, but let’s define it more precisely:&lt;br /&gt;
&lt;br /&gt;
## RAIDZ2 is best when:&lt;br /&gt;
&lt;br /&gt;
* Large sequential reads (media, backups)&lt;br /&gt;
* Large sequential writes (archives, logs)&lt;br /&gt;
* Data is mostly **append-only**&lt;br /&gt;
* Not latency-sensitive&lt;br /&gt;
* Capacity matters&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🧪 Practical Examples&lt;br /&gt;
&lt;br /&gt;
## 🟢 Mirrors (RAID10)&lt;br /&gt;
&lt;br /&gt;
Use for:&lt;br /&gt;
&lt;br /&gt;
* LXD VM disks&lt;br /&gt;
* PostgreSQL / databases&lt;br /&gt;
* Kubernetes / etcd&lt;br /&gt;
* anything with fsync&lt;br /&gt;
* CI/CD workloads&lt;br /&gt;
* general-purpose root pool&lt;br /&gt;
&lt;br /&gt;
👉 This is probably **90% of real-world infra workloads**&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
## 🔵 RAIDZ2&lt;br /&gt;
&lt;br /&gt;
Use for:&lt;br /&gt;
&lt;br /&gt;
* backups&lt;br /&gt;
* object storage&lt;br /&gt;
* logs (cold)&lt;br /&gt;
* media / large files&lt;br /&gt;
* snapshot archives&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 💥 The Big Gotcha Most People Miss&lt;br /&gt;
&lt;br /&gt;
## RAIDZ2 looks good in sequential benchmarks…&lt;br /&gt;
&lt;br /&gt;
…but falls apart under:&lt;br /&gt;
&lt;br /&gt;
* high queue depth&lt;br /&gt;
* random IO&lt;br /&gt;
* mixed workloads&lt;br /&gt;
&lt;br /&gt;
👉 Especially with spinning disks&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚡ Why mirrors feel “faster” even when MB/s looks similar&lt;br /&gt;
&lt;br /&gt;
Because:&lt;br /&gt;
&lt;br /&gt;
* ZFS schedules IO per vdev&lt;br /&gt;
* Mirrors = more vdevs = more parallelism&lt;br /&gt;
&lt;br /&gt;
👉 So system responsiveness is WAY better&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🔁 Resilvering (VERY important)&lt;br /&gt;
&lt;br /&gt;
This is a huge differentiator:&lt;br /&gt;
&lt;br /&gt;
### Mirrors:&lt;br /&gt;
&lt;br /&gt;
* Copy only **used blocks**, straight from the surviving disk&lt;br /&gt;
* Much faster rebuild&lt;br /&gt;
* Less stress on disks&lt;br /&gt;
&lt;br /&gt;
### RAIDZ2:&lt;br /&gt;
&lt;br /&gt;
* Also resilvers only used blocks, but must read from **every remaining disk in the vdev** and reconstruct from parity&lt;br /&gt;
* Slower&lt;br /&gt;
* Higher risk during rebuild&lt;br /&gt;
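&lt;br /&gt;
To watch a resilver in practice, the commands below are a minimal sketch. The pool name `tank-hdd` and both device paths are placeholders, not values from a real system.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Show pool health and, during a rebuild, resilver progress and estimated time remaining&lt;br /&gt;
zpool status -v tank-hdd&lt;br /&gt;
&lt;br /&gt;
# Replace a failed disk with a new one (both paths are hypothetical placeholders)&lt;br /&gt;
sudo zpool replace tank-hdd /dev/disk/by-id/OLD_DISK /dev/disk/by-id/NEW_DISK&lt;br /&gt;
```&lt;br /&gt;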
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🧠 Your 20-disk scenario (applied)&lt;br /&gt;
&lt;br /&gt;
Your instinct:&lt;br /&gt;
&lt;br /&gt;
&amp;gt; “Always RAID10 unless read-heavy static”&lt;br /&gt;
&lt;br /&gt;
👉 I would translate that to:&lt;br /&gt;
&lt;br /&gt;
## Default:&lt;br /&gt;
&lt;br /&gt;
**10 mirrors (RAID10)**&lt;br /&gt;
&lt;br /&gt;
## Only use RAIDZ2 if:&lt;br /&gt;
&lt;br /&gt;
* You explicitly need capacity&lt;br /&gt;
* AND workload is not latency sensitive&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🔥 What I would actually do (based on your stack)&lt;br /&gt;
&lt;br /&gt;
Since you:&lt;br /&gt;
&lt;br /&gt;
* run infra services&lt;br /&gt;
* care about performance&lt;br /&gt;
* likely run databases / APIs&lt;br /&gt;
&lt;br /&gt;
👉 I would do:&lt;br /&gt;
&lt;br /&gt;
### Pool A (primary)&lt;br /&gt;
&lt;br /&gt;
* **Mirrors (RAID10)**&lt;br /&gt;
* All active workloads&lt;br /&gt;
&lt;br /&gt;
### Pool B (optional)&lt;br /&gt;
&lt;br /&gt;
* RAIDZ2&lt;br /&gt;
* backups / cold data&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 💡 One nuance worth knowing&lt;br /&gt;
&lt;br /&gt;
There *is* one middle-ground pattern:&lt;br /&gt;
&lt;br /&gt;
## Tuned mirrors&lt;br /&gt;
&lt;br /&gt;
Still two-way mirrors, but with tuned datasets:&lt;br /&gt;
&lt;br /&gt;
* recordsize tuning&lt;br /&gt;
* compression&lt;br /&gt;
* SLOG if needed&lt;br /&gt;
&lt;br /&gt;
👉 This gives you:&lt;br /&gt;
&lt;br /&gt;
* most of the RAID10 benefits&lt;br /&gt;
* somewhat better effective capacity through compression&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🏆 Final Take&lt;br /&gt;
&lt;br /&gt;
Your rule is already good. I’d just upgrade it to:&lt;br /&gt;
&lt;br /&gt;
&amp;gt; **Use mirrors by default.&lt;br /&gt;
&amp;gt; Use RAIDZ only when capacity matters more than latency.**&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 👍 Blunt version&lt;br /&gt;
&lt;br /&gt;
* Mirrors = **performance + responsiveness + safety**&lt;br /&gt;
* RAIDZ2 = **capacity + efficiency**&lt;br /&gt;
* RAIDZ is almost never the right choice for VM workloads&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Zfs_hdd_tuning&amp;diff=5667</id>
		<title>Zfs hdd tuning</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Zfs_hdd_tuning&amp;diff=5667"/>
		<updated>2026-03-28T19:24:33Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Nice—those **10K SAS Hitachi drives** are actually a really solid fit for ZFS if you tune it right. Let’s dial this in specifically for your setup.  ---  # 🧱 Your Drive...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Nice—those **10K SAS Hitachi drives** are actually a really solid fit for ZFS if you tune it right. Let’s dial this in specifically for your setup.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🧱 Your Drive&lt;br /&gt;
&lt;br /&gt;
## Hitachi DKR5E-J1R2SS&lt;br /&gt;
&lt;br /&gt;
![Image](https://cdn11.bigcommerce.com/s-017c0/images/stencil/1280x1280/products/12022/50761/3282390-E_1__08710.1680120086.JPG?c=2)&lt;br /&gt;
&lt;br /&gt;
![Image](https://m.media-amazon.com/images/I/614y6FroB-L._AC_UF894%2C1000_QL80_.jpg)&lt;br /&gt;
&lt;br /&gt;
![Image](https://www.disctech.com/SCASite/product_images/0B23727_1000-1.jpg)&lt;br /&gt;
&lt;br /&gt;
![Image](https://i.ebayimg.com/images/g/YakAAeSwtkhoFgAO/s-l400.jpg)&lt;br /&gt;
&lt;br /&gt;
### Key characteristics:&lt;br /&gt;
&lt;br /&gt;
* ~120–180 IOPS per drive&lt;br /&gt;
* Low latency vs SATA, but still **spinning disk**&lt;br /&gt;
* Enterprise reliability (great for ZFS)&lt;br /&gt;
* Likely **512e or 4K sectors** (we’ll force correct alignment)&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🎯 Goals for ZFS Tuning&lt;br /&gt;
&lt;br /&gt;
You want to optimize for:&lt;br /&gt;
&lt;br /&gt;
* LXD containers + VMs&lt;br /&gt;
* Mixed workloads (random IO heavy)&lt;br /&gt;
* Stability over raw throughput&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚙️ Step 1 — Create ZFS Pool (IMPORTANT SETTINGS)&lt;br /&gt;
&lt;br /&gt;
Assuming HBA mode and multiple disks (recommended RAIDZ or mirrors):&lt;br /&gt;
&lt;br /&gt;
### 🔥 Best layout options:&lt;br /&gt;
&lt;br /&gt;
* **Mirrors (RAID10 equivalent)** → BEST performance&lt;br /&gt;
* **RAIDZ2** → BEST safety (recommended if ≥6 disks)&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Example (mirrors)&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# sdX names are examples; /dev/disk/by-id/ paths stay stable across reboots&lt;br /&gt;
zpool create -o ashift=12 HDD \&lt;br /&gt;
  mirror /dev/sdb /dev/sdc \&lt;br /&gt;
  mirror /dev/sdd /dev/sde&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Example (RAIDZ2)&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
zpool create -o ashift=12 HDD \&lt;br /&gt;
  raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
## 🔑 Why `ashift=12` matters&lt;br /&gt;
&lt;br /&gt;
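* Forces 4K alignment (even if the drive reports 512-byte sectors)&lt;br /&gt;
* Prevents **massive write amplification** on 4K-sector drives&lt;br /&gt;
* Cannot be changed after a vdev is created&lt;br /&gt;
&lt;br /&gt;
👉 Set it at pool creation. Fixing it later means rebuilding the pool.&lt;br /&gt;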
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚙️ Step 2 — ZFS Dataset Tuning (CRITICAL)&lt;br /&gt;
&lt;br /&gt;
For LXD workloads:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
zfs set atime=off HDD&lt;br /&gt;
zfs set compression=lz4 HDD&lt;br /&gt;
zfs set xattr=sa HDD&lt;br /&gt;
zfs set acltype=posixacl HDD&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
## 🧠 Recordsize tuning (VERY important)&lt;br /&gt;
&lt;br /&gt;
### For general LXD pool:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
zfs set recordsize=128K HDD&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
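### For VM disks (zvols):&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# volblocksize is fixed when the zvol is created; `zfs set` cannot change it later.&lt;br /&gt;
# The size (50G) and name (HDD/vm-disk0) are placeholders.&lt;br /&gt;
zfs create -V 50G -o volblocksize=16K HDD/vm-disk0&lt;br /&gt;
```&lt;br /&gt;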
&lt;br /&gt;
👉 Why:&lt;br /&gt;
&lt;br /&gt;
* 128K = good throughput default&lt;br /&gt;
* 16K = better for random IO (VMs, DB)&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚙️ Step 3 — LXD Storage Pool&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
lxc storage create HDD zfs source=HDD&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚙️ Step 4 — Special LXD tuning&lt;br /&gt;
&lt;br /&gt;
### For containers:&lt;br /&gt;
&lt;br /&gt;
ZFS dataset works great as-is.&lt;br /&gt;
&lt;br /&gt;
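### For zvol-backed volumes:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
lxc storage set HDD volume.zfs.block_mode=true&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
👉 This backs LXD filesystem volumes with formatted zvols instead of datasets. VM disks are zvols either way, so the setting mainly affects containers and custom volumes.&lt;br /&gt;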
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚡ Optional (BUT HIGHLY RECOMMENDED)&lt;br /&gt;
&lt;br /&gt;
## 🚀 Add SLOG (for sync writes)&lt;br /&gt;
&lt;br /&gt;
Without SLOG:&lt;br /&gt;
&lt;br /&gt;
* Sync writes = slow (HDD latency)&lt;br /&gt;
&lt;br /&gt;
With SLOG:&lt;br /&gt;
&lt;br /&gt;
* Huge boost for:&lt;br /&gt;
&lt;br /&gt;
  * databases&lt;br /&gt;
  * fsync-heavy workloads&lt;br /&gt;
  * Kubernetes / etcd&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Best SLOG device:&lt;br /&gt;
&lt;br /&gt;
* NVMe with power-loss protection&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
zpool add HDD log /dev/nvme0n1&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
## 🧠 L2ARC (read cache)&lt;br /&gt;
&lt;br /&gt;
Only add if:&lt;br /&gt;
&lt;br /&gt;
* RAM &amp;lt; working set&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
zpool add HDD cache /dev/nvme1n1&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
👉 Not critical unless heavy reads&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚙️ Step 5 — Kernel tuning (small but helpful)&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
echo 1 &amp;gt; /sys/module/zfs/parameters/zfs_prefetch_disable&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
👉 Disables prefetch (better for random workloads)&lt;br /&gt;
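&lt;br /&gt;
The `echo` above lasts only until the next reboot or module reload. A minimal sketch of making it persistent, assuming the standard `modprobe.d` mechanism and a file name (`zfs.conf`) chosen here for illustration:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
# /etc/modprobe.d/zfs.conf (file name is an assumption)&lt;br /&gt;
options zfs zfs_prefetch_disable=1&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
If the ZFS module is loaded from the initramfs, the image may need rebuilding (`update-initramfs -u`) before the option takes effect at boot.&lt;br /&gt;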
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# ⚠️ What NOT to do&lt;br /&gt;
&lt;br /&gt;
❌ Don’t use:&lt;br /&gt;
&lt;br /&gt;
* RAID controller + ZFS (double abstraction)&lt;br /&gt;
* RAID5 on controller (write hole risk)&lt;br /&gt;
* ashift=9 (huge mistake on modern disks)&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🧪 Expected Performance&lt;br /&gt;
&lt;br /&gt;
With mirrors:&lt;br /&gt;
&lt;br /&gt;
* Random IOPS scale with the number of mirror vdevs (reads can be served by either side of each mirror)&lt;br /&gt;
* Great for VMs&lt;br /&gt;
&lt;br /&gt;
With RAIDZ2:&lt;br /&gt;
&lt;br /&gt;
* Better capacity&lt;br /&gt;
* Lower random IOPS: each RAIDZ2 vdev performs roughly like a single disk&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🏆 Final Recommended Setup (for YOU)&lt;br /&gt;
&lt;br /&gt;
Given your setup:&lt;br /&gt;
&lt;br /&gt;
### If you want performance:&lt;br /&gt;
&lt;br /&gt;
👉 **Mirrors + ZFS**&lt;br /&gt;
&lt;br /&gt;
* Best VM performance&lt;br /&gt;
* Scales well&lt;br /&gt;
&lt;br /&gt;
### If you want capacity + safety:&lt;br /&gt;
&lt;br /&gt;
👉 **RAIDZ2 + ZFS**&lt;br /&gt;
&lt;br /&gt;
* Survives 2 disk failures&lt;br /&gt;
* Still solid performance&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 🔥 My Personal Pick for Your Use Case&lt;br /&gt;
&lt;br /&gt;
Since you're running:&lt;br /&gt;
&lt;br /&gt;
* LXD cluster&lt;br /&gt;
* likely DB / infra workloads&lt;br /&gt;
&lt;br /&gt;
👉 I would do:&lt;br /&gt;
&lt;br /&gt;
**Mirrors + ZFS + optional NVMe SLOG**&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 💡 Bonus: LXD Multi-Pool Strategy&lt;br /&gt;
&lt;br /&gt;
You can do:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
lxc storage create fast zfs source=SSDPOOL&lt;br /&gt;
lxc storage create HDD zfs source=HDD&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Then:&lt;br /&gt;
&lt;br /&gt;
* Put VMs → fast&lt;br /&gt;
* Backups / bulk → HDD&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
# 👍 Final Take&lt;br /&gt;
&lt;br /&gt;
Your **10K SAS drives + ZFS (HBA mode)**:&lt;br /&gt;
&lt;br /&gt;
* Extremely reliable&lt;br /&gt;
* Predictable performance&lt;br /&gt;
* Safer than hardware RAID for long-term integrity (end-to-end checksums, scrubs, self-healing)&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=IDrac&amp;diff=5666</id>
		<title>IDrac</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=IDrac&amp;diff=5666"/>
		<updated>2026-03-26T20:15:58Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Install from ipmi&lt;br /&gt;
&lt;br /&gt;
- Configuration-&amp;gt;Virtual Media-&amp;gt;Connect Virtual Media&lt;br /&gt;
- Select File&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sudo ipmitool lan print 1&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=LXD_security_luks_ssh_sync&amp;diff=5665</id>
		<title>LXD security luks ssh sync</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=LXD_security_luks_ssh_sync&amp;diff=5665"/>
		<updated>2026-03-20T03:26:12Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;To sync your SSH configuration and keys across your LXD cluster (the Ubuntu 24.04 node and your older 22.04 nodes), you can use a combination of **LXD's built-in file push** a...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To sync your SSH configuration and keys across your LXD cluster (the Ubuntu 24.04 node and your older 22.04 nodes), you can use a combination of **LXD's built-in file push** and a simple **Bash alias**.&lt;br /&gt;
&lt;br /&gt;
Since your 6-disk ZFS mirror is likely your &amp;quot;main&amp;quot; node, we’ll use it as the source of truth.&lt;br /&gt;
&lt;br /&gt;
### 1. The Manual Sync (Quickest)&lt;br /&gt;
LXD allows you to push files directly into a container or across nodes if they are part of the same cluster. If you want to push your updated `config` to a specific container:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
lxc file push ~/.ssh/config my-alpine-container/root/.ssh/config&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### 2. The Cluster-Wide Sync Script&lt;br /&gt;
If you want to ensure every node in your cluster has the same SSH shortcuts, you can run this loop from your primary node. It iterates through all nodes and updates their local `~/.ssh/config`.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Assumes each cluster member name resolves as an SSH host&lt;br /&gt;
for node in $(lxc cluster list --format csv | cut -d, -f1); do&lt;br /&gt;
  scp ~/.ssh/config &amp;quot;$node&amp;quot;:~/.ssh/config&lt;br /&gt;
done&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 3. Using &amp;quot;SSH Agent Forwarding&amp;quot; (Recommended)&lt;br /&gt;
Instead of copying your **Private Keys** to every node (which is a security risk), you should use **Agent Forwarding**. This allows you to keep your keys on your local machine but &amp;quot;lend&amp;quot; them to the cluster nodes as you jump between them.&lt;br /&gt;
&lt;br /&gt;
**Update your `~/.ssh/config` on your local machine:**&lt;br /&gt;
```text&lt;br /&gt;
Host lxd-cluster-*&lt;br /&gt;
    ForwardAgent yes&lt;br /&gt;
    User busk&lt;br /&gt;
    IdentityFile ~/.ssh/id_ed25519_personal&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
**How it works:**&lt;br /&gt;
1. You SSH into Node A.&lt;br /&gt;
2. From Node A, you want to `git clone` or SSH into Node B.&lt;br /&gt;
3. Node A &amp;quot;asks&amp;quot; your local machine to sign the request using your key.&lt;br /&gt;
4. **The private key never leaves your laptop.**&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 4. Syncing via a Git &amp;quot;Dotfiles&amp;quot; Repo&lt;br /&gt;
Since we just set up your Git SSH access, the most professional way to manage this is to create a private **&amp;quot;dotfiles&amp;quot;** repository.&lt;br /&gt;
&lt;br /&gt;
1. **Create a private repo** on GitHub called `dotfiles`.&lt;br /&gt;
2. **Push your config** (from a repo that tracks `~/.ssh/config`, e.g. one rooted at your home directory):&lt;br /&gt;
   ```bash&lt;br /&gt;
   git add ~/.ssh/config&lt;br /&gt;
   git commit -m &amp;quot;update ssh config&amp;quot;&lt;br /&gt;
   git push&lt;br /&gt;
   ```&lt;br /&gt;
3. **Pull on other nodes:** On any new node in your cluster, just `git pull` to get your latest shortcuts and host definitions.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 5. Security Warning for ZFS Pools&lt;br /&gt;
If you are syncing these files to a directory on your `tank` pool, ensure the ZFS dataset permissions are restricted. SSH will fail to work if the permissions are too &amp;quot;open&amp;quot; (e.g., if a group has write access).&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Ensure the ZFS mount for your home dir is strict&lt;br /&gt;
sudo chmod 700 /home/busk/.ssh&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Summary Checklist&lt;br /&gt;
| Method | Best For... | Security Level |&lt;br /&gt;
| :--- | :--- | :--- |&lt;br /&gt;
| **Agent Forwarding** | Jumping between nodes | **High** (Keys stay local) |&lt;br /&gt;
| **LXC File Push** | Quick fixes to containers | **Medium** |&lt;br /&gt;
| **Git Dotfiles** | Long-term cluster management | **High** (If repo is private) |&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=LXD_security_luks_dropbear&amp;diff=5664</id>
		<title>LXD security luks dropbear</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=LXD_security_luks_dropbear&amp;diff=5664"/>
		<updated>2026-03-20T03:25:46Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Setting up **Dropbear** for **Initramfs** is the gold standard for &amp;quot;headless&amp;quot; secure servers. It allows you to SSH into your server *before* the main operating system even sta...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Setting up **Dropbear** for **Initramfs** is the gold standard for &amp;quot;headless&amp;quot; secure servers. It allows you to SSH into your server *before* the main operating system even starts, just to type the LUKS passphrase.&lt;br /&gt;
&lt;br /&gt;
This gives you the security of a manual passphrase with the convenience of not needing a physical keyboard and monitor attached to the server.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 1. Install the Dropbear Initramfs Tool&lt;br /&gt;
Ubuntu has a specific package that integrates the lightweight Dropbear SSH server into the boot process.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo apt update&lt;br /&gt;
sudo apt install dropbear-initramfs -y&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### 2. Configure Your SSH Key&lt;br /&gt;
For security, you should use an **SSH Key** rather than a password to log in during this pre-boot phase.&lt;br /&gt;
&lt;br /&gt;
1.  **Copy your public key** (usually found at `~/.ssh/id_ed25519.pub` on your local machine).&lt;br /&gt;
2.  **Add it to the authorized keys for boot:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo nano /etc/dropbear/initramfs/authorized_keys&lt;br /&gt;
    ```&lt;br /&gt;
    *Paste your public key there and save.*&lt;br /&gt;
&lt;br /&gt;
### 3. Set the Network Config (Optional but Recommended)&lt;br /&gt;
If your server doesn't get an IP automatically via DHCP during boot, you can hardcode one in the GRUB config, but usually, the default is fine if your router is active. &lt;br /&gt;
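&lt;br /&gt;
Hardcoding an address can be sketched with the kernel's `ip=` boot parameter, which the initramfs networking scripts read. The addresses, hostname, and interface name below are placeholders, not values from this setup:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
# /etc/default/grub&lt;br /&gt;
# ip=client-ip:server-ip:gateway:netmask:hostname:interface:autoconf&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;quot;ip=192.168.1.50::192.168.1.1:255.255.255.0:myserver:eno1:none&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Run `sudo update-grub` afterwards so the parameter is applied on the next boot.&lt;br /&gt;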
&lt;br /&gt;
To change the port (to avoid conflicts with your main SSH later), edit the config:&lt;br /&gt;
```bash&lt;br /&gt;
sudo nano /etc/dropbear/initramfs/dropbear.conf&lt;br /&gt;
```&lt;br /&gt;
Change to: `DROPBEAR_OPTIONS=&amp;quot;-p 2222&amp;quot;` (This puts the &amp;quot;Unlock SSH&amp;quot; on port 2222).&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 4. Update and Reboot&lt;br /&gt;
Tell the system to rebuild the boot image with Dropbear inside it.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo update-initramfs -u&lt;br /&gt;
sudo reboot&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 5. How to Unlock Remotely&lt;br /&gt;
Once the server starts booting, it will &amp;quot;hang&amp;quot; at the encryption prompt. From your local laptop, run:&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
ssh -p 2222 root@your-server-ip&lt;br /&gt;
```&lt;br /&gt;
*Note: It will always be `root`, even if that's not your main user.*&lt;br /&gt;
&lt;br /&gt;
Once you are logged in, you will see a minimal prompt. Run the unlock command:&lt;br /&gt;
```bash&lt;br /&gt;
cryptroot-unlock&lt;br /&gt;
```&lt;br /&gt;
Type your passphrase. Once accepted, the SSH session will close automatically, the drive will mount, and Ubuntu will finish booting into the main OS (and start your LXD containers).&lt;br /&gt;
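&lt;br /&gt;
To avoid retyping the port and user on every unlock, the connection can be captured in `~/.ssh/config` on the laptop. A small sketch; the `unlock-server` alias and the key path are assumptions:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
Host unlock-server&lt;br /&gt;
    HostName your-server-ip&lt;br /&gt;
    User root&lt;br /&gt;
    Port 2222&lt;br /&gt;
    IdentityFile ~/.ssh/id_ed25519&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
With this in place, `ssh unlock-server` reaches the pre-boot Dropbear session.&lt;br /&gt;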
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Summary of your &amp;quot;Private Cloud&amp;quot; Security&lt;br /&gt;
You now have a setup that mirrors high-end cloud security:&lt;br /&gt;
1.  **LXD Isolation:** Your apps are siloed in containers or VMs.&lt;br /&gt;
2.  **Full Disk Encryption:** Data is encrypted at rest (LUKS).&lt;br /&gt;
3.  **Encrypted Storage Pool:** Your LXD data lives on a ZFS-over-LUKS layer.&lt;br /&gt;
4.  **Remote Secure Unlock:** You can manage reboots from anywhere without leaving keys &amp;quot;in the lock&amp;quot; (the keyfile).&lt;br /&gt;
&lt;br /&gt;
# small bash script that checks the status of your LXD encrypted pool and alerts you if it fails to mount after an unlock&lt;br /&gt;
&lt;br /&gt;
| Goal | Config Setting |&lt;br /&gt;
| :--- | :--- |&lt;br /&gt;
| Specify a Key | `IdentityFile ~/.ssh/key_name` |&lt;br /&gt;
| Use a Nickname | `Host nickname` |&lt;br /&gt;
| Set User | `User username` |&lt;br /&gt;
| Force Port | `Port 2222` (if you changed it for security) |&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=LXD_security_luks&amp;diff=5663</id>
		<title>LXD security luks</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=LXD_security_luks&amp;diff=5663"/>
		<updated>2026-03-20T03:23:02Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;Setting up an encrypted storage pool for LXD is the closest you can get to the &amp;quot;AWS EBS Encryption&amp;quot; model on your own hardware. By using **LUKS** (Linux Unified Key Setup) und...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Setting up an encrypted storage pool for LXD is the closest you can get to the &amp;quot;AWS EBS Encryption&amp;quot; model on your own hardware. By using **LUKS** (Linux Unified Key Setup) underneath **ZFS**, even if someone pulls the physical drive from your server, the data is just random noise without the decryption passphrase.&lt;br /&gt;
&lt;br /&gt;
Here is how to build that secure foundation on Ubuntu 24.04.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Phase 1: Prepare the Encrypted Partition&lt;br /&gt;
We will use a raw partition or a secondary drive (e.g., `/dev/sdb`). &lt;br /&gt;
&lt;br /&gt;
1.  **Install the necessary tools:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo apt update &amp;amp;&amp;amp; sudo apt install cryptsetup zfsutils-linux -y&lt;br /&gt;
    ```&lt;br /&gt;
2.  **Format the partition with LUKS:**&lt;br /&gt;
    *Choose a strong passphrase; you will need this every time the server boots.*&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo cryptsetup luksFormat /dev/sdb&lt;br /&gt;
    ```&lt;br /&gt;
3.  **Open the encrypted &amp;quot;container&amp;quot;:**&lt;br /&gt;
    This maps the encrypted physical drive to a virtual device at `/dev/mapper/lxd_encrypted`.&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo cryptsetup open /dev/sdb lxd_encrypted&lt;br /&gt;
    ```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Phase 2: Create the LXD Storage Pool&lt;br /&gt;
Now that we have an open, decrypted block device, we tell LXD to use it for a ZFS pool.&lt;br /&gt;
&lt;br /&gt;
1.  **Create the Pool:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo lxc storage create secure-pool zfs source=/dev/mapper/lxd_encrypted&lt;br /&gt;
    ```&lt;br /&gt;
2.  **Verify the Pool:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    lxc storage list&lt;br /&gt;
    ```&lt;br /&gt;
    You should see `secure-pool` listed with the driver `zfs`.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Phase 3: Launch a Secure Container/VM&lt;br /&gt;
To ensure your workloads actually live on this encrypted disk, you must specify the pool during launch.&lt;br /&gt;
&lt;br /&gt;
* **For a Container:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    lxc launch ubuntu:24.04 my-secure-app --storage secure-pool&lt;br /&gt;
    ```&lt;br /&gt;
* **For a VM (Even More Secure):**&lt;br /&gt;
    ```bash&lt;br /&gt;
    lxc launch ubuntu:24.04 my-vault --vm --storage secure-pool&lt;br /&gt;
    ```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Phase 4: Handling Reboots (The &amp;quot;Unlock&amp;quot; Process)&lt;br /&gt;
Unlike a public cloud where the provider handles the hardware keys, **you** own the keys now. This means when your physical server reboots, the LXD pool will stay &amp;quot;offline&amp;quot; because the LUKS partition is locked.&lt;br /&gt;
&lt;br /&gt;
**To restore access after a reboot:**&lt;br /&gt;
1.  **Unlock the drive:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo cryptsetup open /dev/sdb lxd_encrypted&lt;br /&gt;
    ```&lt;br /&gt;
2.  **Restart LXD (if it didn't auto-detect):**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo systemctl restart snap.lxd.daemon&lt;br /&gt;
    ```&lt;br /&gt;
&lt;br /&gt;
&amp;gt; [!TIP]&lt;br /&gt;
&amp;gt; If you want the server to unlock automatically at boot (less secure, but common for remote servers), you can use a **Keyfile** stored on a specialized USB drive or an initramfs configuration, though this technically means anyone with physical access to that USB drive can bypass the encryption.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### How this mimics &amp;quot;Cloud Security&amp;quot;&lt;br /&gt;
* **The &amp;quot;Admin&amp;quot; Problem:** If an admin tries to `cat` a file from the host's LXD storage directory (`/var/snap/lxd/common/lxd/storage-pools/` on snap installs), they can only do so while the LUKS partition is &amp;quot;Open.&amp;quot;&lt;br /&gt;
* **Offline Attacks:** If the server is powered off, the data is unreadable without the passphrase or keyfile.&lt;br /&gt;
* **Snapshot Safety:** ZFS snapshots live inside the LUKS container, so they are encrypted on disk. A replicated `zfs send` stream is not: LUKS sits below ZFS, so encrypt the transport (e.g. SSH) and the destination pool as well.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Automount&lt;br /&gt;
&lt;br /&gt;
Automating the unlock process is a balance between **convenience** and **security**. If you store the key on the same hard drive, you defeat the purpose of encryption (it’s like leaving the key in the lock). &lt;br /&gt;
&lt;br /&gt;
The &amp;quot;Pro&amp;quot; way to do this on-premise is to store the keyfile on a **separate USB drive** or a dedicated **microSD card** that stays plugged into the server.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 1. Create the Keyfile&lt;br /&gt;
First, we generate a file full of random data to act as your &amp;quot;digital key.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Create a 4KB keyfile&lt;br /&gt;
sudo dd if=/dev/urandom of=/root/lxd_keyfile bs=1024 count=4&lt;br /&gt;
# Set strict permissions (Only root can read it)&lt;br /&gt;
sudo chmod 400 /root/lxd_keyfile&lt;br /&gt;
```&lt;br /&gt;
*(Note: For higher security, move this file to a USB drive mounted at `/mnt/usb/keyfile` instead of `/root/`.)*&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 2. Add the Keyfile to LUKS&lt;br /&gt;
LUKS has &amp;quot;slots&amp;quot; for multiple keys. We will keep your manual passphrase in Slot 0 and add this file to Slot 1.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo cryptsetup luksAddKey /dev/sdb /root/lxd_keyfile&lt;br /&gt;
```&lt;br /&gt;
*You will be prompted for your **original passphrase** to authorize adding the new key.*&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 3. Configure `/etc/crypttab`&lt;br /&gt;
This file tells Ubuntu to unlock the drive automatically during the boot sequence using the keyfile.&lt;br /&gt;
&lt;br /&gt;
1.  **Get the UUID of your physical drive:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    blkid /dev/sdb&lt;br /&gt;
    ```&lt;br /&gt;
    *Copy the UUID (e.g., `UUID=&amp;quot;1234-abcd-...&amp;quot;`).*&lt;br /&gt;
&lt;br /&gt;
2.  **Edit the crypttab file:**&lt;br /&gt;
    ```bash&lt;br /&gt;
    sudo nano /etc/crypttab&lt;br /&gt;
    ```&lt;br /&gt;
3.  **Add this line:**&lt;br /&gt;
    ```text&lt;br /&gt;
    lxd_encrypted  UUID=your-uuid-here  /root/lxd_keyfile  luks&lt;br /&gt;
    ```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 4. Update Initramfs&lt;br /&gt;
Since storage is a core system component, you need to update the boot RAM disk so the kernel knows how to handle this mapping at startup.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
sudo update-initramfs -u&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 5. Final Step: The ZFS Mount&lt;br /&gt;
LXD is usually smart enough to see the `/dev/mapper/lxd_encrypted` device appear and then mount the ZFS pool. However, if the pool doesn't import automatically, import it manually with `sudo zpool import secure-pool` (LXD names the ZFS pool after the storage pool by default) and restart LXD.&lt;br /&gt;
&lt;br /&gt;
### The &amp;quot;Cloud-Level&amp;quot; Security Result&lt;br /&gt;
* **Rebooting:** The server boots, finds the keyfile, unlocks the drive, and LXD starts your containers.&lt;br /&gt;
* **Theft Scenario:** If someone steals the server but **unplugs the USB key** (or if you keep the keyfile on a network share that you disconnect), the data remains encrypted and unreadable.&lt;br /&gt;
* **Rogue Admin:** An admin with access to the running OS can still see the data (since it's unlocked), but an admin with physical access to the &amp;quot;cold&amp;quot; hardware cannot.&lt;br /&gt;
&lt;br /&gt;
&amp;gt; [!WARNING]&lt;br /&gt;
&amp;gt; If you lose both your manual passphrase **and** this keyfile, the data on that LXD pool is gone forever. There is no &amp;quot;Password Reset&amp;quot; in LUKS encryption.&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=LXD_security&amp;diff=5662</id>
		<title>LXD security</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=LXD_security&amp;diff=5662"/>
		<updated>2026-03-20T03:19:20Z</updated>

		<summary type="html">&lt;p&gt;Busk: Created page with &amp;quot;To recreate a &amp;quot;Cloud-Style&amp;quot; secure environment using **LXD** on-premise, you have to move away from the idea of being a &amp;quot;System Admin&amp;quot; and start acting like a &amp;quot;Service Provide...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To recreate a &amp;quot;Cloud-Style&amp;quot; secure environment using **LXD** on-premise, you have to move away from the idea of being a &amp;quot;System Admin&amp;quot; and start acting like a &amp;quot;Service Provider.&amp;quot; &lt;br /&gt;
&lt;br /&gt;
In a standard setup, you sudo into everything. In a high-security LXD setup, you use **Project Isolation**, **Restricted Roles**, and **Hardware-Backed Encryption**.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 1. Project Isolation (The Virtual Private Cloud)&lt;br /&gt;
Don't just run containers in the &amp;quot;default&amp;quot; project. Projects in LXD act like AWS Accounts or Azure Subscriptions. They have their own networks, storage volumes, and—crucially—their own security policies.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Create a secure project&lt;br /&gt;
lxc project create secure-zone -c features.networks=true -c features.images=true&lt;br /&gt;
&lt;br /&gt;
# Switch to it&lt;br /&gt;
lxc project switch secure-zone&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### 2. Restrict the &amp;quot;Admin&amp;quot; (RBAC)&lt;br /&gt;
To prevent a rogue local admin (or yourself by mistake) from having &amp;quot;God Mode&amp;quot; over every container, you should use **Canonical RBAC** (Role-Based Access Control) or integration with **OpenID Connect**.&lt;br /&gt;
&lt;br /&gt;
By integrating LXD with an identity provider (like Keycloak or Authelia), you can ensure that even if someone has a login to the physical host, they don't have the &amp;quot;LXD API&amp;quot; permissions to peek into a specific project's containers without an audited login.&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 3. Encryption at Rest (The Storage Layer)&lt;br /&gt;
In AWS, the EBS volumes are encrypted. In LXD, you should use **ZFS or LVM with LUKS**.&lt;br /&gt;
&lt;br /&gt;
* **The Goal:** If someone steals the physical hard drives from your server, they see nothing.&lt;br /&gt;
* **The Setup:** Encrypt the entire partition using LUKS before assigning it to the LXD Storage Pool.&lt;br /&gt;
&lt;br /&gt;
```bash&lt;br /&gt;
# Example: Creating an encrypted ZFS pool for LXD&lt;br /&gt;
cryptsetup luksFormat /dev/sdb&lt;br /&gt;
cryptsetup open /dev/sdb crypt_storage&lt;br /&gt;
lxc storage create encrypted-pool zfs source=/dev/mapper/crypt_storage&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 4. Shielded Containers (The &amp;quot;Nitro&amp;quot; Equivalent)&lt;br /&gt;
LXD allows you to run **Virtual Machines (VMs)** instead of just containers. While containers share the host kernel (easier for a rogue root user to &amp;quot;escape&amp;quot;), LXD VMs use a separate kernel and hardware virtualization (QEMU/KVM).&lt;br /&gt;
&lt;br /&gt;
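For your most sensitive data, use an LXD VM with a **vTPM** (Virtual Trusted Platform Module) device attached:&lt;br /&gt;
```bash&lt;br /&gt;
lxc init ubuntu:24.04 my-secure-vm --vm -c limits.cpu=2&lt;br /&gt;
# Attach a virtual TPM before first boot (the device name &amp;quot;vtpm&amp;quot; is arbitrary)&lt;br /&gt;
lxc config device add my-secure-vm vtpm tpm&lt;br /&gt;
lxc start my-secure-vm&lt;br /&gt;
```&lt;br /&gt;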
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### 5. The &amp;quot;No-Sudo&amp;quot; Architecture (The API Approach)&lt;br /&gt;
The biggest risk in an on-premise setup is someone running `sudo lxc exec container bash`. &lt;br /&gt;
&lt;br /&gt;
To mitigate this:&lt;br /&gt;
1.  **Disable root SSH:** Never allow root login on the host.&lt;br /&gt;
2.  **Use external authentication (Candid or OIDC):** Force all container access through the remote API, authenticated by an identity provider that enforces multi-factor authentication (MFA), rather than direct local socket access.&lt;br /&gt;
3.  **Kernel Hardening:** Use **AppArmor** and **Seccomp** (enabled by default in LXD) to prevent containers from making sensitive system calls to the host kernel.&lt;br /&gt;
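&lt;br /&gt;
Item 1 above can be sketched as an `sshd_config` fragment on the host. The drop-in file name is an assumption, and `PasswordAuthentication no` is an extra hardening step that the list above does not require:&lt;br /&gt;
&lt;br /&gt;
```text&lt;br /&gt;
# /etc/ssh/sshd_config.d/10-hardening.conf (file name is an assumption)&lt;br /&gt;
PermitRootLogin no&lt;br /&gt;
PasswordAuthentication no&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Validate with `sudo sshd -t` and reload the SSH service before closing your current session.&lt;br /&gt;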
&lt;br /&gt;
---&lt;br /&gt;
&lt;br /&gt;
### Summary Checklist for a &amp;quot;Private Cloud&amp;quot;&lt;br /&gt;
| Feature | Local LXD Setup |&lt;br /&gt;
| :--- | :--- |&lt;br /&gt;
| **Identity** | Integrate with OIDC/Keycloak (No local passwords). |&lt;br /&gt;
| **Storage** | LUKS-encrypted ZFS or LVM pools. |&lt;br /&gt;
| **Isolation** | Use LXD Projects to silo different departments/apps. |&lt;br /&gt;
| **Runtime** | Use VMs (`--vm`) for high-security, Containers for dev. |&lt;br /&gt;
| **Auditing** | Forward LXD logs to a separate, write-only syslog server. |&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Go_install&amp;diff=5661</id>
		<title>Go install</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Go_install&amp;diff=5661"/>
		<updated>2026-02-27T15:49:51Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
https://golang.org/doc/install&lt;br /&gt;
```&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
set -eu&lt;br /&gt;
version=1.26.0&lt;br /&gt;
sudo apt remove -y golang-go || true&lt;br /&gt;
# curl -LO https://go.dev/dl/go1.20.3.linux-amd64.tar.gz&lt;br /&gt;
curl -LO https://golang.org/dl/go$version.linux-amd64.tar.gz&lt;br /&gt;
sudo rm -rf /usr/local/go &amp;amp;&amp;amp; sudo tar -C /usr/local -xzf go$version.linux-amd64.tar.gz&lt;br /&gt;
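# Add the PATH entry only once, then apply it to this shell&lt;br /&gt;
grep -qxF 'export PATH=${PATH}:/usr/local/go/bin' ~/.bashrc || echo 'export PATH=${PATH}:/usr/local/go/bin' &amp;gt;&amp;gt; ~/.bashrc&lt;br /&gt;
export PATH=${PATH}:/usr/local/go/bin&lt;br /&gt;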
```&lt;br /&gt;
&lt;br /&gt;
Snap&lt;br /&gt;
```&lt;br /&gt;
sudo snap install --classic go&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
Apt&lt;br /&gt;
```&lt;br /&gt;
sudo apt install -y golang-go&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
	<entry>
		<id>https://tech.uvoo.io/index.php?title=Winrm_python&amp;diff=5660</id>
		<title>Winrm python</title>
		<link rel="alternate" type="text/html" href="https://tech.uvoo.io/index.php?title=Winrm_python&amp;diff=5660"/>
		<updated>2025-12-11T07:25:35Z</updated>

		<summary type="html">&lt;p&gt;Busk: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;# Allow Python winrm&lt;br /&gt;
&lt;br /&gt;
## Steps&lt;br /&gt;
&lt;br /&gt;
### Create user&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
# Create the password object&lt;br /&gt;
$Password = ConvertTo-SecureString &amp;quot;myPassword&amp;quot; -AsPlainText -Force&lt;br /&gt;
&lt;br /&gt;
# Create the user account&lt;br /&gt;
New-LocalUser -Name &amp;quot;test&amp;quot; `&lt;br /&gt;
    -Password $Password `&lt;br /&gt;
    -FullName &amp;quot;Test Automation User&amp;quot; `&lt;br /&gt;
    -Description &amp;quot;User for WinRM access&amp;quot; `&lt;br /&gt;
    -PasswordNeverExpires&lt;br /&gt;
&lt;br /&gt;
Add-LocalGroupMember -Group &amp;quot;Remote Management Users&amp;quot; -Member &amp;quot;test&amp;quot;&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
### Firewall rule&lt;br /&gt;
```&lt;br /&gt;
New-NetFirewallRule -DisplayName &amp;quot;Allow WinRM from Specific IP&amp;quot; `&lt;br /&gt;
    -Direction Inbound `&lt;br /&gt;
    -LocalPort 5986 `&lt;br /&gt;
    -Protocol TCP `&lt;br /&gt;
    -Action Allow `&lt;br /&gt;
    -RemoteAddress 10.x.x.x&lt;br /&gt;
```&lt;br /&gt;
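&lt;br /&gt;
A quick way to confirm the rule from the allowed client is a plain TCP connect to port 5986. The following is a minimal sketch using only the Python standard library; `port_open` is a hypothetical helper name, and a successful connect only shows that the port is reachable, not that WinRM authentication works.&lt;br /&gt;
&lt;br /&gt;
```python
import socket


def port_open(host, port=5986, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```
&lt;br /&gt;
For example, `port_open('10.x.x.x')` with the real server address in place of the placeholder should return True from the allowed client and False from other hosts.&lt;br /&gt;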
&lt;br /&gt;
&lt;br /&gt;
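### Test connection&lt;br /&gt;
&lt;br /&gt;
Run from the client. `$c` is assumed to hold a credential (for example from `Get-Credential`) and `$o` a session option object (for example from `New-PSSessionOption -SkipCACheck -SkipCNCheck` when the server uses a self-signed certificate).&lt;br /&gt;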
&lt;br /&gt;
```&lt;br /&gt;
icm '10.x.x.x' -Cr $c -Port 5986 -UseSSL -SessionOption $o { &amp;quot;5986 OK&amp;quot; }&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Grant the user Read &amp;amp; Execute in the WinRM SDDL (required for Python winrm)&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
winrm configSDDL default&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
### Restart if needed&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
Restart-Service WinRM&lt;br /&gt;
```&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
### Python&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
import winrm&lt;br /&gt;
&lt;br /&gt;
s = winrm.Session(&lt;br /&gt;
    '10.x.x.x',                       # IP/host is fine&lt;br /&gt;
    auth=(r'test', 'myPassword'),     # Same password as set above; use .\user for a local user, DOMAIN\user for domain&lt;br /&gt;
    transport='ssl',                     # HTTPS on 5986 with Basic over TLS&lt;br /&gt;
    server_cert_validation='ignore',     # OK for self-signed / lab&lt;br /&gt;
    message_encryption='auto',           # Optional; mostly irrelevant over HTTPS&lt;br /&gt;
)&lt;br /&gt;
&lt;br /&gt;
try:&lt;br /&gt;
    r = s.run_cmd('hostname')&lt;br /&gt;
    print(&amp;quot;Status:&amp;quot;, r.status_code)&lt;br /&gt;
    print(&amp;quot;STDOUT:&amp;quot;, r.std_out.decode(errors=&amp;quot;ignore&amp;quot;).strip())&lt;br /&gt;
    print(&amp;quot;STDERR:&amp;quot;, r.std_err.decode(errors=&amp;quot;ignore&amp;quot;).strip())&lt;br /&gt;
except Exception as e:&lt;br /&gt;
    print(&amp;quot;Error:&amp;quot;, e)&lt;br /&gt;
```&lt;br /&gt;
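&lt;br /&gt;
The decoding and status handling above can be pulled into a small helper so that a non-zero exit status raises an error. This is a minimal sketch, not part of pywinrm: `decode_result` and `RemoteCommandError` are hypothetical names, and the helper assumes only that the result object has `status_code`, `std_out` and `std_err` attributes with the two streams as bytes, as in the example above.&lt;br /&gt;
&lt;br /&gt;
```python
from types import SimpleNamespace


class RemoteCommandError(RuntimeError):
    """Raised when a remote command exits with a non-zero status."""


def decode_result(result, encoding='utf-8'):
    """Return (stdout, stderr) as stripped text; raise on non-zero exit status."""
    out = result.std_out.decode(encoding, errors='replace').strip()
    err = result.std_err.decode(encoding, errors='replace').strip()
    if result.status_code != 0:
        # Prefer stderr in the message, fall back to stdout if stderr is empty.
        raise RemoteCommandError(
            'exit status {}: {}'.format(result.status_code, err or out)
        )
    return out, err


if __name__ == '__main__':
    # Stand-in for the object returned by s.run_cmd('hostname').
    fake = SimpleNamespace(status_code=0, std_out=b'WIN-HOST\r\n', std_err=b'')
    print(decode_result(fake))
```
&lt;br /&gt;
With the session from the example above, `out, err = decode_result(s.run_cmd('hostname'))` would replace the three print calls.&lt;br /&gt;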
&lt;br /&gt;
&lt;br /&gt;
## Notes on winrm configSDDL&lt;br /&gt;
&lt;br /&gt;
```&lt;br /&gt;
Opens permissions dialog: Running winrm configSDDL default opens the standard Windows security permissions dialog for the default (root) WinRM security descriptor.&lt;br /&gt;
Grants non-admin access: Add the non-admin users or groups (e.g., DOMAIN\User) and check &amp;quot;Allow&amp;quot; for Read and Execute so they can use WinRM remotely.&lt;br /&gt;
Manages RootSDDL: The command sets the RootSDDL value, which defines who can access the WinRM service remotely.&lt;br /&gt;
```&lt;/div&gt;</summary>
		<author><name>Busk</name></author>
	</entry>
</feed>