I had a SAN go into emergency mode the other day, which caused running VMs in vSphere to either pause or mark their disks as read-only. The most problematic of these were the actual vSphere guest VM and the Platform Services Controller VM. Both marked /dev/sda3 as read-only, and on a forced reboot they entered Recovery Mode.
Luckily for me, running fsck successfully recovered the disk. I used the details from here, which worked perfectly. For brevity, all it took was:
e2fsck -y /dev/sda3
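If you want to confirm which filesystems a guest has flipped to read-only before reaching for fsck, something like the below can help. It is only a minimal sketch: the function name is my own, and it assumes a Linux guest where the mount table is available in the usual /proc/mounts format (device, mount point, type, options).

```python
def read_only_mounts(mounts_text, device_prefix="/dev/"):
    """Return (device, mountpoint) pairs that carry the 'ro' mount option.

    mounts_text is the content of /proc/mounts (an assumption: one mount
    per line, whitespace separated, options in the fourth column).
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, _fstype, options = fields[:4]
        # Only look at real block devices, and match 'ro' as a whole option
        if device.startswith(device_prefix) and "ro" in options.split(","):
            found.append((device, mountpoint))
    return found
```

On the affected VM you would feed it the live table, for example read_only_mounts(open("/proc/mounts").read()), and expect /dev/sda3 to show up in the result.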
I have been playing extensively with LLMs, especially self-hosting them to experiment with different models, prompts and their sentiments.
Using Ollama has been one of the quickest ways to get running with local models, and it also offers some nice features like API support.
I have a multi-GPU machine running 5x GeForce GTX 1060s which, whilst older, still perform well for a lab environment. Unfortunately the CPU in the system is a cheap Celeron which doesn't support AVX or AVX2, and Ollama requires AVX before it will run GPU inference. It took me ages to work out why GPU inference wasn't supported even though CUDA showed the 5x 1060s. Some of the output I was seeing was:
time=2024-09-26T03:41:50.015Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm_v60102]"
time=2024-09-26T03:41:50.020Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-26T03:41:50.035Z level=WARN source=gpu.go:224 msg="CPU does not have minimum vector extensions, GPU inference disabled" required=avx detected="no vector extensions"
time=2024-09-26T03:41:50.035Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant="no vector extensions" compute="" driver=0.0 name="" total="15.6 GiB" available="14.9 GiB"
You can see that GPU inference gets disabled because the CPU doesn't meet the AVX requirement. This output differs from what I've seen on some bug trackers; I'm not sure if that's down to a version change or my CPU reporting differently, but the underlying issue is the same.
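A quick way to check for this before digging through Ollama logs is to look at the CPU flags yourself. The below is a minimal sketch, assuming a Linux host where /proc/cpuinfo has a "flags" line; the function name is hypothetical and this is not how Ollama itself performs the check.

```python
def cpu_vector_extensions(cpuinfo_text):
    """Return which of 'avx' and 'avx2' appear in the first 'flags' line.

    cpuinfo_text is the content of /proc/cpuinfo (assumed format:
    'flags : fpu vme ...', one such line per logical CPU).
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            return {name for name in ("avx", "avx2") if name in flags}
    return set()  # no flags line found at all
```

Calling cpu_vector_extensions(open("/proc/cpuinfo").read()) on a CPU like my Celeron should return an empty set, which lines up with the "no vector extensions" warning in the log above.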
Luckily there is an ongoing GitHub issue, https://github.com/ollama/ollama/issues/2187, which is tracking the need for AVX/AVX2 even when running on GPU. That thread has a few workarounds depending on the version of Ollama you're using. On v0.3.12 I modified:
gpu/cpu_common.go: I added a new line below line 15: return CPUCapabilityAVX
llm/generate/gen_linux.sh: I commented out line 54 and added a new line below it: COMMON_CMAKE_DEFS="-DBUILD_SHARED_LIBS=off -DCMAKE_POSITION_INDEPENDENT_CODE=on -DGGML_NATIVE=off -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_OPENMP=off"
Once you build from source, this bypasses the AVX/AVX2 checks and you can run on GPU. When you run Ollama now you can see:
time=2024-09-26T05:49:07.785Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu_avx2 cuda_v12 cpu cpu_avx]"
time=2024-09-26T05:49:07.786Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-26T05:49:08.703Z level=INFO source=types.go:107 msg="inference compute" id=GPU-e2f3f39f-9a70-1d92-f7da-25ab5291fda8 library=cuda variant=v12 compute=6.1 driver=12.6 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="5.9 GiB"
time=2024-09-26T05:49:08.703Z level=INFO source=types.go:107 msg="inference compute" id=GPU-39bccb3b-5d42-81d2-5f84-ede96f34c3e6 library=cuda variant=v12 compute=6.1 driver=12.6 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="5.9 GiB"
time=2024-09-26T05:49:08.703Z level=INFO source=types.go:107 msg="inference compute" id=GPU-58a48fa9-8b0f-5691-8a71-51e761b4fddc library=cuda variant=v12 compute=6.1 driver=12.6 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="5.9 GiB"
time=2024-09-26T05:49:08.703Z level=INFO source=types.go:107 msg="inference compute" id=GPU-3176d686-d810-04ca-fbda-8f0340bb8faf library=cuda variant=v12 compute=6.1 driver=12.6 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="5.9 GiB"
time=2024-09-26T05:49:08.703Z level=INFO source=types.go:107 msg="inference compute" id=GPU-d09f4bbc-cf74-dc70-8c22-4898d8267937 library=cuda variant=v12 compute=6.1 driver=12.6 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="5.9 GiB"
time=2024-09-26T05:50:03.095Z level=INFO source=sched.go:730 msg="new model will fit in available VRAM, loading" model=/home/x/.ollama/models/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 library=cuda parallel=4 required="16.7 GiB"
time=2024-09-26T05:50:03.095Z level=INFO source=server.go:103 msg="system memory" total="15.6 GiB" free="14.9 GiB" free_swap="4.0 GiB"
Depending on the size of the model, you can see Ollama load it into the GPUs, and when running inference it seems to stripe the query across the cards, although I suspect this is just a side effect of how the model is spread across each card's memory rather than the workload actually being striped.
You can check the usage, card status etc. with nvidia-smi:
x@x:~$ nvidia-smi
Thu Sep 26 05:50:53 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1060 6GB Off | 00000000:02:00.0 Off | N/A |
| 0% 44C P5 11W / 180W | 2695MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce GTX 1060 6GB Off | 00000000:03:00.0 Off | N/A |
| 0% 44C P2 29W / 180W | 2097MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA GeForce GTX 1060 6GB Off | 00000000:04:00.0 Off | N/A |
| 0% 41C P2 28W / 180W | 2097MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA GeForce GTX 1060 6GB Off | 00000000:05:00.0 Off | N/A |
| 0% 40C P2 33W / 180W | 2097MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA GeForce GTX 1060 6GB Off | 00000000:06:00.0 Off | N/A |
| 0% 35C P8 5W / 180W | 1927MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1445 C ...unners/cuda_v12/ollama_llama_server 2688MiB |
| 1 N/A N/A 1445 C ...unners/cuda_v12/ollama_llama_server 2090MiB |
| 2 N/A N/A 1445 C ...unners/cuda_v12/ollama_llama_server 2090MiB |
| 3 N/A N/A 1445 C ...unners/cuda_v12/ollama_llama_server 2090MiB |
| 4 N/A N/A 1445 C ...unners/cuda_v12/ollama_llama_server 1920MiB |
+-----------------------------------------------------------------------------------------+
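If you'd rather track the per-card memory in a script than eyeball the table, nvidia-smi can also emit CSV via nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv,noheader,nounits. The below is a minimal sketch that parses that output; the function name and the dictionary keys are my own choices.

```python
import csv
import io


def parse_gpu_memory(csv_text):
    """Parse 'index, name, memory.used, memory.total' rows (MiB, no units)."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != 4:
            continue  # ignore blank or unexpected lines
        index, name, used, total = (field.strip() for field in record)
        rows.append(
            {
                "index": int(index),
                "name": name,
                "used_mib": int(used),
                "total_mib": int(total),
            }
        )
    return rows
```

To use it against live data, capture the command output with subprocess.run(..., capture_output=True, text=True) and pass stdout to parse_gpu_memory. Summing used_mib across the five cards gives a rough idea of how much VRAM the loaded model is occupying.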
If you don't want to pay for a Proxmox subscription you can still get updates through the no-subscription channel.
cd /etc/apt/sources.list.d
cp pve-enterprise.list pve-no-subscription.list
nano pve-no-subscription.list
Edit pve-no-subscription.list so that it contains the line below:
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
Run updates, but only use dist-upgrade and not a regular upgrade, as the latter may break dependencies.
apt-get update
apt-get dist-upgrade
I have a project where I routinely build and rebuild containers between two repos, in which one of the Docker build steps pulls the latest compiled code from the other repo. When doing this, the Docker cache gets in the way as it caches the published code.
For example:
- Project 1 publishes compiled code to blob storage
- Project 2 pulls the compiled code and publishes a built container
Project 2's Dockerfile will look something like:
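FROM ubuntu:22.04
# The stock ubuntu image ships without wget or unzip
RUN apt-get update && apt-get install -y wget unzip
# Download into /var/www so the unzip step below can find the archive
RUN mkdir -p /var/www && wget -O /var/www/version-1.zip https://blob.core.windows.net/version-1.zip
RUN unzip /var/www/version-1.zip -d /var/www/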
The issue is that if I update the contents of version-1.zip, Docker has no way of knowing: the RUN instruction itself hasn't changed, so the build reuses the cached layer and the image ends up out of date.
I came across a great solution on Stack Overflow: https://stackoverflow.com/questions/35134713/disable-cache-for-specific-run-commands
This solution doesn't work completely for me, as I am using docker-compose up commands, not docker-compose build. However, after a little trial and error, I have the below workflow working:
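FROM ubuntu:22.04
RUN apt-get update && apt-get install -y wget unzip
# Passing a new CACHEBUST value forces the RUN steps below this line to be rebuilt
ARG CACHEBUST=1
RUN mkdir -p /var/www && wget -O /var/www/version-1.zip https://blob.core.windows.net/version-1.zip
RUN unzip /var/www/version-1.zip -d /var/www/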
Run a build:
docker compose -f "docker-compose.yml" build --build-arg CACHEBUST=someuniquekey
Run an up:
docker compose -f "docker-compose.yml" up -d --build
This way the initial build is cache-busted using whatever unique key you want, and the subsequent up uses the freshly built layers from the cache. NOTE: you can omit the final --build if you don't want up to trigger another (cached) build. Now I can selectively bust the cache at a particular step which, in a long Dockerfile, can save heaps of time. I guess you could even put multiple args at strategic places along your Dockerfile and trigger a bust wherever it makes most sense.
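If you don't want to think up a unique key each time, the two commands can be wrapped in a small script that uses the current timestamp. This is only a sketch under my own assumptions: the function names are hypothetical, the compose file is called docker-compose.yml, and the Dockerfile declares ARG CACHEBUST as described above.

```python
import time


def compose_build_command(compose_file="docker-compose.yml", cachebust=None):
    """Build the 'docker compose build' argv with a unique CACHEBUST value."""
    # Default to the current Unix time so every invocation gets a fresh key
    key = str(cachebust if cachebust is not None else int(time.time()))
    return [
        "docker", "compose", "-f", compose_file,
        "build", "--build-arg", f"CACHEBUST={key}",
    ]


def compose_up_command(compose_file="docker-compose.yml"):
    """Build the 'docker compose up' argv that reuses the fresh layers."""
    return ["docker", "compose", "-f", compose_file, "up", "-d"]
```

Each list can be handed straight to subprocess.run(cmd, check=True), build first and then up. Returning the argv as a list rather than a string avoids any shell quoting issues with the compose file path.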
du -shc * | sort -rh
WSL is fantastic for allowing devs and engineers to mix and match environments and toolsets. It's saved me many times from having to maintain VMs specifically for different environments and versions of software. Microsoft are doing a pretty good job these days of updating it with new features and bug fixes; however, running WSL and Docker as a permanent part of your workflow is not without its flaws.
This post will be added to as I remember optimizations that I have used in the past, however, all of them are specific to running Linux images and containers, not Windows.
Keep the WSL Kernel up to date
Make sure to keep the WSL Kernel up to date to take advantage of all the fixes Microsoft push.
wsl --update
Preventing Docker memory hog
I routinely work from a system with 16GB of RAM, and running a few Docker images would chew through all available memory via the WSL vmmem process, which would in turn lock up my machine. The best workaround I could find was to set an upper limit on WSL memory consumption. You can do this by editing the .wslconfig file in your Windows user profile directory (%UserProfile%).
[wsl2]
memory=3GB # Limits VM memory in WSL 2 up to 3GB
processors=2 # Makes the WSL 2 VM use two virtual processors
You will need to restart WSL for this to take effect.
wsl --shutdown
NOTE: fractional memory sizes don't seem to work. I tried 3.5GB and it just let WSL use the maximum RAM on the system.
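Since a bad value fails silently, it can be worth sanity checking the file before restarting WSL. The below is a rough sketch with hypothetical function names. It assumes the memory value has to be a whole number followed by GB or MB, which is my reading of the behaviour above rather than something I have confirmed, so expressing 3.5GB as 3584MB is an untested suggestion.

```python
import configparser
import re


def wsl_memory_setting(config_text):
    """Return the raw 'memory' value from the [wsl2] section, or None."""
    # Allow trailing '# ...' comments like the ones in the example file
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)
    return parser.get("wsl2", "memory", fallback=None)


def is_whole_number_size(value):
    """True for values like '3GB' or '3584MB', False for '3.5GB'."""
    return re.fullmatch(r"\d+(GB|MB)", value.strip(), flags=re.IGNORECASE) is not None
```

Reading the real file would look like wsl_memory_setting(open(path).read()) with path pointing at your .wslconfig, followed by is_whole_number_size on the result.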
Slow filesystem access rates
When performing any sort of intensive file operations on files hosted in Windows but accessed through WSL, you'll notice it's incredibly slow. There are a lot of open cases about this on the WSL GitHub repo, but the underlying issue is how the filesystem is "mounted" between Windows and WSL.
This bug is incredibly frustrating when working with containers that host nginx or Apache, as page load times run to multiple seconds irrespective of local or server caching. The best way around this issue is to not serve the files from Windows at all, but from inside the WSL distro. This used to be incredibly finicky to achieve but is easy now given how well tooling integrates with WSL.
For example, say you have a single container that serves web content through Apache, and your development workflow means you have to modify the web content and see changes in real time (e.g. React, Vue, webpack etc.). Instead of building the Docker container with files sourced from a Windows directory, move the files to the WSL Linux filesystem (clone your repo in Linux if you're working on committed files), then issue your build from the Linux command line. Through the WSL2/Docker integration, the Docker socket will let you build inside Linux using the Linux filesystem but run the container natively on your Windows host.
To edit the files your container is serving, you can run VS Code from your Linux command line which, through the Code/WSL integration, will let you edit your Linux filesystem.
Mounting into Linux FS
Keep in mind that if you do need to reach into your Linux filesystem from Windows for whatever reason, you can do it via a private share that is automatically exposed.
\\wsl$
If you have multiple distros installed they will be in their own directory under that root.
Tuning the Docker vhdx
Optimize-VHD -Path $Env:LOCALAPPDATA\Docker\wsl\data\ext4.vhdx -Mode Full
This command didn't do much for me. It took about 10 minutes to run and only reduced my vhdx from 71.4GB to 70.2GB.
Error not binding port
I've had this recurring error every so often when restarting Windows and running Docker with WSL2: Docker complains it can't bind to a port that I need (like MySQL). Hunting down the cause of this is interesting:
https://github.com/docker/for-win/issues/3171
https://github.com/microsoft/WSL/issues/5306
The quick fix to this is:
net stop winnat
net start winnat
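To tell whether the port really is unavailable before (and after) restarting winnat, you can try binding it yourself. This is a minimal sketch with a hypothetical function name, run on the Windows host. It only tests a plain TCP bind on localhost, so it is a rough stand-in for what Docker does rather than an exact reproduction.

```python
import socket


def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers both "address in use" and permission style failures
            return False
    return True
```

For the MySQL case, can_bind(3306) returning False with nothing obviously listening on that port is a good hint you have hit this issue.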
I’m a pretty big fan of TeamViewer.
There are heaps of remote desktop apps like GoTo Assist and the like that are able to punch through NAT by creating a reverse tunnel, but each to their own.
The latest Ubuntu install (8.0.17864) creates a daemon to bring your machine online. Maybe I’m just being a frittata, but whenever the daemon was active the machine would show as online and I could never connect to it. Even when the machine was added as a partner it would show online, but it would always sit at “Connecting” when I tried to remote into it.
The only way I could get into the remote machine was to open the GUI on the remote machine. Once the GUI was open the machine’s “online” status never changed, but I could remote in.
Given that it’s a remote machine, you won’t be sitting in front of it to open the GUI before you remote in. So the below startup script will launch the GUI upon login so you can successfully remote in. It is exactly the same script that is run when you click on the GUI icon for TV.
/opt/teamviewer8/tv_bin/script/teamviewer
HTH
I’ve had an annoying problem with my Linux VirtualBox host + guest combo for some time and have only just got around to solving it, so hopefully this can help others in the same situation.
My host runs Ubuntu 11.04 Desktop, but I run it headless. Unfortunately when it’s run headless and you VNC/TeamViewer/Weaponofchoice into it, you get an 800×600 resolution. The latest version of VirtualBox + guest additions for Windows guests lets you define resolutions up to 6400×1200 without having to resize the guest window from the host GUI of VirtualBox.
Unfortunately my Ubuntu 12.04 guest wasn’t so lucky, and it defaulted to 800×600 even with the guest additions. The resizing of the guest window from the host worked, but in my case my host was at 800×600 and resizing was a massive pain in the ass.
I spent many hours scouring for how to manually resize a guest and came across many answers, none of which worked for me. I’ll throw what didn’t work below just in case anyone tries the same thing.
x90@ban-roy-x90-vm:~$ cvt 1280 1024
# 1280x1024 59.89 Hz (CVT 1.31M4) hsync: 63.67 kHz; pclk: 109.00 MHz
Modeline "1280x1024_60.00" 109.00 1280 1368 1496 1712 1024 1027 1034 1063 -hsync +vsync
x90@ban-roy-x90-vm:~$ xrandr --newmode "1280x1024_60.00" 109.00 1280 1368 1496 1712 1024 1027 1034 1063 -hsync +vsync
x90@ban-roy-x90-vm:~$ xrandr --addmode VBOX0 1280x1024_60.00
x90@ban-roy-x90-vm:~$ xrandr --output VBOX0 --mode 1280x1024_60.00
vboxmanage setextradata global GUI/MaxGuestResolution 1280,1024
vboxmanage setextradata "VM name" "CustomVideoMode1" "1280x1024x16"
None of these worked. In the end I had to create a custom xorg.conf file that manually specified the resolution. As newer versions of Ubuntu did away with a default xorg.conf file I created:
/usr/share/X11/xorg.conf.d/20-monitor.conf
Which contained the below:
Section "Device"
    Identifier "Configured Video Device"
    Driver "vboxvideo"
EndSection

Section "Monitor"
    Identifier "Configured Monitor"
    Option "DPMS"
EndSection

Section "Screen"
    Identifier "Default Screen"
    Monitor "Configured Monitor"
    Device "Configured Video Device"
    DefaultDepth 24
    SubSection "Display"
        Depth 24
        Modes "1280x1024"
    EndSubSection
EndSection
I guess I could have restarted gdm but after a reboot everything was finally working as expected without ever having to resize the guest window!
A while ago I was working on a project to decommission the old TACACS server and we chose to replace it with Radius for Cisco router authentication.
After trying a few different radius packages (on Linux) one of our engineers said that he had luck in the past with Radiator – a closed source radius package for Linux. The Radiator software http://open.com.au/radiator/index.html is probably under-utilised for basic authentication, but has been rock solid in our production environment for 6 months+.
What we now have is a Radius server that accepts authentication requests from our Cisco devices, checks whether the username or Calling-Station-Id is in a blacklist, authenticates the user against LDAP on our Domain Controller and then checks the user’s group membership before allowing them to authenticate. All failed and accepted attempts are also logged.
Whilst the documentation is huge and detailed (376 pages), I couldn’t find any specific examples on the net that tie together everything we wanted. So below is a sample configuration for what we are running, as detailed above. Essentially we make a Radius user on the domain who can read LDAP (because we don’t allow anon LDAP queries, right?). We also make a RadiusSG security group which will contain the users that we want to allow to log in to our devices (because we don’t want to allow a terminal login for all our other AD users).
Note, I have also included a clients-group1.cfg file to specify each NAS into nice groups. I use this option to create multiple includes to split devices by region/country.
file: /etc/radiator/radius.cfg
#Foreground
LogStdout
LogDir /var/log/radius
DbDir /etc/radiator
# Use a low trace level in production systems. Increase
# it to 4 or 5 for debugging, or use the -trace flag to radiusd
Trace 3
# You will probably want to add other Clients to suit your site,
# one for each NAS you want to work with
# INCLUDE OUR REGION SETTINGS
include %D/clients-group1.cfg
<Realm DEFAULT>
    # LOG ALL FAILED REQUESTS TO /var/log/radius/<YEAR>-<MONTH>-attempts-failed.log
    <AuthLog FILE>
        Filename %L/%Y-%m-attempts-failed.log
        LogFailure 1
        LogSuccess 0
        FailureFormat %d/%m/%Y %H:%M:%S FAIL Username: %U Password: %P from %{Calling-Station-Id} on %{NAS-IP-Address}
    </AuthLog>
    # LOG ALL ACCEPTED REQUESTS TO /var/log/radius/<YEAR>-<MONTH>-attempts-ok.log
    <AuthLog FILE>
        Filename %L/%Y-%m-attempts-ok.log
        LogSuccess 1
        LogFailure 0
        SuccessFormat %d/%m/%Y %H:%M:%S OK Username: %U Password: <hidden> from %{Calling-Station-Id} on %{NAS-IP-Address}
    </AuthLog>
    # CHECK BAD USERNAMES THEN BAD IP’S THEN LDAP FOR AUTHENTICATION
    <AuthBy GROUP>
        # FLOW THROUGH OUR BLACKLIST MODULES
        AuthByPolicy ContinueUntilReject
        #CHECK FOR BAD USERNAMES
        <AuthBy FILE>
            Blacklist
            Filename %D/reject-usernames
        </AuthBy>
        #CHECK FOR BAD IP’S
        <AuthBy FILE>
            Blacklist
            AuthenticateAttribute Calling-Station-Id
            Filename %D/reject-ip
        </AuthBy>
        #CHECK AGAINST OUR AD VIA LDAP
        <AuthBy LDAP2>
            # SPECIFY THE DOMAIN CONTROLLER ADDRESS AND LDAP PARAMS
            Host <INTERNALIPOFDOMAINCONTROLLER>
            SSLVerify none
            UseTLS
            Port 3268
            # OUR DC WONT ALLOW ANON READING SO WE HAVE TO AUTH AS A VALID USER
            AuthDN cn=Radius, OU=Service Accounts, DC=<DOMAINHERE>, DC=prd
            AuthPassword <RadiusUSERPASSWORDHERE>
            # USE THE CACHE FOR MULTIPLE ATTEMPTS WHICH SAVES LDAP QUERIES
            CachePasswords
            # START SEARCHING LDAP FROM THIS DN FORWARDS
            BaseDN DC=<DOMAINHERE>, DC=prd
            UsernameAttr sAMAccountName
            ServerChecksPassword
            # REQUIRE GROUP MEMBERSHIP
            SearchFilter (&(%0=%1)(memberOf=CN=RadiusGroup SG, OU=Security Groups, DC=<DOMAINHERE>, DC=prd))
        </AuthBy>
    </AuthBy GROUP>
</Realm>
I have also created some scripts to poll for top IP offenders (bruteforce attempts etc) so I will most likely post these details soon.
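In the meantime, here is a rough idea of how the failed-attempts log could be tallied by source address. To be clear, this is not the script I run in production, just a minimal sketch with hypothetical names that assumes log lines follow the FailureFormat from the config above.

```python
import re
from collections import Counter

# Matches the tail of a FailureFormat line. The password field may contain
# spaces, so it is matched greedily up to the last ' from ... on ...' pair.
FAIL_LINE = re.compile(
    r"FAIL Username: (?P<user>.*?) Password: .* "
    r"from (?P<caller>\S+) on (?P<nas>\S+)\s*$"
)


def top_offenders(lines, limit=10):
    """Count failed attempts per Calling-Station-Id, most frequent first."""
    counts = Counter()
    for line in lines:
        match = FAIL_LINE.search(line)
        if match:
            counts[match.group("caller")] += 1
    return counts.most_common(limit)
```

You would point it at one of the monthly files under /var/log/radius, for example top_offenders(open(path)) where path is the attempts-failed log for the month. Lines with an empty Calling-Station-Id are skipped by this pattern, which may or may not be what you want.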
Two weeks ago I was fortunate enough to attend Cisco Live (previously Networkers).
Part of my goal there was to get clued up on IPv6 transition methods, addressing and all related matters. One of the breakout sessions I attended was on IPv6 security threats and mitigation. All in all very informative, but the major advice for networks not currently running IPv6 was to monitor your IPv6 flows to see what applications and operating systems were doing. Technologies like ISATAP are bound to break security boundaries by tunneling via IPv4 and this is something you should be aware of on your network.
Today I started this quest just by running a regular Wireshark session filtering on IPv6. Without a tap or a port span I could only observe multicast traffic, but I picked up on the below packets.
My immediate thought was that a user’s PC was infected with a virus that was acting as part of a botnet and that this PC was using IPv6 to perform its DNS lookups. I went searching for 10 character IPv6 DNS lookups. Luckily what I found meant it wasn’t part of a botnet, but I definitely wasn’t expecting what I found. This case has been documented before, so this is definitely nothing new and the fact that this happens in both IPv4 and IPv6 isn’t a surprise. Here are the references I found:
http://code.google.com/p/chromium/issues/detail?id=47262
http://groups.google.com/a/chromium.org/group/chromium-discuss/browse_thread/thread/17bd3e93f3c68448?pli=1
https://isc.sans.edu/diary.html?storyid=10312
http://groups.google.com/a/googleproductforums.com/forum/#!category-topic/chrome/report-a-problem-and-get-troubleshooting-help/dQ92XhrDjfk
As the reports suggest it’s a feature of Chrome to perform fake DNS lookups to determine if your ISP is performing DNS hijacking. In my case our DNS suffix provided by our DHCP server did not get appended, nor was the request a truncation of a proper URL nor was it over IPv4 – but it is most definitely the cause of the events I saw on the network.
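If you want to filter these lookups out of a capture so they don't distract from genuinely suspicious traffic, a simple heuristic does the job. The below is only a sketch: the function name is hypothetical, and it assumes the probes are single labels of exactly 10 lowercase letters, which is what I observed rather than a documented guarantee.

```python
import re

# Assumption: a probe is one label of exactly ten lowercase ASCII letters
PROBE_NAME = re.compile(r"[a-z]{10}")


def looks_like_chrome_probe(query_name, dns_suffix=""):
    """Heuristically flag a DNS query name as a random Chrome style probe."""
    name = query_name.rstrip(".").lower()
    suffix = dns_suffix.strip(".").lower()
    # Strip the local DNS suffix if the resolver appended one
    if suffix and name.endswith("." + suffix):
        name = name[: -(len(suffix) + 1)]
    return PROBE_NAME.fullmatch(name) is not None
```

Bear in mind a legitimate host that happens to have a ten letter name would also match, so treat a hit as a hint rather than proof.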
As the quest for IPv6 and related security problems goes on I’m sure to throw more stuff up here.