Experimenting with Llama 3 via Ollama
Source: https://mtlynch.io/notes/ollama-llama3/
April 25, 2024
3-minute read
I saw that Meta released the Llama 3 AI model, and people seem excited about it, so I decided to give it a try.
I don’t have much experience running open-source AI models, and I didn’t see a lot of documentation about how to run them. I tinkered with it for a few hours and got Llama 3 working with Ollama, so I wanted to share my instructions.
Provisioning a cloud server with a GPU 🔗︎
To run this experiment, I provisioned the following server on Scaleway:
- Server instance type: GPU-3070-S
- OS: Ubuntu Focal
- Disk size: 100 GB (needed because the model is large)
To SSH in, I ran the following command with port forwarding, because I'll need access to the web interface that will run on the server's localhost interface.
TARGET_IP='51.159.184.186' # Change to your server's IP.
REMOTE_PORT='8080'
LOCAL_PORT='8080'
# SSH in and port-forward a port to access the Open-WebUI web interface.
ssh "${TARGET_IP}" -L "${LOCAL_PORT}:localhost:${REMOTE_PORT}"
Install CUDA 🔗︎
First, install CUDA to enable Ollama to use the GPU:
sudo apt-get install -y linux-headers-$(uname -r) && \
sudo apt-key del 7fa2af80 && \
wget -qO- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub | sudo gpg --dearmor -o /usr/share/keyrings/cudatools.gpg && \
echo "deb [signed-by=/usr/share/keyrings/cudatools.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" | sudo tee /etc/apt/sources.list.d/cuda-ubuntu2204-x86_64.list && \
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
sudo apt-get update && \
sudo apt-get install -y cuda-toolkit nvidia-container-toolkit ca-certificates curl
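Before moving on, it's worth a quick sanity check that the driver and toolkit actually landed. A minimal check (a reboot may be needed first for the driver module to load, and `nvcc` lives under `/usr/local/cuda/bin`, which may not be on your PATH by default):

```shell
# Confirm the NVIDIA driver can see the GPU (may require a reboot first).
nvidia-smi

# Confirm the CUDA compiler is installed; add /usr/local/cuda/bin to PATH if not found.
nvcc --version
```

If `nvidia-smi` lists the RTX 3070, Ollama will be able to use it.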
Install Docker 🔗︎
Next, install Docker so that you can run Ollama and its Open-WebUI web interface in containers:
sudo install -m 0755 -d /etc/apt/keyrings && \
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc && \
sudo chmod a+r /etc/apt/keyrings/docker.asc && \
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null && \
sudo apt-get update && \
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin && \
sudo usermod -aG docker "${USER}" && \
newgrp docker
To test that everything is working, run the following command:
docker run hello-world
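Beyond hello-world, you can also check that containers can reach the GPU through the NVIDIA Container Toolkit. A sketch under assumptions: the CUDA image tag below is just an example, and the `nvidia-ctk` step may already be unnecessary if Docker was configured during the toolkit install:

```shell
# Register the NVIDIA runtime with Docker (skip if already configured).
sudo nvidia-ctk runtime configure --runtime=docker && \
sudo systemctl restart docker

# Run nvidia-smi inside a container to verify GPU passthrough.
# The image tag is an example; any recent CUDA base image works.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If the container prints the same GPU table as the host, GPU passthrough is working.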
Start Ollama and Open-WebUI 🔗︎
I adapted the standard Open-WebUI Docker Compose file to make one for Ollama, which you can download and run with the following command:
wget https://mtlynch.io/notes/ollama-llama3/docker-compose.yml && \
docker compose up
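If you'd rather not run a downloaded file blindly, the Compose file boils down to two services: the Ollama backend with the GPU reserved, and Open-WebUI pointed at it. A rough sketch of the shape (image names, ports, and volume layout follow the upstream Open-WebUI examples; check the actual downloaded file for the authoritative version):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            # Reserve the NVIDIA GPU for the Ollama container.
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      # Match the port forwarded over SSH earlier.
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
volumes:
  ollama:
```

The GPU reservation under `deploy.resources` is what lets the Ollama container see the card; without it, inference falls back to CPU.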
Once the server is up and running, visit http://localhost:8080 (the port you forwarded over SSH) in your browser:
You’ll first see a page prompting for a login. Click “Sign up.”
Then enter any details. You don’t really need a valid email, as far as I can tell.
From here, you need to download a model to use. Click the settings button:
I don’t know the differences between the models, but Llama 3 is the newest one that just came out a few days ago, so I decided to try that. It says on ollama.com that llama3:70b is optimized for chatbot use cases, so I initially went with that one, but it was incredibly slow. I switched to llama3, and that performed decently:
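The slowness of the 70B model makes sense from back-of-the-envelope memory math: even quantized to 4 bits per weight, 70 billion parameters need roughly 35 GB, far more than the 8 GB of VRAM on an RTX 3070, so most of the model has to run on the CPU. The 8B llama3 fits comfortably. A rough estimate (the 4-bit quantization width is an assumption based on Ollama's typical defaults, and this ignores KV-cache and runtime overhead):

```python
def approx_model_gb(n_params_billion: float, bits_per_weight: int = 4) -> float:
    """Rough memory footprint of a model's weights alone, in GB."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(approx_model_gb(70))  # 35.0 GB -- far exceeds an RTX 3070's 8 GB VRAM
print(approx_model_gb(8))   # 4.0 GB -- fits in VRAM with room for the KV cache
```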
It’s going to sit at 100% for a while, but it’s not done until you see a popup announcing the model is fully downloaded.
Once that’s downloaded, close the settings dialog and select llama3:latest from the dropdown:
From there, you can start playing with Llama 3. Here’s me having a conversation with Llama 3 as it pretends to be Nathan Fielder:
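Once a model is pulled, you can also skip the web UI and talk to Ollama's HTTP API directly. A sketch, assuming the Compose file publishes Ollama's default port 11434 on localhost (if it only exposes the port to Open-WebUI, you'd need to add a port mapping first):

```shell
# Ask llama3 a question via Ollama's generate endpoint.
# Assumes port 11434 is reachable on localhost; adjust if it isn't.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

With `"stream": false`, the response comes back as a single JSON object instead of a stream of tokens, which is easier to pipe into other tools.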