Title: AIBOX-K3 LLM large model usage
Author: 799959745 Time: 3 days ago
Last edited by 799959745 on 5/26/2026 15:34
Step Two:
2.1 Connect the machine to an HDMI monitor and open a terminal. Install the required inference tool, llama.cpp-tools-spacemit:
sudo apt-get update
# Install the accelerated version of llama.cpp inference tool from Spacemit
sudo apt install llama.cpp-tools-spacemit
2.2. Downloading the Model
When using llama.cpp, a GGUF format model is required. It is recommended to download the model to the default path ~/.cache/models/llm for easy management. For quick verification, you can download the model from the Spacemit server.
Model source:
Spacemit mirror (recommended): https://archive.spacemit.com/spacemit-ai/model_zoo/llm/
The mirror hosts multiple ready-to-use GGUF models (such as Qwen2.5, Qwen3, and DeepSeek) that can be downloaded directly into the default directory.
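The original post does not show the download command, so the following is only a minimal sketch of one way to fetch a file into the default directory. It assumes wget is installed and that the GGUF files can be fetched by appending a file name to the mirror URL. The function names model_url and fetch_model are made up for illustration, and no real model file name is given here: take the exact name from the mirror's index page.

```shell
# Sketch only: the helper names below are illustrative, not part of any SpacemiT tool.
# Assumption: files are reachable as <mirror URL>/<file name>; check the mirror index.
MODEL_BASE_URL="https://archive.spacemit.com/spacemit-ai/model_zoo/llm"
MODEL_DIR="${MODEL_DIR:-$HOME/.cache/models/llm}"

# Build the download URL for a file name copied from the mirror index.
model_url() {
    printf '%s/%s\n' "$MODEL_BASE_URL" "$1"
}

# Download one GGUF file into MODEL_DIR.
# -c resumes a partial download, -P sets the target directory.
fetch_model() {
    mkdir -p "$MODEL_DIR" || return 1
    wget -c -P "$MODEL_DIR" "$(model_url "$1")"
}
```

After pasting the functions into the terminal, call fetch_model with the file name shown on the mirror, for example: fetch_model "<file-name-from-mirror>.gguf". Resuming with -c is useful because GGUF files are large and downloads may be interrupted.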
Note:
Running the 30B large model requires 32GB of RAM. If you run out of memory, try stopping the desktop display to reduce memory usage, and connect through a serial terminal to execute commands.
(The command that stops the desktop is generally not needed on a large-memory configuration.)
sudo systemctl stop sddm.service
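To see whether stopping the desktop actually frees a useful amount of memory, you can compare the available memory before and after. The sketch below reads /proc/meminfo, which is a standard Linux interface; the helper name meminfo_mib is made up for this example.

```shell
# Sketch only: meminfo_mib is an illustrative helper, not a system command.
# Print one /proc/meminfo field (reported in kB) converted to whole MiB.
# $1 = field name without the colon, $2 = optional file to read instead.
meminfo_mib() {
    awk -v key="$1:" '$1 == key { printf "%d\n", $2 / 1024 }' "${2:-/proc/meminfo}"
}

echo "Total memory:     $(meminfo_mib MemTotal) MiB"
echo "Available memory: $(meminfo_mib MemAvailable) MiB"
```

Run it once with the desktop running and once after stopping sddm. The desktop can be brought back afterwards with: sudo systemctl start sddm.service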
Welcome to the Firefly Open Source Community (https://bbs.t-firefly.com/)