Ollama
read-frog can leverage a locally deployed Ollama server to provide offline AI translation.
What is the Ollama Server?
The Ollama server is a lightweight, open-source runtime platform that lets users deploy various large language models (LLMs) locally without complex configuration. It acts as an "AI model manager," reducing tedious steps such as model downloading, configuration, and running to just a few commands, which makes it accessible even to AI deployment beginners.
Ollama Local Deployment Tutorial
1. Download the Installer
You can download the installer from the official ollama.ai website.
2. Run the Installer
Do not install by directly clicking the downloaded .exe file (it will default to the C drive). Instead, open the Windows Command Prompt (CMD), drag the installer into the CMD window, and enter the following command:
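A minimal sketch based on the Ollama Windows installer's /DIR option; OllamaSetup.exe stands for the installer you downloaded, and D:\Your\Path is a placeholder for the directory you choose:

```
OllamaSetup.exe /DIR="D:\Your\Path"
```

Because you dragged the installer into the window, CMD fills in its full path automatically; you only need to append the /DIR="..." part.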
This will install Ollama into the Your\Path folder on the D drive.
3. Configure Ollama's Cross-Origin Support
Set the following environment variables (an example of setting them follows the list):
- API service listening address: OLLAMA_HOST=0.0.0.0
- Allow cross-origin access: OLLAMA_ORIGINS=*
- Model file download location: OLLAMA_MODELS=F:\ollama\models (this setting is unrelated to cross-origin access, but it matters: models are downloaded to the C drive by default and can fill it up, so specify the storage path manually)
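On Windows, one way to set these persistently is with setx in a Command Prompt (you can also use the System Properties > Environment Variables dialog); the F:\ollama\models path is just the example value from the list above:

```
setx OLLAMA_HOST "0.0.0.0"
setx OLLAMA_ORIGINS "*"
setx OLLAMA_MODELS "F:\ollama\models"
```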
Restart your computer after configuring the environment variables.
Common Ollama Commands
1. Check Ollama Version and Start the Server
Use these commands to check the installed Ollama version (to confirm the installation succeeded) and to start the Ollama server.
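For example, using the standard Ollama CLI, ollama -v prints the installed version and ollama serve starts the server:

```
ollama -v
ollama serve
```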
The default address for the Ollama server is http://127.0.0.1:11434. In the plugin settings, the API Key can be set to any value, but the model name must exactly match one of the names listed by ollama list.
2. Search for Available Models
Example: Search for LLaMA-related models:
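The Ollama CLI itself does not include a repository search subcommand in most versions; available models and their descriptions can be browsed and searched on the official model library at https://ollama.com/library. To filter models already downloaded locally by a keyword such as llama, a sketch for the Windows Command Prompt (findstr is CMD's text filter):

```
ollama list | findstr llama
```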
Browsing the library this way lists matching models with their names and descriptions, helping you select the required model.
3. Pull a Model
Example: Pull the llama2 model:
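Assuming the standard CLI and the llama2 tag from the example:

```
ollama pull llama2
```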
After execution, Ollama downloads the specified model from the model repository to the local system for offline use.
4. Run a Model
Example: Run the llama2 model and interact with it:
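With the standard CLI, the first command starts an interactive chat session; you can also pass a one-off prompt as an argument:

```
ollama run llama2
ollama run llama2 "Translate this sentence into French: Hello, world."
```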
After launching, type questions or commands at the prompt and the model will return corresponding responses. You can also adjust the model's behavior at runtime, for example by setting a system prompt that describes the model's role or task, as sketched below.
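A minimal sketch, assuming the /set system command available inside the interactive session (a system prompt can also be baked into a custom model via a Modelfile's SYSTEM instruction):

```
ollama run llama2
>>> /set system "You are a concise translation assistant."
>>> Translate "good morning" into Spanish.
```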
5. Stop a Running Model
Example: Stop the running llama2 model:
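Assuming a recent Ollama version that includes the stop subcommand:

```
ollama stop llama2
```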
This command closes the running model process and frees system resources.
6. List Downloaded Models
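With the standard CLI:

```
ollama list
```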
This command displays all locally downloaded models, including names, sizes, and creation times, to help users manage local model resources.
7. Delete a Model
Example: Delete the unused llama2 model:
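Again with the standard CLI:

```
ollama rm llama2
```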
After execution, the model is removed from the local system, freeing disk space.
Ollama Usage Recommendations
Recommended model: gemma3:1b (about 815 MB). Despite its small footprint, this model is fully functional, performs well on translation tasks, and puts little load on the system (it can run on CPU alone). Avoid reasoning models such as deepseek, as they output their reasoning process, which leads to poor translation quality and speed.
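To download the recommended model with the standard CLI, using the gemma3:1b tag mentioned above:

```
ollama pull gemma3:1b
```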