Process Text with Local LLMs on Mac
Run AI models locally on your Mac for maximum privacy and zero API costs.
How It Works
Install Ollama
Download Ollama (ollama.com) and pull a model like llama3 or mistral.
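For example, pulling a model from the terminal once Ollama is installed (model names follow the Ollama library; check ollama.com/library for current tags):

    ollama pull llama3     # general-purpose 8B model
    ollama pull mistral    # 7B alternative
    ollama list            # confirm what is installed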
Install Echoo
Download Echoo and open the AI Provider settings.
Connect to your local model
Select Ollama as your provider and point Echoo to your local endpoint (http://localhost:11434 by default).
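Before connecting Echoo, you can verify the Ollama server is reachable (11434 is the default port; adjust if you changed it):

    # Ollama answers with a short status message at its root URL
    curl http://localhost:11434
    # List the models the local server knows about
    curl http://localhost:11434/api/tags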
Use AI shortcuts offline
All your text processing now runs entirely on your Mac. No internet needed.
Best Practices & Tips
Running AI models locally on your Mac with Ollama and Echoo gives you the full power of AI text processing with complete privacy and zero ongoing costs. Setting this up correctly ensures you get the best possible experience.
Choose the right model for your hardware and use case. For Macs with 8GB RAM, use smaller models like Phi-3 Mini (3.8B parameters) or Gemma 2B. These handle grammar correction, simple rewriting, and translation well. With 16GB RAM, you can run Llama 3 8B or Mistral 7B, which handle more complex tasks like summarization, code review, and creative writing. With 32GB+ RAM, models like Llama 3 70B provide quality comparable to cloud providers.
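As a rough mapping of the tiers above to Ollama pull commands (tags current as of writing; verify on ollama.com/library):

    # ~8GB RAM: small models
    ollama pull phi3:mini
    ollama pull gemma:2b
    # 16GB RAM: mid-size models
    ollama pull llama3:8b
    ollama pull mistral:7b
    # 32GB+ RAM: large models
    ollama pull llama3:70b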
Install multiple models for different tasks. Ollama makes it easy to pull several models. Use a fast, small model for quick grammar fixes and tone adjustments. Keep a larger, more capable model for complex tasks like code review, detailed summarization, and creative writing. In Echoo, you can assign different models to different commands by creating separate AI provider configurations.
Optimize model performance on Apple Silicon. Ollama automatically uses Metal (Apple's GPU framework) for inference on M-series chips. Keep your Mac plugged in during intensive use, as local inference is computationally demanding. Close unnecessary applications to free up memory, since models need to load entirely into RAM for optimal performance. The first response after loading a model is slower; subsequent requests are much faster while the model stays in memory.
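If you want a model warmed up before you start working, Ollama's API accepts a keep_alive value, and sending a request with no prompt simply loads the model into memory (a sketch; pick a duration that fits your workflow):

    # Load llama3 and keep it resident for 30 minutes of idle time
    curl http://localhost:11434/api/generate -d '{
      "model": "llama3",
      "keep_alive": "30m"
    }'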
Understand the tradeoffs of local versus cloud models. Local models are completely private, free to run, and work offline. However, they may be slower than cloud APIs and smaller models may produce lower quality results for complex tasks. For most everyday tasks like grammar correction, translation, and tone adjustment, local models perform excellently. For tasks requiring deep reasoning or very long context, cloud models may still be superior.
Set up Ollama to start automatically with your Mac so it is always ready when you need it. Configure Echoo to fall back to a cloud provider if the local model is unavailable, giving you reliability without sacrificing privacy by default.
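A minimal way to script the "always ready" check yourself, assuming the standard Ollama.app install (which runs the local server while the app is open):

    # Launch the Ollama app if its local server is not responding
    curl -s http://localhost:11434 > /dev/null || open -a Ollama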
Explore specialized models for specific tasks: CodeLlama or DeepSeek Coder for programming, Nous Hermes for creative writing, and multilingual models for translation. The open-source model ecosystem is growing rapidly, with new and improved models released regularly.
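Pulling these works the same as any other model; the names below match the Ollama library at the time of writing:

    ollama pull codellama        # code generation and review
    ollama pull deepseek-coder   # programming assistance
    ollama pull nous-hermes2     # creative writing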
Pro Tips
Install multiple models and assign them to different commands: a fast small model (Phi-3) for quick grammar fixes and a larger model (Llama 3 8B) for complex tasks like summarization and code review.
Set Ollama to launch at login so models are always available. The first inference loads the model into memory; subsequent requests are much faster.
Use the ollama list command to see your installed models and ollama pull to download new ones (see the example after these tips). Check ollama.com/library for the latest available models.
If local model quality is insufficient for a specific task, create that command with a cloud provider while keeping your other commands local; you can mix providers across commands.
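For reference, ollama list prints a simple table of your installed models. Illustrative output (names, IDs, and sizes will differ on your machine):

    $ ollama list
    NAME          ID     SIZE      MODIFIED
    llama3:8b     ...    4.7 GB    2 days ago
    phi3:mini     ...    2.2 GB    5 days ago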
Frequently Asked Questions
Which models can I use? Any model supported by Ollama, LocalAI, or LiteLLM, including Llama 3, Mistral, Phi, CodeLlama, and hundreds more.
What hardware do I need? Any Apple Silicon Mac (M1 or later) works great. 8GB RAM is sufficient for smaller models; 16GB+ is recommended for larger ones.
Is my data really private? Yes. With local LLMs, your text never leaves your device. No data is sent to any external server.
Related Articles
v0.9.11 Beta: Local LLM Support & Commands Marketplace
Run AI entirely on your device with Ollama and LocalAI support: no API costs, no data leaving your Mac. Plus the new Commands Marketplace to share and discover community workflows.
Your Data, Your Control
Privacy is our top priority. Your text never touches our servers, your API keys are stored in Apple Keychain, and we only collect minimal anonymous analytics.
Explore More
Echoo vs Raycast AI
Compare Echoo and Raycast AI for text transformation on macOS. Free vs paid, privacy, features, and workflow differences.
Echoo vs Text Blaze
Compare Echoo and Text Blaze for text transformation. AI-powered shortcuts vs template-based text expansion on macOS.
Echoo vs Espanso
Compare Echoo and Espanso for macOS text productivity. AI-powered transformation vs rule-based text expansion.
OpenAI Integration
Connect Echoo to OpenAI GPT models for powerful AI text transformation on macOS. Use GPT-4, GPT-5, and more with keyboard shortcuts.
Anthropic Integration
Connect Echoo to Anthropic Claude models for thoughtful, nuanced AI text transformation on macOS. Claude Opus, Sonnet, and Haiku.
Google Gemini Integration
Connect Echoo to Google Gemini AI for free, fast text transformation on macOS. Gemini Flash Lite offers a generous free tier.
Ready to Try It?
Download Echoo for free and start transforming text with AI shortcuts.