Process Text with Local LLMs on Mac
Run AI models locally on your Mac for maximum privacy and zero API costs.
How It Works
Install Ollama
Download Ollama (ollama.com) and pull a model like llama3 or mistral.
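For example, pulling a model from the terminal once Ollama is installed (model names follow the Ollama library; check ollama.com/library for current tags):

    ollama pull llama3     # general-purpose 8B model
    ollama pull mistral    # 7B alternative
    ollama list            # confirm what is installed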
Install Echoo
Download Echoo and open the AI Provider settings.
Connect to your local model
Select Ollama as your provider and point Echoo to your local endpoint (http://localhost:11434 by default).
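Before connecting Echoo, you can verify the Ollama server is reachable (11434 is the default port; adjust if you changed it):

    # Ollama answers with a short status message at its root URL
    curl http://localhost:11434
    # List the models the local server knows about
    curl http://localhost:11434/api/tags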
Use AI shortcuts offline
All your text processing now runs entirely on your Mac. No internet needed.
Best Practices & Tips
Running AI models locally on your Mac with Ollama and Echoo gives you the full power of AI text processing with complete privacy and zero ongoing costs. Setting this up correctly ensures you get the best possible experience.
Choose the right model for your hardware and use case. For Macs with 8GB RAM, use smaller models like Phi-3 Mini (3.8B parameters) or Gemma 2B. These handle grammar correction, simple rewriting, and translation well. With 16GB RAM, you can run Llama 3 8B or Mistral 7B, which handle more complex tasks like summarization, code review, and creative writing. With 32GB+ RAM, models like Llama 3 70B provide quality comparable to cloud providers.
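As a rough mapping of the tiers above to Ollama pull commands (tags current as of writing; verify on ollama.com/library):

    # ~8GB RAM: small models
    ollama pull phi3:mini
    ollama pull gemma:2b
    # 16GB RAM: mid-size models
    ollama pull llama3:8b
    ollama pull mistral:7b
    # 32GB+ RAM: large models
    ollama pull llama3:70b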
Install multiple models for different tasks. Ollama makes it easy to pull several models. Use a fast, small model for quick grammar fixes and tone adjustments. Keep a larger, more capable model for complex tasks like code review, detailed summarization, and creative writing. In Echoo, you can assign different models to different commands by creating separate AI provider configurations.
Optimize model performance on Apple Silicon. Ollama automatically uses Metal (Apple's GPU framework) for inference on M-series chips. Keep your Mac plugged in during intensive use, as local inference is computationally demanding. Close unnecessary applications to free up memory, since models need to load entirely into RAM for optimal performance. The first response after loading a model is slower; subsequent requests are much faster while the model stays in memory.
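If you want a model warmed up before you start working, Ollama's API accepts a keep_alive value, and sending a request with no prompt simply loads the model into memory (a sketch; pick a duration that fits your workflow):

    # Load llama3 and keep it resident for 30 minutes of idle time
    curl http://localhost:11434/api/generate -d '{
      "model": "llama3",
      "keep_alive": "30m"
    }'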
Understand the tradeoffs of local versus cloud models. Local models are completely private, free to run, and work offline. However, they may be slower than cloud APIs and smaller models may produce lower quality results for complex tasks. For most everyday tasks like grammar correction, translation, and tone adjustment, local models perform excellently. For tasks requiring deep reasoning or very long context, cloud models may still be superior.
Set up Ollama to start automatically with your Mac so it is always ready when you need it. Configure Echoo to fall back to a cloud provider if the local model is unavailable, giving you reliability without sacrificing privacy by default.
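A minimal way to script the "always ready" check yourself, assuming the standard Ollama.app install (which runs the local server while the app is open):

    # Launch the Ollama app if its local server is not responding
    curl -s http://localhost:11434 > /dev/null || open -a Ollama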
Explore specialized models for specific tasks: CodeLlama or DeepSeek Coder for programming, Nous Hermes for creative writing, and multilingual models for translation. The open-source model ecosystem is growing rapidly, with new and improved models released regularly.
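Pulling these works the same as any other model; the names below match the Ollama library at the time of writing:

    ollama pull codellama        # code generation and review
    ollama pull deepseek-coder   # programming assistance
    ollama pull nous-hermes2     # creative writing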
Pro Tips
Install multiple models and assign them to different commands: a fast small model (Phi-3) for quick grammar fixes and a larger model (Llama 3 8B) for complex tasks like summarization and code review.
Set Ollama to launch at login so models are always available. The first inference loads the model into memory; subsequent requests are much faster.
Use the ollama list command to see your installed models and ollama pull to download new ones (see the example after these tips). Check ollama.com/library for the latest available models.
If local model quality is insufficient for a specific task, create that command with a cloud provider while keeping your other commands local; you can mix providers across commands.
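For reference, ollama list prints a simple table of your installed models. Illustrative output (names, IDs, and sizes will differ on your machine):

    $ ollama list
    NAME          ID     SIZE      MODIFIED
    llama3:8b     ...    4.7 GB    2 days ago
    phi3:mini     ...    2.2 GB    5 days ago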
Frequently Asked Questions
Which models can I use? Any model supported by Ollama, LocalAI, or LiteLLM, including Llama 3, Mistral, Phi, CodeLlama, and hundreds more.
What hardware do I need? Any Apple Silicon Mac (M1 or later) works great. 8GB RAM is sufficient for smaller models; 16GB+ is recommended for larger ones.
Is my data really private? Yes. With local LLMs, your text never leaves your device. No data is sent to any external server.
Related Articles
v0.9.11 Beta: Local LLM Support & Commands Marketplace
Run AI entirely on your device with Ollama and LocalAI support: no API costs, no data leaving your Mac. Plus the new Commands Marketplace to share and discover community workflows.
Your Data, Your Control
Privacy is our top priority. Your text never touches our servers, your API keys are stored in Apple Keychain, and we only collect minimal anonymous analytics.
Explore More
Echoo vs Raycast AI
Compare Echoo and Raycast AI for text transformation on macOS. Free vs paid, privacy, features, and workflow differences.
Echoo vs Text Blaze
Compare Echoo and Text Blaze for text transformation. AI-powered shortcuts vs template-based text expansion on macOS.
Echoo vs Espanso
Compare Echoo and Espanso for macOS text productivity. AI-powered transformation vs rule-based text expansion.
OpenAI Integration
Connect Echoo to OpenAI GPT models for powerful AI text transformation on macOS. Use GPT-4, GPT-5, and more with keyboard shortcuts.
Anthropic Integration
Connect Echoo to Anthropic Claude models for thoughtful, nuanced AI text transformation on macOS. Claude Opus, Sonnet, and Haiku.
Google Gemini Integration
Connect Echoo to Google Gemini AI for free, fast text transformation on macOS. Gemini Flash Lite offers a generous free tier.
Ready to Try It?
Download Echoo for free and start transforming text with AI shortcuts.