As AI models become increasingly commoditized, startups are racing to build the software layer on top of them. One interesting entrant in this space is Osaurus, an open source, Apple-specific LLM server that lets users switch between AI models, whether running locally or in the cloud, while keeping all of their files and tools on their own hardware.
Osaurus grew out of Dinoki, a desktop AI companion that Osaurus co-founder Terence Pae describes as a kind of “AI-powered Clippy.” Dinoki’s customers kept asking him why they should buy the app if they still had to pay for tokens (the units of usage that AI companies charge to process prompts and generate responses).
This led Pae to think more deeply about running AI locally.
“This is the beginning of Osaurus,” Pae, who previously worked as a software engineer at Tesla and Netflix, told TechCrunch by phone. The idea, he explained, was to run the AI assistant locally. “You can do almost everything locally on your Mac, including viewing files, accessing the browser, and accessing system settings. We thought this was a great way to position Osaurus as a personal AI for individuals.”
Pae began building the tool in public as an open source project, adding features and fixing bugs along the way.

Today, Osaurus can connect to locally hosted AI models as well as to cloud providers such as OpenAI and Anthropic. Users choose which AI models to use while keeping the rest of the AI experience, such as the model’s memory, files, and tools, on their own hardware.
The advantage of this design is that users can switch to whichever AI model best suits the task at hand, since each model has different strengths.
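
To make that concrete, here is a minimal sketch of what talking to a local, OpenAI-compatible server looks like from Python. The port, endpoint path, and model name below are illustrative assumptions, not confirmed Osaurus defaults.

```python
# Minimal sketch: chatting with a local OpenAI-compatible server.
# The base_url, port, and model name are illustrative assumptions,
# not confirmed Osaurus defaults.
from openai import OpenAI

# Point the standard OpenAI client at a local server instead of the cloud.
client = OpenAI(
    base_url="http://127.0.0.1:1337/v1",  # hypothetical local endpoint
    api_key="not-needed-locally",          # local servers typically ignore this
)

response = client.chat.completions.create(
    model="llama-3.2-3b-instruct",  # hypothetical local model identifier
    messages=[{"role": "user", "content": "Summarize the files on my desktop."}],
)
print(response.choices[0].message.content)
```

Under this pattern, swapping between models, whether local or cloud, is mostly a matter of changing the `base_url` and `model` strings.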
This structure makes Osaurus what is known as a “harness”: a control layer that connects different AI models, tools, and workflows through a single interface, similar to tools like OpenClaw and Hermes. The difference is that those tools tend to be aimed at developers who are comfortable in the terminal, and some, as in the case of OpenClaw, have raised security concerns.
Osaurus, by contrast, offers an easy-to-use consumer interface and addresses those security concerns by running the AI in a virtualized sandbox isolated from the host machine. This limits what the AI can reach and keeps the user’s computer and data safe.
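
As a rough illustration of the harness idea, and not Osaurus’s actual implementation, a control layer can route the same request to whichever backend the user picks while conversation memory never leaves the machine:

```python
# Conceptual sketch of a "harness": one interface, many interchangeable
# backends. Illustrative only; not Osaurus's actual design.
from dataclasses import dataclass, field


@dataclass
class Backend:
    name: str
    base_url: str  # a local server or a cloud provider endpoint


@dataclass
class Harness:
    backends: dict[str, Backend]
    memory: list[str] = field(default_factory=list)  # stays on the local machine

    def ask(self, backend_name: str, prompt: str) -> str:
        backend = self.backends[backend_name]
        # History is kept locally regardless of which model answers;
        # only the prompt itself leaves the machine when a cloud backend is used.
        self.memory.append(prompt)
        return f"[{backend.name} @ {backend.base_url}] would answer: {prompt!r}"


harness = Harness(backends={
    "local": Backend("local-mlx", "http://127.0.0.1:1337/v1"),  # hypothetical
    "cloud": Backend("gpt-4o", "https://api.openai.com/v1"),
})
print(harness.ask("local", "What's on my calendar today?"))
```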

Of course, running AI models locally is still in its early days, as it is resource-intensive and hardware-dependent. To run local models, a system needs at least 64 GB of RAM, and for larger models such as DeepSeek v4, Pae recommends roughly 128 GB.
But Pae believes these hardware demands will become less of a barrier over time.
“We can see the potential because the intelligence-per-watt metric for local AI is going up significantly, and it’s on its own innovation curve. Last year, local AI could barely finish a sentence; now it can actually run tools, write code, access browsers, order things from Amazon (…) It’s getting better and better,” he said.

Currently, Osaurus can run models such as MiniMax M2.5, Gemma 4, Qwen3.6, GPT-OSS, Llama, and DeepSeek V4. It also supports Apple’s on-device foundation model and Liquid AI’s on-device LFM family, and it can connect to OpenAI, Anthropic, Gemini, xAI/Grok, Venice AI, and OpenRouter in the cloud, as well as to Ollama and LM Studio.
As a full Model Context Protocol (MCP) server, it can also give MCP-compatible clients access to tools. Additionally, it comes with over 20 native plugins for Mail, Calendar, Vision, macOS usage, XLSX, PPTX, Browser, Music, Git, Filesystem, Search, Fetch, and more.
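
For the curious, here is a hedged sketch of how an MCP-compatible client might list those tools, using the official `mcp` Python SDK. The URL and transport are placeholders, since the address Osaurus actually exposes is not documented here.

```python
# Sketch: listing tools on an MCP server with the official `mcp` Python SDK.
# The URL and transport are placeholders; the endpoint Osaurus actually
# exposes may differ.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    # Connect over SSE to a hypothetical local MCP endpoint.
    async with sse_client("http://127.0.0.1:1337/mcp") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                print(tool.name, "-", tool.description)


asyncio.run(main())
```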
Recently, Osaurus has been updated to include audio capabilities.
According to its website, the project has been downloaded more than 112,000 times since it went live about a year ago.
Osaurus’ founders, including co-founder Sam Yoo, are now part of the New York-based Startup Accelerator Alliance. They are also weighing next steps, including potentially offering Osaurus to companies in the legal and healthcare sectors, which could run local LLMs to address privacy concerns.
The team believes that as the power of local AI models increases, the demand for AI data centers may decline.
“We’re seeing explosive growth in the AI space, and [cloud AI providers] are having to scale up with data centers and infrastructure, but I feel like people still don’t really understand the value of local AI,” Pae said. “Instead of relying on the cloud, you can actually deploy a Mac Studio on-premises and the power consumption should be significantly lower. You still have the capabilities of the cloud, but you don’t have to rely on a data center to run AI,” he added.
