A bit of methodology
I’ve been a bit hesitant to apply the ‘open-source’/self-hosted label to myself and what I do. Don’t get me wrong, I love open source software that is free for all to use and edit! It’s the foundation of software development as we know it. But the thing is, I am also…
lazy.
The truth is that self-hosting software can be a huge pain! It is absolutely the best for privacy and having full control of your setup, but speed is also hugely important for me when it comes to creating and working with these tools. Running AI models locally just isn’t really there yet in terms of performance for your average consumer. If only there were a reputable company offering free access to the best open source models out there…
The open-source AI stack: Part 2
Pick your engine at NVIDIA Inference Microservice (NIM)
Oh wait, there is! Not gonna lie, I’ve been looking forward to writing this since I sent out the first edition of the newsletter last week. This week I get to tell you all about the best provider of free AI endpoints out there.
NVIDIA is offering free access to top-notch open source models at build.nvidia.com. For the skeptics out there, this is actually a really smart business move on their part: providing free AI inference endpoints encourages people to build and scale using - guess what - their infrastructure. And you get to benefit from this by using the platform to build!
A quick note about ‘the car’ / harness
Last week I talked about a metaphor where the AI model is the ‘engine’ and the harness is the rest of the car.
You could argue that there is a difference between the ‘cars’ out there; a lot of people swear by using Claude Code or Codex. However, I haven’t really seen enough that differentiates these tools to make me switch over from the open source Opencode. Right now, it feels like comparing a Toyota Corolla with a Honda Civic. Some slight differences for sure, but ultimately a similar way of getting to the same place.
I’ve definitely seen proxies going by that route NIM models to Claude Code, as well as ones that route Opus 4.7 to Opencode. Again, since I’m lazy and just want to get up and running the fastest, I advocate for using either a free setup or a paid one, not a mixture of both. But ultimately you do you!
Getting set up on Opencode
Click on your profile in the top right.
Click on ‘API Keys’.
Generate a new API key that Opencode will use to access the NIM servers.
Open Opencode.
NIM is supported natively on Opencode, so you just have to paste the API key in Settings on the desktop app or go to ‘Add provider’ in the TUI menu.
Select your model.
Keep on prompting!
There’s a bunch of good models on NIM, but also a lot of bad ones.
To start off I would recommend nemotron-3-super-120b-a12b or gemma-4-31b-it for non-coding tasks.
If you are coding, try glm-5.1 (this is what I use!) or kimi-k2.6 (slower reply from servers since it’s popular, but this is also good for general purpose queries too!).
deepseek-v4-pro and deepseek-v4-flash are the best models on the platform at the moment, but definitely the most requested as well.
Another quick aside
I mentioned Gemini 3.5 last week - this is not available on NIM but I have been using it in chat through Google’s AI Studio. This is a solid free tool that unfortunately can’t be plugged into Opencode in the same way (Google gets mad and wants you to use Antigravity).
Gemini 3.5 is solid in general but very very bad for sycophancy - “yes and”ing and telling you “you’re absolutely right” which I find infuriating. Different models have different quirks and things that you can (or can’t) live with. This is why I encourage trying out more than one and seeing what works for you!
Software roundup
This week, I’m going to use this section to mention a couple open source Claude Code clones I’ve tried out! Claude Code’s source code leaked earlier this year, which led the creation of these two projects. However, neither of them really drew me away from Opencode.
CheetahClaws - the simpler implementation
Out of the three here, this is the repo I tested the most. CheetahClaws, originally ‘nano-claude-code’ before an unfortunate but understandable name change, was designed to be a simple implementation of what was found in Claude Code’s source. Their brainstorm command particularly intrigued me - it is actually what I based my multi-agent tool deepanswer on! Ultimately though, development on Opencode was way ahead and I wanted the tool with better stability.
Grade: B 👍
Not bad but there are better tools out there
Claw Code - bells and whistles
Claw Code was built with a maximal philosophy - lots of features, meant to be used natively by AI agents, etc. The whole ‘ultraworkers’ and oh-my-opencode/codex ecosystem seems designed to drain tokens which is not my bag at all. Seems powerful but again I have doubts about the rate of development and usability. At this point, you might as well just pay for Claude Code.
Grade: D 😕
Could be worse but we disagree on some fundamentals
Quick news
The Pope published an encyclical on AI and how it should serve humanity and be regulated. Honestly pretty based!
The White House delays on signing an executive order on AI after lobbying from Silicon Valley - not sure if this is a net negative or positive
This reminded me of this essay that Anthropic released about AI leadership in 2028 when it comes to the US and China as well
Wrap up
Thanks again for reading issue 2 of The Blueprint. Next week I’ll talk about the last part of the open-source stack - how I use deepanswer and other organization tools like Notion to stay organized (and how I’m continuing to iterate and improve upon this!). See you then :)
Got questions?
Don’t get me wrong - it can be super overwhelming thinking about all the AI tools that are out there, which ones are best to use, and what will actually help you automate or optimize your workflow. I’m always open to discuss software or help you build a custom system!
Feel free to reply to this email with any questions or particular work problems that you want to automate :)
