AI Tool Stack

A living map of the AI tools I use, what each one is for, and how I keep a large stack from turning into tool sprawl.

This is a working map, not a minimalist stack. I use a lot of AI tools, and that is fine as long as each one has a job. The useful question is not "can this stack be smaller?" but: what does each tool own, what context does it need, and where does human judgement come back into the loop?

The exact names will change. The operating principle is to review the stack often enough that tools do not quietly overlap, drift, or stay around just because they used to be useful.

Current stack shape

My current stack has six layers:

general reasoning for synthesis, planning, critique, and turning loose notes into structure
writing and editing for voice, prose, drafts, and repeatable writing workflows
coding and site work for repo edits, app prototypes, tests, deploys, and browser checks
research and source checking for live web evidence, citation trails, and second opinions
visual generation for article images, concept sketches, and site visuals
notes and memory for keeping decisions, source trails, and repo history durable

The stack works when each layer has a clear default, a useful fallback, and a reason to stay.
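As a minimal sketch of that rule, each layer can be written down as a record with a default, a fallback, and a reason, and then checked for gaps. The names Layer and gaps are hypothetical, and "tool A" and "tool B" are placeholders, not a statement of which tool is actually the default for any layer.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer of the stack. Empty strings mean 'not decided yet'."""
    name: str
    default: str = ""
    fallback: str = ""
    reason: str = ""


def gaps(layer: Layer) -> list[str]:
    """Return the fields this layer is still missing."""
    return [
        field
        for field in ("default", "fallback", "reason")
        if not getattr(layer, field).strip()
    ]


# Placeholder data for illustration only (assumption, not the real assignments).
stack = [
    Layer("coding and site work", default="tool A", fallback="tool B",
          reason="repo edits, tests, deploys"),
    Layer("notes and memory", default="tool A",
          reason="durable decision trail"),
]

for layer in stack:
    missing = gaps(layer)
    if missing:
        print(f"{layer.name}: missing {', '.join(missing)}")
```

With the placeholder data above, only the second layer is reported, because it has no fallback. A layer with no reason at all is a candidate for retirement.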

Current tool inventory

This is the list I want to keep honest over time.

General reasoning

Perplexity + Comet Browser. I use Perplexity when I want live, cited synthesis instead of a loose brainstorm. Comet is useful when the work is happening inside the browser itself: reading pages, comparing sources, following a research thread, or carrying context across tabs.
ChatGPT Auto / GPT-5.5. I use ChatGPT as the broad reasoning space for messy planning, trade-offs, and turning scattered context into a usable structure. In practice, Auto is the default surface; GPT-5.5 Thinking is the high-capability mode I expect to matter most for harder multi-step work when it is available or selected.
Review step: did the model compress the problem into a clearer decision, or did it just create another layer of phrasing?

Writing and editing

Sudowrite. I use this for creative prose, draft movement, and fiction-shaped writing support. Its Muse model and Story Bible setup are useful because they are built around story context, style, and longer-form narrative instead of generic business copy.
Spiral by Every. I use Spiral for repeatable writing and thinking workflows: the kind of prompt pattern where I want the output to reliably match a style, structure, or decision format over time.
Review step: does the output sound like something I would actually publish, or does it only sound polished?

Coding and site work

Claude Code. I use Claude Code for deep terminal work across a codebase: reading files, editing, running commands, debugging, and moving from issue to patch.
Droid CLI. I use Droid as another terminal agent, especially when I want a second implementation or review path, or a CLI that can run either interactively or in single-shot "droid exec" mode.
Codex App / Codex CLI. I use Codex for local repo work, browser-verified site changes, and managing coding agents with a strong approval and review loop. The app is useful when supervising multiple agent tasks; the CLI is useful when staying close to the terminal and git state.
Review step: tests first, diff second, live site third. A coding agent is not finished just because it says the patch is done.
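The ordering in that review step can be sketched as a small gate that runs checks in sequence and stops at the first failure. This is an illustration under assumptions: run_review is a hypothetical helper, and the three checks are stubbed with lambdas. Real checks would shell out to the test runner, inspect the git diff, and request the live page.

```python
from typing import Callable

# A check is a label plus a zero-argument function returning True on success.
Check = tuple[str, Callable[[], bool]]


def run_review(checks: list[Check]) -> tuple[bool, list[str]]:
    """Run checks in order and stop at the first failure.

    Returns (passed, names of the checks that actually ran).
    """
    ran: list[str] = []
    for name, check in checks:
        ran.append(name)
        if not check():
            return False, ran
    return True, ran


# Stubbed checks (assumption): pretend the diff review found a problem.
checks: list[Check] = [
    ("tests", lambda: True),
    ("diff", lambda: False),
    ("live site", lambda: True),
]

passed, ran = run_review(checks)
print(passed, ran)  # False ['tests', 'diff']
```

Stopping early is deliberate: there is little point checking the live site if the diff has not been read and accepted.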

Research and source checking

Perplexity. I use it for quick source-backed maps of a topic, especially when I need citations and a current answer fast.
Gemini with Google Search. I use Gemini search grounding as a second research lane when freshness matters or when I want a different search-and-synthesis path over the same question.
Review step: prefer primary sources, dates, and direct evidence over confident summaries.

Visual generation

GPT Image 2. I use this primarily for generated images and edits, especially when I want strong prompt following and a polished output for article imagery.
Nano Banana 2. I use this as a fast image-generation and editing lane, especially when speed, iteration, and subject consistency matter.
Review step: does the image fit the site, the article, and the existing visual language? Looking good in isolation is not enough.

Notes and memory

GitHub. For site and code work, GitHub is the durable memory layer: commits, pull requests, issues, diffs, and deploy history. It is not where every thought belongs, but it is where durable technical decisions should leave a trail.
Review step: if a decision matters later, it should be findable from the repo, a commit, a pull request, an issue, or a linked source note.

Review rhythm

Because this is a large stack, the maintenance habit matters. Regular review keeps it from turning into a drawer full of subscriptions and one-off experiments.

When I update this page, I want to answer:

Which tools did I actually use this month?
Which tool is the default for each job?
Which tool is only a fallback?
Which tool should be retired or folded into another workflow?
Where does human review still matter most?
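The first and fourth questions can be answered mechanically if there is a record of when each tool was last used. The sketch below assumes such a record exists. The function unused_since, the tool names, and the dates are all hypothetical placeholders.

```python
from datetime import date


def unused_since(last_used: dict[str, date], cutoff: date) -> list[str]:
    """Return tools whose most recent use is before the cutoff, sorted by name."""
    return sorted(tool for tool, used in last_used.items() if used < cutoff)


# Placeholder log for illustration only (assumption, not real usage data).
last_used = {
    "tool A": date(2025, 1, 20),
    "tool B": date(2024, 11, 2),
    "tool C": date(2025, 1, 3),
}

print(unused_since(last_used, cutoff=date(2025, 1, 1)))  # ['tool B']
```

A tool that shows up here is only a candidate for retirement. Whether it should go or stay as a fallback is still a judgement call.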

Open questions

This is the part to keep updating:

Which tasks should start in Perplexity versus ChatGPT?
When should a writing pass go to Sudowrite versus Spiral?
Which coding jobs should go to Claude Code first, and which should start in Codex?
When is Droid the second opinion instead of the primary coding agent?
Which image tasks are better for GPT Image 2 versus Nano Banana 2?
Which decisions deserve a GitHub issue, pull request note, or source note?