I burned through more tokens last month than I care to admit. Two slash commands fixed that, and now my Claude Code workflow looks completely different than it did eight weeks ago. If you're using Claude Code without /loop and /effort, you're either overpaying for simple tasks or underpowering the hard ones.
Most Claude Code content focuses on the flashy stuff: agent teams, multi-file refactors, autonomous coding sessions that run for hours. I've written about agent teams and context engineering myself. But the commands that actually changed my day-to-day aren't the big orchestration features. They're two small, quiet utilities that control how hard Claude thinks and how often it checks in.
/effort lets you dial Claude's reasoning depth up or down. /loop turns Claude into a scheduled task runner that fires between your turns. Separately, they're useful. Together, they're the pattern nobody is talking about: run your loops at low effort for monitoring, then crank to max effort when something breaks. That single insight cut my token usage by roughly 40% on real production work while actually catching more issues than before.
Let me break down both commands, then show you how they combine.
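Here's what /effort actually controls under the hood. Three things: how deeply Claude reasons, how many files it reads for context, and how long and careful its responses are.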
There are five levels: low, medium (the default), high, max (Opus only), and auto. You set it by typing /effort high in your session, passing --effort high as a CLI flag, setting the CLAUDE_EFFORT environment variable, or adding it to your AGENTS.md frontmatter. I'll cover that last option later because it's the most powerful for teams.
Here's how each level plays out in practice. These are real examples from the last month across product work, experiments, and client-style development.
Low effort is for tasks where Claude already knows enough. Renaming a batch of files. Updating import paths after a refactor. Generating boilerplate tests from existing patterns. I use low for anything where the answer is mechanical, not creative. Claude reads fewer files, skips the deep reasoning, and responds fast.
Medium is the default and honestly where most people stay forever. It's fine for general Q&A, simple bug explanations, and writing utility functions. Nothing wrong with it, but you're leaving performance on the table in both directions.
High is my daily driver for about 80% of real coding work. When I'm building a new billing flow, debugging an auth edge case, or wiring up a feature with enough moving parts to punish shallow thinking, I want Claude to actually read the surrounding code, check types, and think through failure modes. High effort does that without going overboard.
Max is Opus only, and it resets when your session ends. This is for the moments when you're genuinely stuck. Recently I hit a race condition in a real-time collaboration flow where updates were arriving out of order under concurrency. I switched to /effort max, gave Claude the full context, and it traced the issue to a missing synchronization guard that medium effort had glossed over three times. Max doesn't just think harder. It reads more files, cross-references more context, and produces longer, more careful analysis.
Auto lets Claude decide. I don't use it. I prefer explicit control, and you should too until you understand what each level actually does to your token budget.
One important detail: low, medium, and high persist across sessions. max applies only to the current session and resets when it ends. Anthropic clearly doesn't want you burning Opus-level tokens by accident.
/loop is a session-scoped scheduled task runner. The syntax is simple: /loop [interval] [prompt]. You tell Claude what to check and how often, and it fires that task between your turns for as long as the session lives.
Requirements: you need Claude Code v2.1.72 or later. Tasks are session-scoped (they die when the session ends), they expire after 3 days, and you can run up to 50 concurrent tasks.
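To get a feel for those limits, here is a minimal TypeScript sketch that turns an interval like 2m into seconds and counts how many times a loop could fire before the 3-day expiry. This is not part of Claude Code: the helper names are hypothetical, and the s, h, and d unit suffixes are an assumption for illustration, since the examples in this post only use minutes.

```typescript
// Illustrative helpers only; none of this is a Claude Code API.

// The 3-day expiry quoted above, expressed in seconds.
const EXPIRY_SECONDS = 3 * 24 * 60 * 60;

// Seconds per unit suffix. Only "m" appears in this post; the rest are assumed.
function unitSeconds(unit: string): number {
  switch (unit) {
    case "s":
      return 1;
    case "m":
      return 60;
    case "h":
      return 3600;
    case "d":
      return 86400;
    default:
      throw new Error(`Unknown interval unit: ${unit}`);
  }
}

// Parse an interval such as "2m" into seconds; throws on malformed input.
function parseIntervalSeconds(interval: string): number {
  const text = interval.trim();
  const digits = text.slice(0, -1);
  if (!/^\d+$/.test(digits) || Number(digits) <= 0) {
    throw new Error(`Unrecognized interval: ${interval}`);
  }
  return Number(digits) * unitSeconds(text.slice(-1));
}

// Upper bound on how many times a loop can fire before it expires.
function maxRunsBeforeExpiry(interval: string): number {
  return Math.floor(EXPIRY_SECONDS / parseIntervalSeconds(interval));
}
```

By this arithmetic a 2m loop left alone could fire up to 2,160 times before it expires, which is why cleanup matters.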
Here's where it gets practical. Three ways I use /loop every week:
Monitoring preview deploys. After pushing a feature branch, I set up a loop to check the latest preview URL every 2 minutes. If the build fails or the health check returns anything other than 200, Claude tells me immediately. Before /loop, I'd push, switch to another task, forget about the deploy, and discover 45 minutes later that it failed on a missing environment variable.
/loop 2m Check the latest preview deployment URL and report if the build failed or the health check returns non-200
Watching test suites. When I'm refactoring state-heavy parts of an app, I run the test suite in the background and set a loop to check for new failures every 5 minutes. This catches regressions in real time while I'm focused on writing code instead of watching terminal output.
/loop 5m Run `npm test -- --reporter=json` and report any new test failures since last check
PR review reminders. When I have a PR up and I'm waiting on CI, I'll loop to check whether checks have passed and flag anything that needs attention.
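For the test-suite loop, "new failures since last check" is really a set difference between two runs. A minimal TypeScript sketch of that idea, assuming each run has already been reduced to a list of failing test names. The function name and the sample test names are hypothetical, and this is not a claim about how Claude tracks it internally.

```typescript
// Return the tests that fail now but were not failing on the previous check.
function newFailures(previous: string[], current: string[]): string[] {
  const seen = new Set(previous);
  return current.filter((name) => !seen.has(name));
}

// Hypothetical failing-test lists from two consecutive checks.
const lastCheck = ["cart > applies coupon"];
const thisCheck = ["cart > applies coupon", "auth > refreshes token"];

// Only the failure that appeared since the last check is reported.
const fresh = newFailures(lastCheck, thisCheck);
```

Here `fresh` holds only the auth failure, because the cart failure was already known from the earlier check.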
The limitations are real and worth knowing. Loops are session-scoped, meaning they vanish when you close that terminal. The 3-day expiry means you can't set up a permanent monitor (use actual monitoring tools for that). And 50 tasks sounds like a lot until you realize each loop iteration counts. This isn't a replacement for Datadog. It's a replacement for you alt-tabbing to check things every few minutes.
If you've been following my writing on when Claude makes things worse, you know I'm cautious about autonomous agent behavior. /loop stays within bounds because it's ephemeral and visible. You can cancel any loop, and they all die with your session. No surprises.
Here's the insight that nobody else is covering, and the reason I wrote this post.
Loop tasks should run at low effort. When a loop finds a problem, you cancel the loop and switch to max effort for debugging. This is the pattern.
Think about what a monitoring loop actually needs to do. It checks a URL. It runs a test suite. It looks at a deploy status. None of that requires deep reasoning. Low effort is perfect: quick check, terse report, minimal tokens. A loop running every 2 minutes at high effort burns tokens like a space heater. The same loop at low effort sips them.
Here's my actual workflow from last Tuesday. I was refactoring a set of access-control rules. I set up a loop to run the relevant test suite every 5 minutes at low effort:
/effort low
/loop 5m Run the access-control test suite and report only failures
Then I switched back to /effort high for my actual coding work. The loop kept running at low effort in the background. Twenty minutes in, the loop flagged two failing tests in the update path. I cancelled the loop, switched to /effort max, and gave Claude the full failing test output plus the relevant rules file. Max effort traced it to a nested access pattern that low and medium effort had been too shallow to catch.
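That routine can be sketched as a tiny decision function. This is only an illustration in TypeScript with hypothetical names (`LoopReport`, `nextSteps`); Claude Code exposes no such API, and these are steps I carry out by hand in the session.

```typescript
// Hypothetical shape for what a monitoring loop reports back.
type LoopReport = { failures: string[] };

// Decide what to do after a low-effort monitoring loop reports in.
function nextSteps(report: LoopReport): string[] {
  if (report.failures.length === 0) {
    // Nothing broke: leave the cheap loop running.
    return ["keep the loop running at low effort"];
  }
  // Something broke: stop monitoring and spend tokens on debugging instead.
  return [
    "cancel the loop",
    "/effort max",
    `debug with full context: ${report.failures.join(", ")}`,
  ];
}
```

The point of writing it down is the asymmetry: the cheap branch runs many times, the expensive branch runs once.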
The token savings are significant. A 2-minute loop running for an hour at high effort might burn 50,000+ tokens just on monitoring. The same loop at low effort uses maybe 8,000. That's an 80%+ reduction on the monitoring side alone, and you get better debugging because you're spending those saved tokens on max effort where it actually matters.
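Here is the arithmetic behind that claim, written out as a short TypeScript sketch. The two totals are the rough estimates from the paragraph above, not measured values, and the constant names are just for illustration.

```typescript
// Rough estimates from this post, not measurements.
const HIGH_EFFORT_TOKENS = 50000; // one hour of 2-minute checks at high effort
const LOW_EFFORT_TOKENS = 8000; // the same hour at low effort

// A 2-minute interval over 60 minutes gives 30 checks.
const runsPerHour = 60 / 2;

// Average tokens per check at each level (about 1,667 versus about 267).
const tokensPerRunHigh = HIGH_EFFORT_TOKENS / runsPerHour;
const tokensPerRunLow = LOW_EFFORT_TOKENS / runsPerHour;

// Fraction of monitoring tokens saved by dropping to low effort (0.84).
const savedFraction =
  (HIGH_EFFORT_TOKENS - LOW_EFFORT_TOKENS) / HIGH_EFFORT_TOKENS;
```

So with these estimates the saving on monitoring works out to 84%, or roughly 1,400 tokens per check.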
After two months of tuning, here's the framework I've settled on. Pin this somewhere.
| Task Type | Effort Level | Why |
|---|---|---|
| File renames, import updates | Low | Mechanical, no reasoning needed |
| Boilerplate generation | Low | Pattern matching, not problem solving |
| Loop monitoring tasks | Low | Check and report, nothing more |
| General coding, new features | High | Needs context awareness and type checking |
| Code review | High | Should read surrounding code, check edge cases |
| Debugging race conditions | Max | Deep reasoning, cross-file analysis |
| Architecture decisions | Max | Needs to weigh tradeoffs carefully |
| "I've been stuck for 30 minutes" | Max | Time to throw the big model at it |
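If you would rather keep the table in a script than in your head, it maps cleanly onto a lookup. A minimal TypeScript sketch, where the task keys and the `effortCommand` helper are hypothetical names I made up for this example:

```typescript
type Effort = "low" | "high" | "max";

// One key per row of the table above; the key names are illustrative.
type TaskType =
  | "file-rename"
  | "boilerplate"
  | "loop-monitoring"
  | "new-feature"
  | "code-review"
  | "race-condition"
  | "architecture"
  | "stuck-30-minutes";

const EFFORT_BY_TASK: Record<TaskType, Effort> = {
  "file-rename": "low",
  "boilerplate": "low",
  "loop-monitoring": "low",
  "new-feature": "high",
  "code-review": "high",
  "race-condition": "max",
  "architecture": "max",
  "stuck-30-minutes": "max",
};

// Build the slash command to type before starting a given kind of task.
function effortCommand(task: TaskType): string {
  return `/effort ${EFFORT_BY_TASK[task]}`;
}
```

Using `Record<TaskType, Effort>` means the compiler complains if a task type is added without deciding its effort level.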
When is max overkill? Anything where the answer is already known and Claude just needs to execute. Writing a migration script from a clear spec. Updating config files. Generating test data. Using max for these wastes tokens and, worse, sometimes produces overthought solutions that add unnecessary complexity.
When does low bite you? Anything involving business logic or state management. I tried using low effort for writing billing event handlers and got code that missed important edge cases, like duplicate event delivery. Low effort doesn't read enough context to understand the full picture. Lesson learned the hard way.
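Duplicate delivery is exactly the kind of edge case I mean. A minimal TypeScript sketch of an idempotent handler, assuming every event carries a unique id. The `BillingEvent` shape and `BillingLedger` class are hypothetical, and the in-memory `Set` is a stand-in for the durable store a real system would need.

```typescript
// Hypothetical event shape; real billing providers define their own.
type BillingEvent = { id: string; customerId: string; amountCents: number };

class BillingLedger {
  // Ids of events already applied (in memory here, durable in real life).
  private processed = new Set<string>();
  private balances = new Map<string, number>();

  // Apply an event at most once; returns false for a duplicate delivery.
  handle(event: BillingEvent): boolean {
    if (this.processed.has(event.id)) {
      return false; // already applied, so ignore the redelivery
    }
    this.processed.add(event.id);
    const current = this.balances.get(event.customerId) ?? 0;
    this.balances.set(event.customerId, current + event.amountCents);
    return true;
  }

  // Current balance in cents, or 0 for an unknown customer.
  balanceFor(customerId: string): number {
    return this.balances.get(customerId) ?? 0;
  }
}
```

Delivering the same event twice leaves the balance unchanged the second time, which is the behavior the low-effort code missed.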
If you're using AGENTS.md for context engineering (and you should be), you can set effort levels in your subagent frontmatter. This is powerful for contract-first spawning, where you define exactly how each agent behaves before it starts.
In your AGENTS.md, you can specify effort as a default for the session:
# In your project's AGENTS.md or .claude/settings
effort: high
For subagents, set it in the environment:
CLAUDE_EFFORT=low claude -p "Run lint and report issues"
This matters for agent teams. Your monitoring agents should run at low effort. Your implementation agents should run at high. Your debugging agents should run at max. Encoding this in your AGENTS.md means you don't have to remember to set it every time, and new team members (or new agents) pick it up automatically.
The CLI flag works the same way: --effort high. I use this when spawning one-off tasks from scripts. The env var (CLAUDE_EFFORT) is useful for CI/CD pipelines where you want consistent behavior across runs.
Two mistakes I made early on that cost me time and tokens.
I started with max everywhere. When /effort first shipped, I figured more reasoning is always better. Wrong. Max effort on simple tasks produces verbose, overthought responses. I had Claude writing 200-line explanations for one-line config changes. It also burns through your context window faster, which matters when you're doing long coding sessions. I wrote about context budgets in my AGENTS.md post, and effort level directly impacts how fast you hit that ceiling.
I forgot to cancel loops. This is embarrassing. I set up three monitoring loops, finished my coding session, left Claude running, and came back to find those loops had been firing for hours. Session-scoped means they die when the session ends, but I didn't end the session. I just walked away. Now I have a habit: before I step away from a session, I cancel all active loops. /loop with no arguments shows you what's running. Kill anything you don't need.
The combination of these two mistakes meant I burned tokens at max effort on loop tasks that were still running from the morning. Don't be me. Set loops to low effort and clean them up when you're done.
Both commands are available right now if you're on Claude Code v2.1.72+. I'd suggest starting with /effort high as your default for a week and seeing how the output quality changes compared to medium. Then try one /loop task for deploy monitoring. Once you're comfortable, combine them with the low-effort-loop, max-effort-debug pattern.
I run all of this on the Anthropic Max Plan, which gives me enough Opus headroom to use max effort when I actually need it without worrying about token limits. If you're on a usage-based plan, the effort framework becomes even more important because every level bump costs real money.
These two commands won't reorganize your entire workflow overnight. But they'll give you precise control over something most developers never think about: how hard your AI assistant is working on any given task. And that control, once you have it, is hard to give up.
