
What AI Agents Actually Do When They Use Your CLI (And What to Build For Them)

We built and tested an agent-help feature for our CLI. AI agents ignored it. Here's what actually helps agents use CLI tools effectively.

Claude Sonnet 4.5

AI, edited by Harry Mower

engineering · ai-agents · cli · developer-experience

AI coding agents are using CLI tools more than ever. GitHub Copilot, Claude Code, and Kiro now invoke command-line interfaces directly to complete developer tasks. But agents interact with CLIs differently than humans do — and most CLI tools aren't designed for that.

We built and tested an "agent-help" feature for the MemNexus CLI to see what actually helps AI agents succeed. The results surprised us.

The Problem: Agents Use CLIs Differently

MemNexus is an AI memory system. Our mx CLI has 14+ command groups: memories, conversations, facts, topics, graphrag, and more. When AI agents use it, they face two challenges:

  1. Command disambiguation — understanding the difference between similar commands (recap vs digest, search vs list)
  2. Non-interactive automation — avoiding interactive prompts that block agent execution

We assumed agents needed comprehensive documentation optimized for LLM consumption. So we built mx agent-help: a dedicated subcommand outputting curated workflows, disambiguation guides, environment variables, and automation tips.

Then we tested whether agents actually used it.

What We Built

mx agent-help outputs LLM-optimized documentation separate from standard --help. It includes:

  • Commonly Confused Commands — disambiguation guide for similar commands
  • Common Workflows — task-oriented recipes (not just command reference)
  • Environment Variables — MX_API_KEY, MX_BASE_URL, etc.
  • Tips for Automation — how to avoid interactive prompts

We also added a hint to mx --help output pointing agents to this feature:

Copilot, Claude, ChatGPT: run `mx agent-help` for workflows,
environment variables, and disambiguation guide.

The key architectural decision: make it a real subcommand, not a flag. Agents scan command lists in --help output. A flag like --agent-help is invisible; a subcommand appears in the command list.

How We Tested

We ran two rounds of testing with three AI agents:

  • GitHub Copilot CLI (v0.0.406) — via gh copilot -p non-interactive mode
  • Claude Code — via VS Code extension
  • Kiro CLI — via kiro -p non-interactive mode

We tested 6+ task types:

  • Command discovery ("list my recent memories")
  • Disambiguation ("what's the difference between recap and digest?")
  • Automation ("create a memory with this content")
  • Exploratory ("what can the mx CLI do?")

All testing used non-interactive mode (-p or --no-interactive flags) to simulate real agent behavior.

Round 1: What We Learned

1. Agents prefer --help over agent-help

Both Copilot and Claude Code went straight to subcommand help for straightforward tasks:

# Copilot's actual command sequence for "list my recent memories"
mx memories --help
mx memories list --help
mx memories list --limit 10

They drilled down through the help hierarchy. They did NOT invoke mx agent-help for discovery.

2. Agent-help is for disambiguation, not discovery

Copilot only used mx agent-help when it needed to understand the difference between similar commands:

# Copilot's actual sequence for "what's the difference between recap and digest?"
mx agent-help | grep -E "recap|digest"

For clear tasks ("list memories"), agents used --help. For ambiguous tasks ("recap vs digest"), they used agent-help.

3. The "Commonly Confused Commands" section was most valuable

When agents DID use agent-help, they specifically grepped the disambiguation section. The workflow examples and environment variable docs? Ignored.

This told us where to focus our efforts.

4. Naming specific AI tools in the hint matters

Initial hint text:

AI agents: run `mx --agent-help` for workflows and disambiguation guide.

Updated hint text:

Copilot, Claude, ChatGPT: run `mx agent-help` for workflows,
environment variables, and disambiguation guide.

After the change, Copilot was more likely to notice and act on the hint. Generic "AI agents" was too abstract; specific tool names triggered recognition.

5. One agent read source code instead of running the CLI

For exploratory tasks ("what can the mx CLI do?"), Copilot sometimes explored cli/src/commands/ TypeScript files rather than running mx --help.

This suggests agents use whatever information source is most convenient. If they're already in a codebase, they'll read source. If they're in a shell, they'll run commands.

What We Changed

Based on Round 1 testing, we made two key changes:

1. Added cross-references to --help descriptions

Instead of forcing agents to discover agent-help, we put disambiguation WHERE agents already look:

// Before
.description('Get a recap of recent work grouped by conversation')

// After
.description('Recap of recent work grouped by conversation (see also: digest)')

Now when agents run mx memories recap --help, they immediately see there's a related digest command and can investigate further.

Example cross-references we added:

// recap command
.description('Recap of recent work grouped by conversation (see also: digest)')

// digest command
.description('AI-powered digest of memories matching a query (see also: recap)')

// memories search command
.description('Search memories (keyword, semantic, hybrid; see also: graphrag query)')

// graphrag query command
.description('Execute GraphRAG query (see also: memories search)')

// topics search command
.description('Search topics by query string (see also: discover-related)')

// topics discover-related command
.description('Discover related topics via graph traversal (see also: search)')

2. Trimmed the agent-help output

Removed the auto-introspected Command Reference section (agents get that from --help anyway). Cut from ~200 lines to 129 lines — consumable in full rather than requiring grep.

We kept:

  • Commonly Confused Commands (the most valuable part)
  • Common Workflows (task-oriented recipes)
  • Environment Variables
  • Tips for Automation

Round 2: Results

After making these changes, we re-tested the same tasks.

Cross-references eliminated the need for agent-help in disambiguation

Both Copilot and Kiro found the cross-references in mx memories --help and understood the difference between commands without needing agent-help at all:

# Kiro's actual sequence for "difference between recap and digest?"
mx memories recap --help
mx memories digest --help
# Found cross-references, understood the difference, explained to user

The disambiguation information now lives where agents naturally look.

Agent-help still serves a purpose

While agents didn't need it for disambiguation anymore, agent-help remained valuable for:

  • Environment variable discovery — agents don't know to grep for MX_* variables in docs
  • Automation tips — preventing interactive prompt issues (see below)

Interactive prompts still trip up agents

Kiro tried to create a memory without --conversation-id and got stuck on an interactive prompt:

# What Kiro ran
mx memories create --content "Test memory"

# What happened
? Enter conversation ID (or "NEW" for new conversation): _
# Kiro hung here — can't answer interactive prompts in non-interactive mode

The agent-help Tips section explicitly warns about this:

Tips for Automation:
- Always use --content flag for non-interactive memory creation
- Use --conversation-id "NEW" or a specific ID to avoid prompts

But agents don't proactively read agent-help — they only invoke it when they hit a problem. By then, they're already stuck.

Solution: We should make --conversation-id auto-default to "NEW" when --content is provided. Don't prompt in non-interactive contexts.
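A minimal sketch of that defaulting rule, using a hypothetical `resolveConversationId` helper (the option names mirror the flags above; this is not the actual MemNexus implementation):

```typescript
interface CreateOptions {
  content?: string;
  conversationId?: string;
}

// When --content is supplied but --conversation-id is not, default to "NEW"
// instead of prompting, so non-interactive agents never block on stdin.
function resolveConversationId(options: CreateOptions): string {
  if (options.conversationId) return options.conversationId;
  if (options.content) return "NEW"; // scriptable input implies scriptable intent
  throw new Error("conversation ID required (interactive prompt would go here)");
}
```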

Practical Takeaways for CLI Developers

If you're building a CLI that AI agents will use, here's what actually helps:

1. Put disambiguation in --help descriptions, not separate docs

Add cross-references directly in command descriptions:

.command('build')
.description('Build production artifacts (see also: dev, preview)')

.command('dev')
.description('Start development server (see also: build, preview)')

Agents read --help output. Make it self-contained.

2. Make agent-facing features real subcommands, not flags

Agents scan command lists, not tip text. This appears in mx --help:

Commands:
  memories       Manage memory storage
  conversations  Manage conversation threads
  agent-help     LLM-optimized documentation

This doesn't:

Options:
  --agent-help   Show LLM-optimized documentation

3. Name specific AI tools in hint text

Be explicit:

Copilot, Claude, ChatGPT: run `tool agent-help` for workflows

Not generic:

AI agents: run `tool agent-help` for workflows

4. Avoid interactive prompts in non-interactive contexts

Detect when you're running non-interactively:

const isInteractive = process.stdin.isTTY && process.stdout.isTTY;

if (!isInteractive && !options.conversationId) {
  // Auto-default instead of prompting
  options.conversationId = "NEW";
}

Or require critical flags when running non-interactively.
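The fail-fast variant might look like this — `requireFlag` and `promptUser` are hypothetical names, and a loud error message is itself useful to agents, which can read it and retry with the missing flag:

```typescript
// Stub standing in for a real readline prompt; only reached when a human
// is attached to the terminal.
function promptUser(flagName: string): string {
  return ""; // the real CLI would read from stdin here
}

// Return the flag value, prompt only if interactive, otherwise fail loudly.
function requireFlag(
  isInteractive: boolean,
  value: string | undefined,
  flagName: string
): string {
  if (value !== undefined) return value;
  if (!isInteractive) {
    throw new Error(`${flagName} is required in non-interactive mode`);
  }
  return promptUser(flagName);
}
```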

5. Test with real agents in non-interactive mode

Run your CLI through Copilot CLI (gh copilot -p), Claude Code, or Kiro (kiro -p). Watch what they actually invoke. You'll be surprised.

What's Next

We're continuing to make the CLI more agent-friendly:

  1. Auto-detect non-interactive context — default --conversation-id to avoid prompts
  2. Expand cross-references — add "see also" to all ambiguous command pairs
  3. Add usage examples to --help — agents benefit from copy-paste examples in standard help output

The core insight: agents don't need special documentation. They need better standard documentation in the places they already look.


Try it yourself: The MemNexus CLI is open source. Install with npm install -g @memnexus-ai/cli and run mx agent-help to see the full output. Or test with an AI agent: gh copilot -p "list my recent memories using the mx CLI".
