Coding Agent Workflow Research: Claude Code & OpenAI Codex CLI

Date: 2026-04-28
Focus: Practical workflow improvements, productivity patterns, and new features

1. Claude Code Workflow Improvements

1.1 Key Features (2025-2026)

Claude Code has evolved from a terminal-only tool into a multi-surface agentic coding platform available across terminal, VS Code, JetBrains, Desktop app, and web browser.

Multi-Surface Availability
- Terminal CLI (primary, full-featured)
- VS Code extension with inline diffs, @-mentions, plan review
- JetBrains plugin (IntelliJ, PyCharm, WebStorm)
- Desktop app (macOS, Windows) with visual diff review, multiple sessions
- Web-based (claude.ai/code) with no local setup required
- iOS app for mobile continuation of tasks

Source: Claude Code Overview

Routines (Scheduled Automation)
- Run Claude Code on autopilot with scheduled, API-triggered, or GitHub-event-triggered routines
- Execute on Anthropic-managed cloud infrastructure (works when laptop is closed)
- Triggers: cron schedule, HTTP POST endpoint, GitHub PR/release events
- Use cases: nightly PR reviews, alert triage, docs drift detection, deploy verification
- Available on Pro, Max, Team, and Enterprise plans

Source: Claude Code Routines

Agent Teams (Experimental)
- Coordinate multiple Claude Code instances working together
- One session acts as team lead, others as teammates with independent context windows
- Shared task list with self-coordination and inter-agent messaging
- Display modes: in-process (Shift+Down to cycle) or split panes (tmux/iTerm2)
- Use cases: parallel code review (security/performance/tests), competing hypothesis debugging, cross-layer coordination
- Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

Source: Agent Teams

Claude Agent SDK (formerly Claude Code SDK)
- Build production AI agents with Claude Code as a library
- Available in Python (claude-agent-sdk) and TypeScript (@anthropic-ai/claude-agent-sdk)
- Built-in tools: Read, Write, Edit, Bash, Monitor, Glob, Grep, WebSearch, WebFetch, AskUserQuestion
- Features: hooks (callbacks), subagents, MCP integration, permissions control, session management
- Use cases: CI/CD pipelines, custom applications, production automation

Source: Agent SDK Overview

1.2 Hooks System and Automation Patterns

Hooks are deterministic automation triggers that execute shell commands, HTTP calls, LLM prompts, or MCP tools at specific lifecycle points.

Five Hook Types
1. Command hooks: Shell scripts receiving JSON on stdin, returning decisions via exit codes
2. HTTP hooks: POST requests to endpoints with JSON event data
3. MCP Tool hooks: Call tools on connected MCP servers
4. Prompt hooks: Send prompts to a Claude model for yes/no decisions
5. Agent hooks: Spawn subagents for tool-based verification

Hook Events

| Event | When | Use Case |
|-------|------|----------|
| SessionStart | Session begins | Load context, set env vars |
| UserPromptSubmit | User submits prompt | Validate, block, add context |
| PreToolUse | Before tool executes | Block dangerous commands, modify input |
| PostToolUse | Tool succeeds | Run linters, validate output |
| Stop | Claude finishes | Validate final response |
| Notification | Claude needs attention | Desktop notifications |
| TeammateIdle | Teammate going idle | Quality gates for teams |

Practical Hook Patterns
- Auto-format after every file edit (PostToolUse on Edit/Write)
- Block destructive commands like rm -rf (PreToolUse on Bash)
- Run ESLint after edits (PostToolUse)
- Desktop notifications when Claude needs attention (Notification)
- Environment variable loading from .env files (SessionStart)
- Auto-approve safe operations (PermissionRequest)
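As an illustration of the "block destructive commands" pattern, here is a minimal sketch of a command hook in Python. It assumes the PreToolUse event arrives as JSON with `tool_name` and `tool_input.command` fields and that exit code 2 signals a block with stderr returned as feedback; both are assumptions to check against the hooks documentation. The `rm` detection is deliberately simple and is not a complete deny-list.

```python
"""PreToolUse command hook sketch: refuse `rm` invocations that are both recursive and forced."""
import json
import shlex
import sys


def is_destructive(command: str) -> bool:
    """Return True for an `rm` call carrying both a recursive and a force flag."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input: fail closed
    for index, token in enumerate(tokens):
        if token != "rm":
            continue
        short_flags, long_flags = set(), set()
        for arg in tokens[index + 1:]:
            if arg.startswith("--"):
                long_flags.add(arg)
            elif arg.startswith("-"):
                short_flags.update(arg[1:])
        recursive = bool(short_flags & {"r", "R"}) or "--recursive" in long_flags
        force = "f" in short_flags or "--force" in long_flags
        if recursive and force:
            return True
    return False


def decide(event: dict) -> tuple[int, str]:
    """Map a hook event to (exit_code, stderr_message). Field names are assumed."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    if is_destructive(command):
        return 2, f"Blocked destructive command: {command}"
    return 0, ""


def main() -> int:
    """Entry point for the hook: read the event from stdin, report via exit code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# When saved as a hook script, finish the file with: sys.exit(main())
```

Keeping `decide` separate from the stdin handling makes the rule easy to unit-test without launching a session.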

Source: Claude Code Hooks

1.3 MCP Server Integrations

MCP (Model Context Protocol) enables Claude Code to connect to external data sources and tools.

Key Integration Patterns
- Issue trackers (Jira, Linear) for implementing features from tickets
- Databases for querying and analyzing data
- Monitoring tools (Datadog, Sentry) for analyzing alerts
- Design tools (Figma) for implementing designs
- Communication (Slack) for sending updates
- Browsers (Playwright) for testing web applications
- Documentation servers (Context7) for library docs

Connectors for Routines: MCP connectors work with cloud routines, enabling scheduled automations that read from Slack, create Linear tickets, etc.

Source: Claude Code Overview

1.4 CI/CD Integration

GitHub Actions (anthropics/claude-code-action@v1)
- Trigger with @claude mention in any PR or issue comment
- Automated code review on every PR
- Custom automation with scheduled workflows
- Supports direct API, AWS Bedrock, and Google Vertex AI
- Skills integration for domain-specific workflows

GitLab CI/CD integration also available.

Non-interactive mode (claude -p "prompt")
- Integrates into build scripts, pre-commit hooks, pipelines
- Output formats: text, JSON, stream-json
- Composable with Unix pipes: cat error.log | claude -p "explain root cause"
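The same pipe composition can be driven from a build script. The sketch below wraps `claude -p` with Python's `subprocess`, passing log text on stdin; the helper names `build_command` and `explain` are hypothetical, and the `--output-format` values mirror the three formats listed above.

```python
"""Sketch: call non-interactive `claude -p` from Python, mirroring `cat error.log | claude -p ...`."""
import subprocess

ALLOWED_FORMATS = {"text", "json", "stream-json"}


def build_command(prompt: str, output_format: str = "text") -> list[str]:
    """Assemble the argv for a non-interactive run (hypothetical helper)."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    return ["claude", "-p", prompt, "--output-format", output_format]


def explain(log_text: str, prompt: str = "explain root cause") -> str:
    """Pipe log_text to claude on stdin and return whatever it prints to stdout."""
    result = subprocess.run(
        build_command(prompt),
        input=log_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

A pipeline step might call `explain(open("error.log").read())` and attach the result to a build report.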

Source: GitHub Actions

1.5 Context and Memory Management

CLAUDE.md Files (Hierarchical)
- ~/.claude/CLAUDE.md: Global instructions for all sessions
- ./CLAUDE.md: Project-level, shared via git
- ./CLAUDE.local.md: Personal project notes (gitignored)
- Child directories: Loaded on demand when working in those areas
- Supports @path/to/import syntax for including other files

Auto Memory: Claude builds memory as it works, saving learnings like build commands and debugging insights across sessions without manual configuration.

Context Window Management
- /clear between unrelated tasks
- /compact <instructions> for targeted summarization
- /rewind to checkpoint and restore
- /btw for side questions that don't enter conversation history
- Subagents for investigation (separate context, reports back summaries)
- Extended thinking with adaptive reasoning (configurable via effort levels)

Session Management
- claude --continue to resume most recent conversation
- claude --resume <name> to pick from named sessions
- /rename for descriptive session names
- claude --from-pr 123 to resume sessions linked to a PR
- Session picker with search, preview, branch filtering

Source: Best Practices

1.6 Parallel Development with Worktrees

Git Worktree Integration
- claude --worktree feature-auth creates isolated worktree with new branch
- Each session gets its own copy of the codebase
- Worktrees created at <repo>/.claude/worktrees/<name>
- Subagents can use worktree isolation with isolation: worktree frontmatter
- .worktreeinclude file copies gitignored files (.env) to new worktrees
- Automatic cleanup: no changes = auto-remove; changes = prompt to keep/remove
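To make the path layout and cleanup rule concrete, here is a small sketch. It is an illustration of the rule as stated above, not Claude Code's implementation: the function names are hypothetical, and "changes" is approximated by non-empty `git status --porcelain` output, which covers uncommitted edits only.

```python
"""Sketch of the worktree path convention and the keep-or-remove decision."""
import subprocess
from pathlib import Path


def worktree_path(repo: Path, name: str) -> Path:
    """Location named above: <repo>/.claude/worktrees/<name>."""
    return repo / ".claude" / "worktrees" / name


def porcelain_status(worktree: Path) -> str:
    """Return `git status --porcelain` output for the given worktree."""
    result = subprocess.run(
        ["git", "-C", str(worktree), "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def cleanup_action(status_output: str) -> str:
    """'remove' when the worktree is clean, 'prompt' when it holds uncommitted changes."""
    return "prompt" if status_output.strip() else "remove"
```

For example, `cleanup_action(porcelain_status(worktree_path(Path.cwd(), "feature-auth")))` would report which branch of the rule applies to that worktree.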

Source: Common Workflows

2. OpenAI Codex CLI Workflow Improvements

2.1 Overview

OpenAI Codex CLI is a lightweight local coding agent written primarily in Rust (96.2% of codebase). Licensed under Apache-2.0 with 78.5k+ GitHub stars.

Installation

npm install -g @openai/codex
# or
brew install --cask codex

Access Methods
- Terminal CLI (codex command)
- IDE extensions (VS Code, Cursor, Windsurf)
- Desktop app (codex app)
- Web version (chatgpt.com/codex; cloud-based, separate product)

Authentication
- ChatGPT account sign-in (Plus, Pro, Business, Edu, Enterprise plans)
- API key authentication (alternative)

Source: GitHub - openai/codex

2.2 Key Features (Based on Available Documentation)

Local Execution Model
- Runs entirely on your machine
- Sandboxed execution for safety
- Works with your local filesystem and tools

Cloud Companion (Codex Web)
- Separate cloud-based agent at chatgpt.com/codex
- Can run tasks in background, clone repos
- Produces PRs and artifacts independently

Multi-Model Support
- Access to OpenAI models (GPT-4o, o3, o4-mini)
- Model selection based on task complexity

IDE Integration
- VS Code, Cursor, Windsurf extensions
- Desktop application experience

2.3 Known Capabilities (from community reports)

- File reading and editing
- Command execution in sandboxed environment
- Git operations
- Code generation and refactoring
- Multi-file changes
- Approval workflows before modifications

3. Cross-cutting Workflow Patterns

3.1 The Plan-Execute-Verify Loop

Both tools benefit from separating research, planning, and execution:

1. Explore: Use read-only mode to understand the codebase (Claude Code's Plan Mode)
2. Plan: Create a detailed implementation plan before coding
3. Execute: Implement with the plan as a guide
4. Verify: Run tests, check output, validate results

Claude Code specific: Shift+Tab to toggle Plan Mode; Ctrl+G to edit plan in text editor

3.2 Context Window Management

The context window is the single most important resource to manage across all coding agents.

Strategies
- Clear context between unrelated tasks
- Use subagents/separate sessions for research (keeps main context clean)
- Scope investigations narrowly
- Provide verification criteria upfront (reduces back-and-forth)
- Reference files with @ instead of pasting content
- Pipe data in rather than describing it

Anti-pattern: The "kitchen sink session" where unrelated tasks accumulate in one context.

3.3 Verification-Driven Development

Giving the agent a way to verify its work is the highest-leverage practice across all coding agents:

- Always provide tests, screenshots, or expected outputs
- Let the agent verify its own work
- Include specific test cases in prompts
- Use visual verification for UI changes (screenshots, browser tools)
- Run linters and type checkers as verification gates
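One way to turn "verification gates" into something an agent (or a hook) can run is a small script that executes each check and reports pass/fail. This is a minimal sketch; the gate names and the example commands in the comment are hypothetical and would be replaced by whatever test, lint, and type-check commands a project actually uses.

```python
"""Sketch: run a set of verification gates and report which ones passed."""
import subprocess


def run_gates(gates: dict[str, list[str]]) -> dict[str, bool]:
    """Run each gate command; a gate passes only if its exit code is 0."""
    results = {}
    for name, command in gates.items():
        try:
            completed = subprocess.run(command, capture_output=True, text=True)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            results[name] = False  # a missing tool counts as a failed gate
    return results


def all_passed(results: dict[str, bool]) -> bool:
    """True only when every gate passed."""
    return all(results.values())

# Hypothetical usage in a Python project:
#   results = run_gates({
#       "tests": ["pytest", "-q"],
#       "lint": ["ruff", "check", "."],
#       "types": ["mypy", "."],
#   })
#   raise SystemExit(0 if all_passed(results) else 1)
```

Returning a non-zero exit status on failure lets the same script act as a gate in CI or as a command the agent is told to run before declaring a task done.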

3.4 Parallel Session Patterns

Writer/Reviewer Pattern
- Session A implements a feature
- Session B reviews Session A's output with fresh context
- Session A addresses feedback

Test-First Pattern
- Session A writes tests
- Session B writes code to pass them

Multi-file Fan-out

for file in $(cat files.txt); do
  claude -p "Migrate $file from React to Vue. Return OK or FAIL." \
    --allowedTools "Edit,Bash(git commit *)"
done
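The shell loop above runs one file at a time. A hedged sketch of a concurrent variant in Python follows; the prompt and `--allowedTools` value are taken from the loop, while `migrate_command`, `run_claude`, `fan_out`, and the `max_workers` default are hypothetical names and choices. Parallel runs that commit in a single checkout can collide on the Git index, so this assumes each run is isolated (for example in its own worktree) or that `max_workers=1` is used to reproduce the sequential behavior.

```python
"""Sketch: fan out one `claude -p` run per file using a thread pool."""
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def migrate_command(path: str) -> list[str]:
    """Build the same invocation as the shell loop, for a single file."""
    prompt = f"Migrate {path} from React to Vue. Return OK or FAIL."
    return ["claude", "-p", prompt, "--allowedTools", "Edit,Bash(git commit *)"]


def run_claude(command: list[str]) -> str:
    """Execute one run and return its trimmed stdout."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout.strip()


def fan_out(
    paths: Iterable[str],
    runner: Callable[[list[str]], str] = run_claude,
    max_workers: int = 4,
) -> dict[str, str]:
    """Run the migration for every path and map each path to its output."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(lambda p: runner(migrate_command(p)), path_list))
    return dict(zip(path_list, outputs))
```

The injectable `runner` makes it possible to dry-run the fan-out with a stub before spending tokens on real sessions.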

3.5 Interview-Driven Requirements

For larger features, have the AI interview you first:

I want to build [brief description]. Interview me in detail using the
AskUserQuestion tool. Ask about technical implementation, UI/UX, edge
cases, concerns, and tradeoffs.

Start a fresh session to execute the resulting spec (clean context for implementation).

3.6 Git Workflow Integration

- Commit frequently with descriptive messages
- Use feature branches for all work
- Create PRs directly from the agent
- Use worktrees for parallel development
- Name sessions after branches/tasks for easy resume

4. New Ecosystem Tools

4.1 Claude Agent SDK Demo Agents

Available at github.com/anthropics/claude-agent-sdk-demos:

| Demo | Pattern |
|------|---------|
| Email Agent | IMAP integration, agentic email search |
| Research Agent | Multi-agent coordination, parallel processing, web search |
| Resume Generator | Multi-source data gathering, document generation |
| Simple Chat App | React + Express WebSocket interface |
| AskUserQuestion Previews | HTML preview rendering for branding decisions |

4.2 Claude Code Plugins

Plugin marketplace (/plugin to browse) bundles skills, hooks, subagents, and MCP servers:
- Code intelligence plugins for language-specific symbol navigation
- Automatic error detection after edits
- Community-contributed workflow extensions

4.3 Routines Infrastructure

Cloud-based automation that replaces many custom CI/CD workflows:
- Scheduled code reviews
- Alert triage with auto-fix PRs
- Documentation drift detection
- Deploy verification
- Library porting between SDKs

4.4 Remote Control and Cross-Device Workflows

- Remote Control: Continue local sessions from phone/browser
- Channels: Push events from Telegram, Discord, iMessage, webhooks into sessions
- Teleport: claude --teleport to pull web/mobile tasks into terminal
- Dispatch: Message from phone, get desktop session created
- Slack Integration: Mention @Claude in Slack with bug report, get PR back

4.5 Notable Community Tools (from ecosystem)

- oh-my-claudecode: Plugin framework with autopilot, team orchestration, ultrawork parallel execution
- claude-mem: Persistent cross-session memory with search, timeline, knowledge corpora
- session-wrap: Session lifecycle management, history insights, documentation updates
- skill-creator: Create, modify, and benchmark custom skills

5. Community Tips and Power User Techniques

5.1 Session Hygiene (from Anthropic best practices)

- Clear aggressively: /clear between unrelated tasks is the single most impactful habit
- Two-strike rule: If you've corrected Claude twice on the same issue, /clear and write a better prompt
- Name everything: /rename sessions for easy resume; treat sessions like branches
- Checkpoint before risk: Claude auto-checkpoints; use /rewind to restore code + conversation

5.2 Prompt Engineering for Agents

High-leverage prompt patterns:
- Include verification criteria: "run the tests after implementing"
- Reference patterns: "look at how HotDogWidget.php is implemented, follow that pattern"
- Scope explicitly: "fix the TypeError in user.ts, not the warning in config.js"
- Address root causes: "fix the root cause, don't suppress the error"

Low-leverage patterns to avoid:
- Vague requests without success criteria
- Over-specified CLAUDE.md files that get ignored
- Unscoped investigations that fill context

5.3 Subagent Delegation

Use subagents for anything that reads many files:

Use subagents to investigate how our authentication system handles token
refresh, and whether we have any existing OAuth utilities I should reuse.

The subagent explores in separate context and reports back a summary, keeping your main conversation clean.

5.4 Auto Mode for Uninterrupted Execution

claude --permission-mode auto -p "fix all lint errors"

A classifier model reviews commands and blocks only risky operations. Best for trusted tasks where you don't want to click through approvals.

5.5 Skills for Repeatable Workflows

Create .claude/skills/fix-issue/SKILL.md:

---
name: fix-issue
description: Fix a GitHub issue
disable-model-invocation: true
---
Analyze and fix the GitHub issue: $ARGUMENTS.
1. Use `gh issue view` to get details
2. Search codebase for relevant files
3. Implement fix
4. Write and run tests
5. Create PR

Invoke with /fix-issue 1234.
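To show how the pieces of a SKILL.md file relate, the sketch below splits the frontmatter from the body and fills in `$ARGUMENTS`. It is an illustration only: real frontmatter is YAML, whereas this hypothetical `parse_skill` handles only flat `key: value` lines, and treating `$ARGUMENTS` as plain text replacement is an assumption about how the invocation is expanded.

```python
"""Sketch: split a SKILL.md file into frontmatter and body, then fill in $ARGUMENTS."""


def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Return (frontmatter, body). Only flat `key: value` frontmatter is handled."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = None
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            end = index
            break
    if end is None:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def render(body: str, arguments: str) -> str:
    """Substitute the invocation arguments into the skill body."""
    return body.replace("$ARGUMENTS", arguments)


SKILL_TEXT = """---
name: fix-issue
description: Fix a GitHub issue
disable-model-invocation: true
---
Analyze and fix the GitHub issue: $ARGUMENTS.
1. Use `gh issue view` to get details
"""

meta, body = parse_skill(SKILL_TEXT)
prompt = render(body, "1234")
```

Under these assumptions, invoking the skill with `1234` yields a prompt whose first line reads "Analyze and fix the GitHub issue: 1234."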

5.6 Hooks for Guaranteed Behaviors

Unlike CLAUDE.md instructions (advisory), hooks are deterministic:
- Post-edit formatting that never gets skipped
- Notification when Claude needs attention
- Blocking writes to protected directories
- Auto-running tests after changes
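A sketch of the "blocking writes to protected directories" case, written as the decision logic of a PreToolUse command hook. The directory names are hypothetical, and the event shape (`tool_name`, `tool_input.file_path`) and the use of exit code 2 to block are assumptions to confirm against the hooks documentation.

```python
"""Sketch: refuse Edit/Write calls that target a protected directory."""
import json
import sys
from pathlib import PurePosixPath

# Hypothetical directory names; adjust to the project's layout.
PROTECTED_DIRS = ("migrations", ".github", "secrets")


def is_protected(file_path: str, protected: tuple[str, ...] = PROTECTED_DIRS) -> bool:
    """True if any path component exactly matches a protected directory name."""
    parts = PurePosixPath(file_path).parts
    return any(name in parts for name in protected)


def decide(event: dict) -> tuple[int, str]:
    """Map a hook event to (exit_code, stderr_message). Field names are assumed."""
    if event.get("tool_name") not in {"Edit", "Write"}:
        return 0, ""
    file_path = event.get("tool_input", {}).get("file_path", "")
    if is_protected(file_path):
        return 2, f"Writes to protected path are blocked: {file_path}"
    return 0, ""


def main() -> int:
    """Entry point for the hook: read the event from stdin, report via exit code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# When saved as a hook script, finish the file with: sys.exit(main())
```

Because the check runs on every matching tool call, it holds even when an instruction in CLAUDE.md would have been overlooked.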

6. Comparison Matrix

| Feature/Capability | Claude Code | Codex CLI |
|--------------------|-------------|-----------|
| Runtime | Node.js binary | Rust binary |
| License | Proprietary (Anthropic subscription) | Apache-2.0 (open source) |
| Models | Claude Opus, Sonnet, Haiku (4.5-4.7) | GPT-4o, o3, o4-mini |
| IDE Integration | VS Code, JetBrains | VS Code, Cursor, Windsurf |
| Desktop App | Yes (macOS, Windows) | Yes |
| Web/Cloud Version | Yes (claude.ai/code) | Yes (chatgpt.com/codex) |
| Mobile Access | iOS app, Remote Control | ChatGPT mobile |
| Hooks System | 5 hook types, 8+ events | Not documented publicly |
| Agent SDK | Python + TypeScript SDK | Not available (open source repo) |
| Multi-Agent Teams | Native (experimental) | Not available |
| Subagents | Built-in with isolation | Not documented |
| MCP Support | Full (connectors, tools, resources) | Not documented |
| Scheduled Routines | Yes (cloud infrastructure) | Not available |
| CI/CD Integration | GitHub Actions, GitLab CI | GitHub integration |
| Worktree Support | Built-in (`--worktree` flag) | Manual |
| Plan Mode | Yes (Shift+Tab toggle) | Not documented |
| Permission Modes | Default, Auto, Plan, custom rules | Sandboxed execution |
| Context Management | /clear, /compact, /rewind, subagents | Not documented in detail |
| Session Persistence | Full (resume, fork, named sessions) | Not documented |
| Non-interactive Mode | `claude -p` with output formats | Available |
| Pipe/Unix Composability | Full (stdin/stdout, pipes) | Available |
| Plugin System | Marketplace with /plugin | Not available |
| Custom Commands/Skills | .claude/skills/ with SKILL.md | Not documented |
| Memory System | CLAUDE.md + auto memory | Not documented |
| Provider Flexibility | Anthropic, Bedrock, Vertex, Foundry | OpenAI API only |
| Pricing | Subscription (Pro/Max/Team/Enterprise) or API | ChatGPT subscription or API |
| GitHub Integration | @claude in PRs, code review, routines | Repository cloning, PRs |

7. Workflow Anti-patterns

7.1 The Kitchen Sink Session

Problem: Starting with one task, asking something unrelated, then going back to the first task. Context fills with irrelevant information.
Fix: /clear between unrelated tasks. Treat sessions as single-purpose.

7.2 Correction Spiral

Problem: Claude does something wrong, you correct it, it is still wrong, and you correct it again. Context becomes polluted with failed approaches.
Fix: After two corrections, /clear and write a better initial prompt with what you learned.

7.3 Over-specified CLAUDE.md

Problem: Too many instructions cause Claude to ignore the important ones.
Fix: Ruthlessly prune. If Claude already does something correctly without the instruction, delete it. Convert deterministic rules to hooks instead.

7.4 Trust-then-Verify Gap

Problem: Agent produces plausible code that doesn't handle edge cases.
Fix: Always provide verification (tests, scripts, screenshots). If you can't verify it, don't ship it.

7.5 Infinite Exploration

Problem: Asking to "investigate" without scoping. Agent reads hundreds of files, filling context.
Fix: Scope investigations narrowly or use subagents so exploration doesn't consume main context.

7.6 Premature Implementation

Problem: Jumping straight to coding without understanding the problem.
Fix: Use Plan Mode to separate exploration from execution. Research first, then implement.

7.7 Ignoring Context Costs

Problem: Not tracking how much context is being consumed.
Fix: Use custom status lines to monitor context usage. Run /compact proactively. Use subagents for research.

7.8 Monolithic Sessions for Parallel Work

Problem: Trying to do everything in one session when tasks are independent.
Fix: Use worktrees, agent teams, or multiple terminal sessions. Fan out with claude -p loops.

Key Takeaways

- Context window management is the meta-game. Every other optimization serves this goal. Clear aggressively, delegate to subagents, and keep sessions focused.
- Verification is the highest-leverage investment. Agents with clear success criteria dramatically outperform agents without them. Tests > screenshots > manual review.
- Hooks beat instructions for deterministic behavior. CLAUDE.md is advisory; hooks are guaranteed. Use hooks for formatting, linting, notifications, and safety gates.
- Parallel execution multiplies throughput. Worktrees, agent teams, fan-out scripts, and writer/reviewer patterns all enable horizontal scaling.
- Claude Code has a significantly deeper feature set for workflow automation (hooks, routines, SDK, teams, MCP), while Codex CLI's open-source nature and Rust implementation offer different tradeoffs.
- The ecosystem is maturing rapidly. Plugins, skills, routines, and the Agent SDK are creating an extensibility layer that goes beyond simple chat-with-code.