
Coding Agent Workflow Research: Claude Code & OpenAI Codex CLI

Date: 2026-04-28

Focus: Practical workflow improvements, productivity patterns, and new features


1. Claude Code Workflow Improvements

1.1 Key Features (2025-2026)

Claude Code has evolved from a terminal-only tool into a multi-surface agentic coding platform available across terminal, VS Code, JetBrains, Desktop app, and web browser.

Multi-Surface Availability

  • Terminal CLI (primary, full-featured)
  • VS Code extension with inline diffs, @-mentions, plan review
  • JetBrains plugin (IntelliJ, PyCharm, WebStorm)
  • Desktop app (macOS, Windows) with visual diff review, multiple sessions
  • Web-based (claude.ai/code) with no local setup required
  • iOS app for mobile continuation of tasks

Source: Claude Code Overview

Routines (Scheduled Automation)

  • Run Claude Code on autopilot with scheduled, API-triggered, or GitHub-event-triggered routines
  • Execute on Anthropic-managed cloud infrastructure (works when laptop is closed)
  • Triggers: cron schedule, HTTP POST endpoint, GitHub PR/release events
  • Use cases: nightly PR reviews, alert triage, docs drift detection, deploy verification
  • Available on Pro, Max, Team, and Enterprise plans

Source: Claude Code Routines

Agent Teams (Experimental)

  • Coordinate multiple Claude Code instances working together
  • One session acts as team lead, others as teammates with independent context windows
  • Shared task list with self-coordination and inter-agent messaging
  • Display modes: in-process (Shift+Down to cycle) or split panes (tmux/iTerm2)
  • Use cases: parallel code review (security/performance/tests), competing hypothesis debugging, cross-layer coordination
  • Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

Source: Agent Teams

Claude Agent SDK (formerly Claude Code SDK)

  • Build production AI agents with Claude Code as a library
  • Available in Python (claude-agent-sdk) and TypeScript (@anthropic-ai/claude-agent-sdk)
  • Built-in tools: Read, Write, Edit, Bash, Monitor, Glob, Grep, WebSearch, WebFetch, AskUserQuestion
  • Features: hooks (callbacks), subagents, MCP integration, permissions control, session management
  • Use cases: CI/CD pipelines, custom applications, production automation

Source: Agent SDK Overview

1.2 Hooks System and Automation Patterns

Hooks are deterministic automation triggers that execute shell commands, HTTP calls, LLM prompts, or MCP tools at specific lifecycle points.

Five Hook Types

  1. Command hooks: Shell scripts receiving JSON on stdin, returning decisions via exit codes
  2. HTTP hooks: POST requests to endpoints with JSON event data
  3. MCP Tool hooks: Call tools on connected MCP servers
  4. Prompt hooks: Send prompts to Claude model for yes/no decisions
  5. Agent hooks: Spawn subagents for tool-based verification

Hook Events

| Event | When | Use Case |
|-------|------|----------|
| SessionStart | Session begins | Load context, set env vars |
| UserPromptSubmit | User submits prompt | Validate, block, add context |
| PreToolUse | Before tool executes | Block dangerous commands, modify input |
| PostToolUse | Tool succeeds | Run linters, validate output |
| Stop | Claude finishes | Validate final response |
| Notification | Claude needs attention | Desktop notifications |
| TeammateIdle | Teammate going idle | Quality gates for teams |

Practical Hook Patterns

  • Auto-format after every file edit (PostToolUse on Edit/Write)
  • Block destructive commands like rm -rf (PreToolUse on Bash)
  • Run ESLint after edits (PostToolUse)
  • Desktop notifications when Claude needs attention (Notification)
  • Environment variable loading from .env files (SessionStart)
  • Auto-approve safe operations (PermissionRequest)

Source: Claude Code Hooks
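The command-hook contract described above (JSON on stdin, decision via exit code) can be sketched as a small PreToolUse guard for the Bash tool. This is an illustration under stated assumptions, not a reference implementation: it assumes the event JSON carries the shell command under tool_input.command and that exit code 2 signals "block". The function names are hypothetical, and matching against the raw JSON is deliberately crude.

```shell
# Sketch of a PreToolUse command hook for the Bash tool.
# Assumptions (not taken from the text above): the event JSON carries the
# command under tool_input.command, and exit code 2 means "block".

# Return 0 if the payload looks like it contains a recursive forced delete.
# This greps the raw JSON, so it can misfire; a real hook would parse the
# payload with a JSON tool before inspecting the command.
is_destructive() {
  printf '%s' "$1" | grep -Eq 'rm[[:space:]]+-[A-Za-z]*(rf|fr)'
}

# Decide on one payload: print a reason to stderr and return 2 to block,
# or return 0 to allow.
guard_payload() {
  if is_destructive "$1"; then
    echo "Blocked: recursive forced delete is not allowed" >&2
    return 2
  fi
  return 0
}

# Saved as a hook script, the final line would read the event from stdin:
#   guard_payload "$(cat)"; exit $?
```

Keeping the decision in a function makes the rule easy to exercise with sample payloads before wiring it into a hook configuration.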

1.3 MCP Server Integrations

MCP (Model Context Protocol) enables Claude Code to connect to external data sources and tools.

Key Integration Patterns

  • Issue trackers (Jira, Linear) for implementing features from tickets
  • Databases for querying and analyzing data
  • Monitoring tools (Datadog, Sentry) for analyzing alerts
  • Design tools (Figma) for implementing designs
  • Communication (Slack) for sending updates
  • Browsers (Playwright) for testing web applications
  • Documentation servers (Context7) for library docs

Connectors for Routines: MCP connectors work with cloud routines, enabling scheduled automations that read from Slack, create Linear tickets, etc.

Source: Claude Code Overview

1.4 CI/CD Integration

GitHub Actions (anthropics/claude-code-action@v1)

  • Trigger with @claude mention in any PR or issue comment
  • Automated code review on every PR
  • Custom automation with scheduled workflows
  • Supports direct API, AWS Bedrock, and Google Vertex AI
  • Skills integration for domain-specific workflows

GitLab CI/CD integration is also available.

Non-interactive mode (claude -p "prompt")

  • Integrates into build scripts, pre-commit hooks, pipelines
  • Output formats: text, JSON, stream-json
  • Composable with Unix pipes: cat error.log | claude -p "explain root cause"

Source: GitHub Actions
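The pipe composability of non-interactive mode can be wrapped in a small helper for build scripts. The sketch below sends only the tail of a log to the agent. AGENT_CMD and the function name explain_log are hypothetical names introduced here so the helper can be exercised with a stand-in command; the default is the claude -p invocation shown above.

```shell
# Sketch: feed the tail of a log file to a non-interactive agent run.
# AGENT_CMD is a hypothetical override so the function can be tried
# without the real CLI; it defaults to `claude`.
explain_log() {
  local log_file=$1
  local lines=${2:-200}            # how many trailing lines to send
  local agent=${AGENT_CMD:-claude}

  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi
  # Pipe the data in rather than describing it; the prompt stays short.
  tail -n "$lines" "$log_file" | "$agent" -p "explain root cause"
}
```

Trimming the input before the pipe keeps a large log from filling the context window with lines that are unlikely to matter.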

1.5 Context and Memory Management

CLAUDE.md Files (Hierarchical)

  • ~/.claude/CLAUDE.md: Global instructions for all sessions
  • ./CLAUDE.md: Project-level, shared via git
  • ./CLAUDE.local.md: Personal project notes (gitignored)
  • Child directories: Loaded on demand when working in those areas
  • Supports @path/to/import syntax for including other files
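When instructions seem to be ignored, it helps to see which of these files actually exist for a project. The sketch below checks only the three fixed locations named in the list; it does not follow child-directory files or @-imports, and the order it prints is not a claim about load precedence. The function name is hypothetical.

```shell
# Sketch: print whichever of the fixed instruction-file locations exist.
# Only the global, project, and local files are checked.
list_instruction_files() {
  local project_dir=${1:-.}
  local f
  for f in "$HOME/.claude/CLAUDE.md" \
           "$project_dir/CLAUDE.md" \
           "$project_dir/CLAUDE.local.md"; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}
```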

Auto Memory: Claude builds memory as it works, saving learnings like build commands and debugging insights across sessions without manual configuration.

Context Window Management

  • /clear between unrelated tasks
  • /compact <instructions> for targeted summarization
  • /rewind to checkpoint and restore
  • /btw for side questions that don't enter conversation history
  • Subagents for investigation (separate context, reports back summaries)
  • Extended thinking with adaptive reasoning (configurable via effort levels)

Session Management

  • claude --continue to resume most recent conversation
  • claude --resume <name> to pick from named sessions
  • /rename for descriptive session names
  • claude --from-pr 123 to resume sessions linked to a PR
  • Session picker with search, preview, branch filtering

Source: Best Practices

1.6 Parallel Development with Worktrees

Git Worktree Integration

  • claude --worktree feature-auth creates isolated worktree with new branch
  • Each session gets its own copy of the codebase
  • Worktrees created at <repo>/.claude/worktrees/<name>
  • Subagents can use worktree isolation with isolation: worktree frontmatter
  • .worktreeinclude file copies gitignored files (.env) to new worktrees
  • Automatic cleanup: no changes = auto-remove; changes = prompt to keep/remove

Source: Common Workflows


2. OpenAI Codex CLI Workflow Improvements

2.1 Overview

OpenAI Codex CLI is a lightweight local coding agent written primarily in Rust (96.2% of the codebase). It is licensed under Apache-2.0 and has 78.5k+ GitHub stars.

Installation

npm install -g @openai/codex
# or
brew install --cask codex

Access Methods

  • Terminal CLI (codex command)
  • IDE extensions (VS Code, Cursor, Windsurf)
  • Desktop app (codex app)
  • Web version (chatgpt.com/codex - cloud-based, separate product)

Authentication

  • ChatGPT account sign-in (Plus, Pro, Business, Edu, Enterprise plans)
  • API key authentication (alternative)

Source: GitHub - openai/codex

2.2 Key Features (Based on Available Documentation)

Local Execution Model

  • Runs entirely on your machine
  • Sandboxed execution for safety
  • Works with your local filesystem and tools

Cloud Companion (Codex Web)

  • Separate cloud-based agent at chatgpt.com/codex
  • Can run tasks in background, clone repos
  • Produces PRs and artifacts independently

Multi-Model Support

  • Access to OpenAI models (GPT-4o, o3, o4-mini)
  • Model selection based on task complexity

IDE Integration

  • VS Code, Cursor, Windsurf extensions
  • Desktop application experience

2.3 Known Capabilities (from community reports)

  • File reading and editing
  • Command execution in sandboxed environment
  • Git operations
  • Code generation and refactoring
  • Multi-file changes
  • Approval workflows before modifications

3. Cross-cutting Workflow Patterns

3.1 The Plan-Execute-Verify Loop

Both tools benefit from separating research, planning, and execution:

  1. Explore: Use read-only mode to understand codebase (Claude Code's Plan Mode)
  2. Plan: Create detailed implementation plan before coding
  3. Execute: Implement with the plan as guide
  4. Verify: Run tests, check output, validate results

Claude Code specific: Shift+Tab to toggle Plan Mode; Ctrl+G to edit plan in text editor

3.2 Context Window Management

The context window is the single most important resource to manage across all coding agents.

Strategies

  • Clear context between unrelated tasks
  • Use subagents/separate sessions for research (keeps main context clean)
  • Scope investigations narrowly
  • Provide verification criteria upfront (reduces back-and-forth)
  • Reference files with @ instead of pasting content
  • Pipe data in rather than describing it

Anti-pattern: The "kitchen sink session" where unrelated tasks accumulate in one context.

3.3 Verification-Driven Development

Giving the agent a way to verify its own work is the highest-leverage practice across all coding agents:

  • Always provide tests, screenshots, or expected outputs
  • Let the agent verify its own work
  • Include specific test cases in prompts
  • Use visual verification for UI changes (screenshots, browser tools)
  • Run linters and type checkers as verification gates
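The "verification gate" idea in the list above can be sketched as one script that the agent (or a hook) runs after making changes. The check commands are placeholders supplied by the caller; run_gates is a hypothetical name, and the PASS/FAIL output format is an arbitrary choice for this sketch.

```shell
# Sketch: run each check command in turn and fail if any of them fails.
# The checks (test runner, linter, type checker) are passed in as strings.
run_gates() {
  local failed=0 check output
  for check in "$@"; do
    if output=$(sh -c "$check" 2>&1); then
      echo "PASS  $check"
    else
      echo "FAIL  $check"
      # Surface the failing output so the agent can act on it.
      printf '%s\n' "$output" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

A project might call it as run_gates "npm test" "npx eslint ." "npx tsc --noEmit", assuming those tools are set up for the project. A single nonzero exit status gives the agent an unambiguous signal that the work is not finished.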

3.4 Parallel Session Patterns

Writer/Reviewer Pattern

  • Session A implements a feature
  • Session B reviews Session A's output with fresh context
  • Session A addresses feedback

Test-First Pattern

  • Session A writes tests
  • Session B writes code to pass them

Multi-file Fan-out

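# Read one path per line so names containing spaces survive intact.
while IFS= read -r file; do
  # Redirect stdin so the agent cannot consume the rest of files.txt.
  claude -p "Migrate $file from React to Vue. Return OK or FAIL." \
    --allowedTools "Edit,Bash(git commit *)" < /dev/null
done < files.txt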
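Because the prompt asks each run to return OK or FAIL, the loop above can be extended to tally the replies. The following is a sketch under assumptions: AGENT_CMD and fan_out are hypothetical names (the override exists only so the loop can be tried with a stand-in command), and any reply that does not contain OK is counted as a failure.

```shell
# Sketch: run one non-interactive agent call per file and tally the replies.
# AGENT_CMD is a hypothetical override (defaults to `claude`); the OK/FAIL
# convention comes from the prompt used in the loop above.
fan_out() {
  local list=$1
  local agent=${AGENT_CMD:-claude}
  local ok=0 fail=0 file reply

  # Read the list on fd 3 so the agent cannot swallow it via stdin.
  while IFS= read -r file <&3; do
    [ -n "$file" ] || continue
    reply=$("$agent" -p "Migrate $file from React to Vue. Return OK or FAIL." \
              --allowedTools "Edit,Bash(git commit *)" < /dev/null) || reply=FAIL
    case $reply in
      *OK*) ok=$((ok + 1)) ;;
      *)    fail=$((fail + 1)); echo "failed: $file" >&2 ;;
    esac
  done 3< "$list"

  echo "ok=$ok fail=$fail"
  [ "$fail" -eq 0 ]
}
```

Calling fan_out files.txt prints a one-line summary and exits nonzero if any file failed, which makes the script usable as a step in a larger pipeline.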

3.5 Interview-Driven Requirements

For larger features, have the AI interview you first:

I want to build [brief description]. Interview me in detail using the
AskUserQuestion tool. Ask about technical implementation, UI/UX, edge
cases, concerns, and tradeoffs.

Start a fresh session to execute the resulting spec (clean context for implementation).

3.6 Git Workflow Integration

  • Commit frequently with descriptive messages
  • Use feature branches for all work
  • Create PRs directly from the agent
  • Use worktrees for parallel development
  • Name sessions after branches/tasks for easy resume

4. New Ecosystem Tools

4.1 Claude Agent SDK Demo Agents

Available at github.com/anthropics/claude-agent-sdk-demos:

| Demo | Pattern |
|------|---------|
| Email Agent | IMAP integration, agentic email search |
| Research Agent | Multi-agent coordination, parallel processing, web search |
| Resume Generator | Multi-source data gathering, document generation |
| Simple Chat App | React + Express WebSocket interface |
| AskUserQuestion Previews | HTML preview rendering for branding decisions |

4.2 Claude Code Plugins

Plugin marketplace (/plugin to browse) bundles skills, hooks, subagents, and MCP servers:

  • Code intelligence plugins for language-specific symbol navigation
  • Automatic error detection after edits
  • Community-contributed workflow extensions

4.3 Routines Infrastructure

Cloud-based automation that replaces many custom CI/CD workflows:

  • Scheduled code reviews
  • Alert triage with auto-fix PRs
  • Documentation drift detection
  • Deploy verification
  • Library porting between SDKs

4.4 Remote Control and Cross-Device Workflows

  • Remote Control: Continue local sessions from phone/browser
  • Channels: Push events from Telegram, Discord, iMessage, webhooks into sessions
  • Teleport: claude --teleport to pull web/mobile tasks into terminal
  • Dispatch: Message from phone, get desktop session created
  • Slack Integration: Mention @Claude in Slack with bug report, get PR back

4.5 Notable Community Tools (from ecosystem)

  • oh-my-claudecode: Plugin framework with autopilot, team orchestration, ultrawork parallel execution
  • claude-mem: Persistent cross-session memory with search, timeline, knowledge corpora
  • session-wrap: Session lifecycle management, history insights, documentation updates
  • skill-creator: Create, modify, and benchmark custom skills

5. Community Tips and Power User Techniques

5.1 Session Hygiene (from Anthropic best practices)

  1. Clear aggressively: /clear between unrelated tasks is the single most impactful habit
  2. Two-strike rule: If you've corrected Claude twice on the same issue, /clear and write a better prompt
  3. Name everything: /rename sessions for easy resume; treat sessions like branches
  4. Checkpoint before risk: Claude auto-checkpoints; use /rewind to restore code + conversation

5.2 Prompt Engineering for Agents

High-leverage prompt patterns:

  • Include verification criteria: "run the tests after implementing"
  • Reference patterns: "look at how HotDogWidget.php is implemented, follow that pattern"
  • Scope explicitly: "fix the TypeError in user.ts, not the warning in config.js"
  • Address root causes: "fix the root cause, don't suppress the error"

Low-leverage patterns to avoid:

  • Vague requests without success criteria
  • Over-specified CLAUDE.md files that get ignored
  • Unscoped investigations that fill context

5.3 Subagent Delegation

Use subagents for anything that reads many files:

Use subagents to investigate how our authentication system handles token
refresh, and whether we have any existing OAuth utilities I should reuse.

The subagent explores in separate context and reports back a summary, keeping your main conversation clean.

5.4 Auto Mode for Uninterrupted Execution

claude --permission-mode auto -p "fix all lint errors"

A classifier model reviews commands and blocks only risky operations. Best for trusted tasks where you don't want to click through approvals.

5.5 Skills for Repeatable Workflows

Create .claude/skills/fix-issue/SKILL.md:

---
name: fix-issue
description: Fix a GitHub issue
disable-model-invocation: true
---
Analyze and fix the GitHub issue: $ARGUMENTS.
1. Use `gh issue view` to get details
2. Search codebase for relevant files
3. Implement fix
4. Write and run tests
5. Create PR

Invoke with /fix-issue 1234.

5.6 Hooks for Guaranteed Behaviors

Unlike CLAUDE.md instructions (advisory), hooks are deterministic:

  • Post-edit formatting that never gets skipped
  • Notification when Claude needs attention
  • Blocking writes to protected directories
  • Auto-running tests after changes
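Blocking writes to protected directories is a good fit for a deterministic check. The sketch below shows one way such a guard might look for Edit/Write calls. It rests on assumptions that should be verified against the hooks documentation: that the event JSON exposes the target under tool_input.file_path and that exit code 2 blocks the call. The protected locations and all function names are examples only.

```shell
# Sketch of a PreToolUse guard for Edit/Write calls.
# Assumptions: the payload has tool_input.file_path, exit code 2 blocks the
# call, and the protected locations below are illustrative.

# Pull a "file_path" string out of the raw JSON. This sed pattern only
# handles simple payloads without escaped quotes; prefer a JSON parser.
extract_file_path() {
  printf '%s' "$1" |
    sed -n 's/.*"file_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Return 0 when the path falls under a protected location.
is_protected() {
  case $1 in
    .git/*|*/.git/*|secrets/*|*/secrets/*|.env|*/.env) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 2 (block) for protected paths, 0 (allow) otherwise.
guard_write() {
  local path
  path=$(extract_file_path "$1")
  if [ -n "$path" ] && is_protected "$path"; then
    echo "Blocked: $path is in a protected location" >&2
    return 2
  fi
  return 0
}
```

Unlike a line in CLAUDE.md asking the agent to stay out of these paths, this check runs on every matching tool call and does not depend on the model remembering the rule.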


6. Comparison Matrix

| Feature/Capability | Claude Code | Codex CLI |
|--------------------|-------------|-----------|
| Runtime | Node.js binary | Rust binary |
| License | Proprietary (Anthropic subscription) | Apache-2.0 (open source) |
| Models | Claude Opus, Sonnet, Haiku (4.5-4.7) | GPT-4o, o3, o4-mini |
| IDE Integration | VS Code, JetBrains | VS Code, Cursor, Windsurf |
| Desktop App | Yes (macOS, Windows) | Yes |
| Web/Cloud Version | Yes (claude.ai/code) | Yes (chatgpt.com/codex) |
| Mobile Access | iOS app, Remote Control | ChatGPT mobile |
| Hooks System | 5 hook types, 8+ events | Not documented publicly |
| Agent SDK | Python + TypeScript SDK | Not available (open source repo) |
| Multi-Agent Teams | Native (experimental) | Not available |
| Subagents | Built-in with isolation | Not documented |
| MCP Support | Full (connectors, tools, resources) | Not documented |
| Scheduled Routines | Yes (cloud infrastructure) | Not available |
| CI/CD Integration | GitHub Actions, GitLab CI | GitHub integration |
| Worktree Support | Built-in (--worktree flag) | Manual |
| Plan Mode | Yes (Shift+Tab toggle) | Not documented |
| Permission Modes | Default, Auto, Plan, custom rules | Sandboxed execution |
| Context Management | /clear, /compact, /rewind, subagents | Not documented in detail |
| Session Persistence | Full (resume, fork, named sessions) | Not documented |
| Non-interactive Mode | claude -p with output formats | Available |
| Pipe/Unix Composability | Full (stdin/stdout, pipes) | Available |
| Plugin System | Marketplace with /plugin | Not available |
| Custom Commands/Skills | .claude/skills/ with SKILL.md | Not documented |
| Memory System | CLAUDE.md + auto memory | Not documented |
| Provider Flexibility | Anthropic, Bedrock, Vertex, Foundry | OpenAI API only |
| Pricing | Subscription (Pro/Max/Team/Enterprise) or API | ChatGPT subscription or API |
| GitHub Integration | @claude in PRs, code review, routines | Repository cloning, PRs |

7. Workflow Anti-patterns

7.1 The Kitchen Sink Session

Problem: Starting with one task, asking something unrelated, going back to first task. Context fills with irrelevant information.

Fix: /clear between unrelated tasks. Treat sessions as single-purpose.

7.2 Correction Spiral

Problem: Claude does something wrong, you correct it, still wrong, correct again. Context polluted with failed approaches.

Fix: After two corrections, /clear and write a better initial prompt with what you learned.

7.3 Over-specified CLAUDE.md

Problem: Too many instructions cause Claude to ignore the important ones.

Fix: Ruthlessly prune. If Claude already does something correctly without the instruction, delete it. Convert deterministic rules to hooks instead.

7.4 Trust-then-Verify Gap

Problem: Agent produces plausible code that doesn't handle edge cases.

Fix: Always provide verification (tests, scripts, screenshots). If you can't verify it, don't ship it.

7.5 Infinite Exploration

Problem: Asking to "investigate" without scoping. Agent reads hundreds of files, filling context.

Fix: Scope investigations narrowly or use subagents so exploration doesn't consume main context.

7.6 Premature Implementation

Problem: Jumping straight to coding without understanding the problem.

Fix: Use Plan Mode to separate exploration from execution. Research first, then implement.

7.7 Ignoring Context Costs

Problem: Not tracking how much context is being consumed.

Fix: Use custom status lines to monitor context usage. Run /compact proactively. Use subagents for research.

7.8 Monolithic Sessions for Parallel Work

Problem: Trying to do everything in one session when tasks are independent.

Fix: Use worktrees, agent teams, or multiple terminal sessions. Fan out with claude -p loops.


Key Takeaways

  1. Context window management is the meta-game. Every other optimization serves this goal. Clear aggressively, delegate to subagents, and keep sessions focused.

  2. Verification is the highest-leverage investment. Agents with clear success criteria dramatically outperform agents without them. Tests > screenshots > manual review.

  3. Hooks beat instructions for deterministic behavior. CLAUDE.md is advisory; hooks are guaranteed. Use hooks for formatting, linting, notifications, and safety gates.

  4. Parallel execution multiplies throughput. Worktrees, agent teams, fan-out scripts, and writer/reviewer patterns all enable horizontal scaling.

  5. Claude Code has a significantly deeper feature set for workflow automation (hooks, routines, SDK, teams, MCP), while Codex CLI's open-source nature and Rust implementation offer different tradeoffs.

  6. The ecosystem is maturing rapidly. Plugins, skills, routines, and the Agent SDK are creating an extensibility layer that goes beyond simple chat-with-code.


Sources