Get Your Own Agent

Download and run your own autonomous AI agent. The setup script installs everything for you.

ai-do-agent-v4.1.0.zip · Setup script handles everything else

Quick Start (2 minutes)

macOS / Linux

Open Terminal and paste this:

curl -fsSL https://ai-do-site.pages.dev/ai-do-agent-v4.1.0.zip -o ai-do-agent.zip
unzip ai-do-agent.zip -d my-agent
cd my-agent
./setup.sh

The setup script installs Node.js and your chosen AI tool, then walks you through authentication. Just follow the prompts.

Windows

Download the zip above, extract it, then:

:: Open Command Prompt in the extracted folder, then run:
setup.bat
:: In PowerShell, run .\setup.bat instead (PowerShell does not run scripts from the current folder by bare name)

On Windows, the script will open the Node.js installer page if needed. After installing Node.js, re-run setup.bat to continue.

What the Setup Does

The setup script handles everything in 5 guided steps:

Installs Node.js — auto-detects your system and installs it (or opens the download page on Windows)
Lets you pick an AI — choose Claude, Codex, Gemini, or Copilot (Claude, Codex, and Gemini are free to start; Copilot requires an active subscription)
Installs the AI tool — downloads and sets up the CLI for your chosen AI
Authenticates — opens a browser to sign in with your existing account (see details below)
Configures your agent — name it, set the port, and you're done

Authentication — By AI Provider

Claude (Recommended)

Account: Sign up free at claude.ai, or use an existing Claude account
What happens: A browser window opens — sign in, and you're authenticated
Cost: Free tier available. Claude Pro ($20/mo) gives more usage

Codex (by OpenAI)

Account: Use your existing ChatGPT account, or create one at chatgpt.com
What happens: The CLI runs codex login — choose "Log in with ChatGPT" or paste an API key
Cost: Free with ChatGPT account. API key gives access to more models

Gemini (by Google)

Account: Any Google account works (Gmail, Google Workspace, etc.)
What happens: A browser window opens — sign in with your Google account
Cost: Free tier available with generous usage limits

Copilot (by GitHub)

Account: GitHub account with an active Copilot subscription (Individual, Business, or Enterprise)
What happens: The CLI opens — type /login, authenticate in browser, then /exit
Cost: Included with Copilot Individual ($10/mo), Business, or Enterprise plans

What You Get

Core Agent

Autonomous 5-minute loop — thinks, creates, and reflects without you
4 LLM backends — Claude, Codex (OpenAI), Gemini (Google), Copilot (GitHub) — switch anytime
Modular skill system — build pages, write docs, journal, analyze, ingest, research, and more
Self-improving — generates new skills and modifies its own code (with safeguards)
Task scheduler — concurrent user and autonomy task pools with file-lock protection
Auto-update — run ./update.sh to get the latest version

Knowledge & Memory

Knowledge graph — entity/relation graph with FTS5 search, auto-extracted from conversations
Universal ingestion — feed it files (.md, .json, .csv, .pdf, .docx), URLs, or raw text
RAG-style context — knowledge graph context injected into every LLM prompt
Persistent memory — key-value store with categories, survives restarts
Evolution system — tracks goals, projects, capabilities, quality scores

Cognitive & Multimodal

Planner — decomposes objectives into step plans with dependency ordering
Thinker — deep reasoning with chain-of-thought and multi-perspective discussion
Reflector — periodic self-assessment reviewing quality, suggesting improvements
Voice activation — wake word detection (offline ONNX), speech-to-text, text-to-speech
Vision — image analysis via vision LLM with metadata fallback

Web UI (v4.0)

Glassmorphic design — modern dark/light theme with component-driven architecture
SSE streaming — real-time chat with incremental rendering
Persona system — customize name, avatar, personality traits (creativity, verbosity, formality, humor)
Function palette — categorized skill cards with parameter input and recent executions
Activity feed — real-time events with expand/collapse and type-colored icons
Knowledge explorer — search entities, view relations, type badges
Workspace manager — create, switch, delete workspaces with file browser
Voice orb — 4 visual states (idle/listening/processing/speaking) with audio level reactivity
Settings — 7 sections: General, LLM, Appearance, Voice, Ingestion, Cognitive, Advanced
Desktop app — Electron wrapper with tray menu, hotkeys (Cmd+Shift+Space), clipboard ingest

After Setup — Running Your Agent

npm start

That's it! Your agent starts thinking. Open the web UI link shown in the terminal to chat with it and watch it create.

Commands

npm start          # Start your agent
./reset.sh         # Reset to clean state (keeps config)
./reset.sh --hard  # Full reset (removes everything)
./update.sh        # Check for and install updates
./watchdog.sh      # Auto-restart if crashed (add to cron)

On Windows, use update.bat instead of ./update.sh.
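
The watchdog entry above says to add it to cron. As a rough sketch, the snippet below prints a crontab line that runs the script every 5 minutes. The interval, the watchdog.log file name, and the $HOME/my-agent install path are assumptions for illustration, and watchdog_cron_line is a hypothetical helper, not something shipped in the zip.

```shell
#!/bin/sh
# Print a crontab line that runs watchdog.sh every 5 minutes.
# Assumptions (not from the project): the 5-minute interval, the log file
# name, and that watchdog.sh is safe to invoke repeatedly.
watchdog_cron_line() {
  agent_dir=$1
  printf '*/5 * * * * cd "%s" && ./watchdog.sh >> watchdog.log 2>&1\n' "$agent_dir"
}

# Example: the folder created by the quick-start commands.
watchdog_cron_line "$HOME/my-agent"
```

One common way to install the line while keeping existing entries is (crontab -l 2>/dev/null; watchdog_cron_line "$HOME/my-agent") | crontab -. Running that twice adds the line twice, so check crontab -l first.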

How It Works

Every 5 minutes, your agent wakes up and:

Processes chat messages — detects skills, extracts entities, injects knowledge graph context
Monitors system health (memory, CPU, uptime)
Processes the ingestion queue — files, URLs, and text into structured knowledge
Runs idle thinking if no active tasks — consults the evolution graph to pick what to create
Creates something: an interactive page, a document, code, or a journal reflection
Runs cognitive reflection — self-assessment, quality review, learning

The knowledge graph and evolution system work together so your agent builds on what it has already done. Entities are auto-extracted from every conversation and creation, building a persistent map of knowledge that helps the agent avoid repeating itself and guides what it explores next.
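
The six steps listed above can be pictured as a simple loop. The sketch below is illustrative only: every function is a placeholder that prints a label, has_active_tasks is hypothetical, and none of it reflects how the agent is actually implemented. The demo call runs two cycles with no pause; the documented 5-minute interval would correspond to 300 seconds.

```shell
#!/bin/sh
# Placeholder steps: each one only prints a label. They mirror the list
# above and are not the agent's real code.
process_chat()      { echo "chat"; }
check_health()      { echo "health"; }
process_ingestion() { echo "ingest"; }
idle_think()        { echo "think"; }
create_something()  { echo "create"; }
reflect()           { echo "reflect"; }

# Hypothetical check; returns non-zero to mean "no active tasks".
has_active_tasks()  { return 1; }

# One wake-up: run the steps in the documented order.
run_cycle() {
  process_chat
  check_health
  process_ingestion
  if ! has_active_tasks; then
    idle_think          # only when nothing else is queued
  fi
  create_something
  reflect
}

# Run a fixed number of cycles, pausing between them.
run_loop() {
  cycles=$1
  interval=$2
  i=0
  while [ "$i" -lt "$cycles" ]; do
    run_cycle
    i=$((i + 1))
    if [ "$i" -lt "$cycles" ]; then
      sleep "$interval"
    fi
  done
  return 0
}

run_loop 2 0   # demo: two cycles, no pause (the agent's interval would be 300)
```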

Troubleshooting

Node.js not found after install

Close your terminal and open a new one, then re-run the setup script. A terminal that was already open does not see the updated PATH, so a newly installed Node.js is only detected in a fresh session.

Permission errors on Mac/Linux

If npm install -g fails with a permission (EACCES) error, try: sudo npm install -g @anthropic-ai/claude-code (replace the package name with the one for your chosen CLI).
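
If you would rather avoid sudo, npm can also be pointed at a global directory you own. The sketch below assumes ~/.npm-global as that directory (any writable path should work); path_with_prefix_bin and use_user_npm_prefix are hypothetical helper names. Nothing is changed until you call use_user_npm_prefix yourself.

```shell
#!/bin/sh
# Sketch of a sudo-free alternative: point npm's global prefix at a
# directory you own. The directory name is an assumption.
NPM_PREFIX_DIR="${NPM_PREFIX_DIR:-$HOME/.npm-global}"

# Prepend <prefix>/bin to a PATH string unless it is already present.
#   $1 = prefix directory, $2 = current PATH value
path_with_prefix_bin() {
  case ":$2:" in
    *":$1/bin:"*) printf '%s\n' "$2" ;;
    *)            printf '%s\n' "$1/bin:$2" ;;
  esac
}

# Apply the change. Not called automatically, because it edits npm's config.
use_user_npm_prefix() {
  mkdir -p "$NPM_PREFIX_DIR"
  npm config set prefix "$NPM_PREFIX_DIR"
  PATH=$(path_with_prefix_bin "$NPM_PREFIX_DIR" "$PATH")
  export PATH
}
```

After calling use_user_npm_prefix, re-run the npm install -g command without sudo. To keep the change in new terminals, add the matching export PATH line to your shell profile.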

Agent won't start

Make sure you completed the authentication step. Try running your AI CLI directly (e.g., claude, codex, or gemini) to verify it works before starting the agent.
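
Before starting the agent, a quick check like the one below can show whether Node.js and an AI CLI are visible on your PATH. It is a sketch and not part of the download: found_on_path is a hypothetical helper, and it only confirms that the commands exist, not that you are signed in.

```shell
#!/bin/sh
# Print each of the given commands that is found on PATH, one per line.
found_on_path() {
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      printf '%s\n' "$name"
    fi
  done
}

# Command names taken from the text above; Copilot's command is not listed there.
clis=$(found_on_path claude codex gemini)

if [ -z "$(found_on_path node)" ]; then
  echo "node: not on PATH (open a new terminal, then re-run setup)"
else
  echo "node: found"
fi

if [ -z "$clis" ]; then
  echo "AI CLI: none of claude, codex, gemini found on PATH"
else
  echo "AI CLI found: $(echo "$clis" | tr '\n' ' ')"
fi
```

If a CLI is listed but the agent still fails, run that CLI by itself once to confirm the sign-in completed.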