Let Agent S2 control a virtual desktop
This guide walks through setting up Agent S2, the open-source state-of-the-art (SOTA) computer use agent from Simular AI. You can run it locally on your own computer or on a virtual desktop through Orgo.
Install the required packages:
Set up your API keys:
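
Before launching the agent, it can help to confirm the keys are actually visible to Python. The following is a minimal sketch, not part of Agent S2 itself: the helper name `missing_keys` is hypothetical, and it assumes the default models from the configuration table (an OpenAI reasoning model and an Anthropic grounding model), so both provider keys are treated as required, with `ORGO_API_KEY` required only when `USE_CLOUD_ENVIRONMENT` is `true`.

```python
import os

# Key names come from the configuration table in this guide.
BASE_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
REMOTE_KEYS = ("ORGO_API_KEY",)


def missing_keys(env=None):
    """Return the names of expected API keys that are unset or empty.

    Hypothetical helper for illustration; Agent S2 does not ship it.
    """
    env = os.environ if env is None else env
    required = list(BASE_KEYS)
    # ORGO_API_KEY is only needed when remote execution is enabled.
    if env.get("USE_CLOUD_ENVIRONMENT", "false").strip().lower() == "true":
        required.extend(REMOTE_KEYS)
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All expected API keys are set.")
```

If you switch `AGENT_MODEL` or `GROUNDING_MODEL` to a different provider, adjust `BASE_KEYS` accordingly.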
Run Agent S2 with natural language commands:
This approach uses Agent S2’s compositional framework to execute complex computer use tasks.
macOS: Grant Terminal access under System Settings → Privacy & Security → Accessibility
Windows: May require running Terminal as Administrator
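
If you script the local setup, the permission notes above can be surfaced automatically. This is only a sketch: `permission_hint` is a hypothetical helper, and it assumes the Accessibility note applies to macOS and the Administrator note to Windows. It uses the standard library's `platform.system()`, which returns `"Darwin"` on macOS.

```python
import platform


def permission_hint(system=None):
    """Return the permission note from this guide for the given OS name."""
    system = platform.system() if system is None else system
    if system == "Darwin":  # macOS
        return ("Grant Terminal access: System Settings → "
                "Privacy & Security → Accessibility")
    if system == "Windows":
        return "May require running Terminal as Administrator"
    # The guide lists no extra permission step for other platforms.
    return "No extra permission step is listed in this guide"


if __name__ == "__main__":
    print(permission_hint())
```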
Install dependencies:
Variable | Default | Description |
---|---|---|
OPENAI_API_KEY | - | OpenAI API key |
ANTHROPIC_API_KEY | - | Anthropic API key |
ORGO_API_KEY | - | Orgo API key (remote mode) |
USE_CLOUD_ENVIRONMENT | false | Set to true for remote execution |
AGENT_MODEL | gpt-4o | Main reasoning model |
GROUNDING_MODEL | claude-3-7-sonnet-20250219 | Visual grounding model |
MAX_STEPS | 10 | Maximum steps per task |
STEP_DELAY | 0.5 | Seconds between actions |
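
To make the defaults in the table concrete, here is a minimal sketch of reading these variables from the environment with type conversion. The `Settings` class and `load_settings` function are hypothetical names used for illustration; the guide's actual configuration code may be structured differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    use_cloud_environment: bool
    agent_model: str
    grounding_model: str
    max_steps: int
    step_delay: float


def load_settings(env=None):
    """Build Settings from environment variables, using the table's defaults."""
    env = os.environ if env is None else env
    cloud_flag = env.get("USE_CLOUD_ENVIRONMENT", "false").strip().lower()
    return Settings(
        use_cloud_environment=(cloud_flag == "true"),
        agent_model=env.get("AGENT_MODEL", "gpt-4o"),
        grounding_model=env.get("GROUNDING_MODEL", "claude-3-7-sonnet-20250219"),
        max_steps=int(env.get("MAX_STEPS", "10")),      # steps per task
        step_delay=float(env.get("STEP_DELAY", "0.5")),  # seconds between actions
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to test, for example `load_settings({"MAX_STEPS": "25"})`.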
Agent S2 uses a compositional framework with specialized modules:
Mixture of Grounding - Routes actions to specialized visual grounding models for precise UI localization
Proactive Hierarchical Planning - Dynamically refines plans based on evolving observations
Cross-platform Support - Works on macOS, Windows, and Linux
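
The idea of refining a plan as observations change can be illustrated with a toy loop. This is not Agent S2's implementation: `run_with_replanning`, `make_toy_env`, and the string-based observations are all hypothetical, and the real agent works from screenshots and model calls. The sketch only shows the control flow of replanning after every action, bounded by a step limit like `MAX_STEPS`.

```python
from typing import Callable, List, Tuple


def run_with_replanning(
    plan: Callable[[str], List[str]],
    act: Callable[[str], str],
    first_observation: str,
    max_steps: int = 10,
) -> List[str]:
    """Execute one subgoal at a time, rebuilding the plan after each action."""
    observation = first_observation
    history: List[str] = []
    for _ in range(max_steps):
        subgoals = plan(observation)  # refresh the plan from the newest observation
        if not subgoals:              # nothing left to do
            break
        observation = act(subgoals[0])
        history.append(subgoals[0])
    return history


def make_toy_env() -> Tuple[Callable[[str], List[str]], Callable[[str], str]]:
    """Toy planner and actor: the observation is a comma-separated list of finished steps."""
    steps = ["open editor", "type text", "save file"]
    finished: List[str] = []

    def plan(observation: str) -> List[str]:
        done = observation.split(",") if observation else []
        return [step for step in steps if step not in done]

    def act(subgoal: str) -> str:
        finished.append(subgoal)
        return ",".join(finished)

    return plan, act


if __name__ == "__main__":
    plan, act = make_toy_env()
    print(run_with_replanning(plan, act, ""))
```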
Agent S2 achieves state-of-the-art results on computer use benchmarks:
Benchmark | Success Rate | Rank |
---|---|---|
OSWorld | 27.0% | #3 |
WindowsAgentArena | 29.8% | #1 |
AndroidWorld | 54.3% | #1 |
Agent S2 is currently ranked #3 on the OSWorld benchmark, demonstrating leading performance on complex computer use tasks.
A video version of this guide is also available; you can follow either the video tutorial or this written guide.