Overview
OpenAI’s Computer Use lets AI agents control computer interfaces through the Responses API. This guide shows how to use it with Orgo’s virtual desktops.Quick Start
1
Install packages
2
Set up API keys
3
Run your first task
Complete Example
Here’s a full working example that handles the complete agent loop:Usage Examples
Basic Tasks
Complex Workflows
Reusing Sessions
Key Concepts
The Agent Loop
OpenAI Computer Use works in a continuous loop:- Request → Send task to the model
- Action → Model suggests an action (click, type, etc.)
- Execute → Your code executes the action
- Screenshot → Capture the result
- Repeat → Continue until task is complete
Action Types
Action | Description | Example |
---|---|---|
click | Click at coordinates | Click button at (100, 200) |
double_click | Double-click | Open desktop icon |
type | Type text | Enter username |
key | Press key(s) | Press Enter, Ctrl+C |
scroll | Scroll page | Scroll down 3 units |
wait | Pause execution | Wait 2 seconds |
screenshot | Take screenshot | Capture current state |
Safety Features
OpenAI includes safety checks to prevent misuse:Best Practices
1. Clear Instructions
2. Error Handling
3. Session Management
4. Timing Considerations
Comparison with Claude
Feature | OpenAI Computer Use | Claude Computer Use |
---|---|---|
API | Responses API | Messages API |
Model | computer-use-preview | claude-4-sonnet |
Beta Tag | Built-in | computer-use-2025-01-24 |
Reasoning | Optional summaries | Thinking blocks |
Environment | Multiple (browser, OS) | Single tool definition |
Limitations
- Beta Status: Computer Use is in beta and may have unexpected behaviors
- Rate Limits: The model has constrained rate limits
- Accuracy: ~38% success rate on complex OS tasks
- Environment: Best suited for browser-based tasks