The Desktop resource provides desktop automation capabilities for controlling GUI applications, taking screenshots, and managing windows.

Note: Desktop automation requires a template with desktop dependencies installed.

Accessing Desktop Resource

from hopx_ai import Sandbox

sandbox = Sandbox.create(template="desktop")  # Requires desktop template
desktop = sandbox.desktop  # Lazy-loaded resource

VNC Server

start_vnc()

Start VNC server for remote desktop access.
vnc_info = sandbox.desktop.start_vnc()
print(f"VNC URL: {vnc_info.url}")
print(f"Display: {vnc_info.display}")
Method Signature:
def start_vnc(
    self,
    display: int = 1,
    password: Optional[str] = None
) -> VNCInfo
Returns: VNCInfo object with URL, port, and display information
Example:
# Start VNC with default settings
vnc = sandbox.desktop.start_vnc()
print(f"Connect at: {vnc.url}")

# Start with password
vnc = sandbox.desktop.start_vnc(password="mypassword")

stop_vnc()

Stop VNC server.
sandbox.desktop.stop_vnc()

get_vnc_status()

Get VNC server status.
vnc = sandbox.desktop.get_vnc_status()
if vnc.running:
    print(f"VNC running at {vnc.url}")

Mouse Control

click()

Click at coordinates.
sandbox.desktop.click(100, 200)
sandbox.desktop.click(500, 300, button='right', clicks=2)
Method Signature:
def click(
    self,
    x: int,
    y: int,
    *,
    button: str = "left",
    clicks: int = 1
) -> None
Parameters:
  • x, y (int): Screen coordinates
  • button (str): Mouse button - "left", "right", or "middle" (default: "left")
  • clicks (int): Number of clicks (default: 1)
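
Because button is passed as a plain string, a typo may only surface once the call reaches the sandbox. The following is a minimal sketch of a client-side check, assuming the three button names listed above are the only accepted values. checked_click and RecordingDesktop are hypothetical names used for illustration and are not part of the SDK; RecordingDesktop only stands in for sandbox.desktop so the snippet runs without a sandbox.

```python
VALID_BUTTONS = ("left", "right", "middle")  # taken from the parameter list above


def checked_click(desktop, x, y, *, button="left", clicks=1):
    """Validate arguments locally, then forward them to desktop.click()."""
    if button not in VALID_BUTTONS:
        raise ValueError(f"button must be one of {VALID_BUTTONS}, got {button!r}")
    if clicks < 1:
        raise ValueError("clicks must be at least 1")
    desktop.click(x, y, button=button, clicks=clicks)


class RecordingDesktop:
    """Stand-in for sandbox.desktop that records calls (illustration only)."""

    def __init__(self):
        self.calls = []

    def click(self, x, y, *, button="left", clicks=1):
        self.calls.append((x, y, button, clicks))


fake = RecordingDesktop()
checked_click(fake, 500, 300, button="right", clicks=2)
```

With a real sandbox, the same helper would be called as checked_click(sandbox.desktop, 500, 300, button="right", clicks=2).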

move()

Move mouse to coordinates.
sandbox.desktop.move(500, 300)

drag()

Drag mouse from one point to another.
sandbox.desktop.drag(100, 100, 300, 300)

scroll()

Scroll at coordinates.
sandbox.desktop.scroll(500, 300, delta_y=3)  # Scroll down
sandbox.desktop.scroll(500, 300, delta_y=-3)  # Scroll up

Keyboard Control

type()

Type text.
sandbox.desktop.type("Hello, World!")

press()

Press a key.
sandbox.desktop.press("Return")
sandbox.desktop.press("Control_L+c")  # Ctrl+C

combination()

Press key combination.
sandbox.desktop.combination(["Control_L", "c"])  # Ctrl+C
sandbox.desktop.combination(["Alt", "Tab"])  # Alt+Tab

Clipboard

clipboard_get()

Get clipboard contents.
content = sandbox.desktop.clipboard_get()
print(content)

clipboard_set()

Set clipboard contents.
sandbox.desktop.clipboard_set("Text to copy")

Screenshots

screenshot()

Capture a screenshot of the full screen.
img_bytes = sandbox.desktop.screenshot()
with open('screen.png', 'wb') as f:
    f.write(img_bytes)
Returns: Image bytes (PNG format)
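
Since the method returns raw bytes, it can be worth confirming they are PNG data before writing them to disk. A small sketch that checks the standard 8-byte PNG signature; looks_like_png and save_png are hypothetical helpers, not SDK functions.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte header that starts every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if the bytes begin with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def save_png(data: bytes, path: str) -> None:
    """Write the bytes to path, refusing data that is clearly not a PNG."""
    if not looks_like_png(data):
        raise ValueError("data does not start with a PNG signature")
    with open(path, "wb") as f:
        f.write(data)
```

For example, save_png(sandbox.desktop.screenshot(), "screen.png") would replace the manual open/write above.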

screenshot_region()

Capture a screenshot of a specific region.
img_bytes = sandbox.desktop.screenshot_region(100, 100, 800, 600)
Parameters:
  • x, y (int): Top-left coordinates
  • width, height (int): Region dimensions
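
Note that the last two arguments are a width and a height, not a second corner. When a region is known by two corner points, it has to be converted first. A minimal sketch; region_from_corners is a hypothetical helper, not part of the SDK.

```python
def region_from_corners(x1, y1, x2, y2):
    """Convert two opposite corners into (x, y, width, height).

    The corners may be given in any order; the result always starts
    at the top-left corner.
    """
    x, y = min(x1, x2), min(y1, y2)
    width, height = abs(x2 - x1), abs(y2 - y1)
    return x, y, width, height


region = region_from_corners(100, 100, 900, 700)
```

The result can be unpacked straight into the call: sandbox.desktop.screenshot_region(*region).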

Screen Recording

start_recording()

Start screen recording.
sandbox.desktop.start_recording(output_file="/workspace/recording.mp4")

stop_recording()

Stop screen recording.
recording_info = sandbox.desktop.stop_recording()
print(f"Recording saved: {recording_info.file_path}")

get_recording_status()

Get recording status.
status = sandbox.desktop.get_recording_status()
if status.recording:
    print("Recording in progress")
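
If an automation step raises between start_recording() and stop_recording(), the recording may be left running. One way to guard against that is a context manager that always stops the recording. This is a sketch under the assumption that the two methods behave as shown above; recorded and FakeDesktop are hypothetical names, and FakeDesktop only stands in for sandbox.desktop so the snippet runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def recorded(desktop, output_file):
    """Start a recording and stop it on exit, even if the body raises."""
    desktop.start_recording(output_file=output_file)
    try:
        yield
    finally:
        desktop.stop_recording()


class FakeDesktop:
    """Stand-in for sandbox.desktop that logs calls (illustration only)."""

    def __init__(self):
        self.events = []

    def start_recording(self, output_file):
        self.events.append(("start", output_file))

    def stop_recording(self):
        self.events.append(("stop",))


fake = FakeDesktop()
try:
    with recorded(fake, "/workspace/recording.mp4"):
        raise RuntimeError("automation step failed")
except RuntimeError:
    pass  # the recording has still been stopped at this point
```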

Window Management

list_windows()

List all windows.
windows = sandbox.desktop.list_windows()
for win in windows:
    print(f"{win.title} ({win.window_id})")

focus_window()

Focus a window.
sandbox.desktop.focus_window("window_id_123")
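
Window IDs are usually not known in advance, so in practice the ID comes from list_windows(). A minimal sketch of looking a window up by part of its title, assuming each window object exposes title and window_id as in the listing example above; find_window is a hypothetical helper, and the SimpleNamespace objects are sample data standing in for real window objects.

```python
from types import SimpleNamespace


def find_window(windows, title_part):
    """Return the first window whose title contains title_part (case-insensitive)."""
    needle = title_part.lower()
    for win in windows:
        if needle in win.title.lower():
            return win
    return None


# Sample data in place of sandbox.desktop.list_windows()
windows = [
    SimpleNamespace(title="Terminal", window_id="w1"),
    SimpleNamespace(title="Mozilla Firefox", window_id="w2"),
]
match = find_window(windows, "firefox")
```

With a real sandbox, the match would then be passed on, for example sandbox.desktop.focus_window(match.window_id), after checking that match is not None.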

close_window()

Close a window.
sandbox.desktop.close_window("window_id_123")

Error Handling

Desktop automation requires specific dependencies in the sandbox. If they are not installed, desktop methods raise DesktopNotAvailableError:
from hopx_ai.errors import DesktopNotAvailableError

try:
    sandbox.desktop.click(100, 100)
except DesktopNotAvailableError as e:
    print(f"Desktop not available: {e.message}")
    print(f"Install command: {e.install_command}")

Examples

Basic Automation

# Start VNC
vnc = sandbox.desktop.start_vnc()
print(f"VNC at: {vnc.url}")

# Click button
sandbox.desktop.click(500, 300)

# Type text
sandbox.desktop.type("Hello, World!")
sandbox.desktop.press("Return")

# Screenshot
img = sandbox.desktop.screenshot()
with open('screen.png', 'wb') as f:
    f.write(img)

Form Filling

# Click input field
sandbox.desktop.click(200, 150)

# Type form data
sandbox.desktop.type("John Doe")
sandbox.desktop.press("Tab")
sandbox.desktop.type("john@example.com")
sandbox.desktop.press("Tab")
sandbox.desktop.type("password123")

# Submit
sandbox.desktop.press("Return")
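
The same click, type, Tab pattern can be wrapped in a small function when a form has many fields. This is a sketch, assuming that Tab moves focus to the next field as in the example above; fill_fields and RecordingDesktop are hypothetical names, and RecordingDesktop only stands in for sandbox.desktop so the snippet runs without a sandbox.

```python
def fill_fields(desktop, x, y, values, submit=True):
    """Click the first field at (x, y), type each value, and Tab between fields."""
    desktop.click(x, y)
    for index, value in enumerate(values):
        if index > 0:
            desktop.press("Tab")  # move focus to the next field
        desktop.type(value)
    if submit:
        desktop.press("Return")


class RecordingDesktop:
    """Stand-in for sandbox.desktop that records calls (illustration only)."""

    def __init__(self):
        self.calls = []

    def click(self, x, y):
        self.calls.append(("click", x, y))

    def type(self, text):
        self.calls.append(("type", text))

    def press(self, key):
        self.calls.append(("press", key))


fake = RecordingDesktop()
fill_fields(fake, 200, 150, ["John Doe", "john@example.com", "password123"])
```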