Actions are the atomic operations of an automation. Each action performs one specific task, such as clicking a button, typing text, or finding an image on screen. Actions are composed into workflows and executed during state transitions.
Actions fall into seven categories by purpose: Find, Mouse, Keyboard, Control Flow, Data, State, and Code. Each category groups related operations so they are easy to find in the workflow builder.
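As a rough illustration of the grouping described above, the sketch below models the seven categories as an enum and groups a few actions by category. The `Action` fields, the action names, and `group_by_category` are hypothetical and are not taken from the product's schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionCategory(Enum):
    """The seven categories named in the text."""
    FIND = "Find"
    MOUSE = "Mouse"
    KEYBOARD = "Keyboard"
    CONTROL_FLOW = "Control Flow"
    DATA = "Data"
    STATE = "State"
    CODE = "Code"


@dataclass
class Action:
    # Hypothetical fields; the real action schema may differ.
    name: str
    category: ActionCategory
    params: dict = field(default_factory=dict)


def group_by_category(actions):
    """Group action names by category, as a builder palette might."""
    groups = {category: [] for category in ActionCategory}
    for action in actions:
        groups[action.category].append(action.name)
    return groups


# Illustrative workflow: three actions from three categories.
workflow = [
    Action("find_login_button", ActionCategory.FIND),
    Action("click_login_button", ActionCategory.MOUSE),
    Action("type_username", ActionCategory.KEYBOARD, {"text": "alice"}),
]
groups = group_by_category(workflow)
```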
Find: image matching and element detection on screen.
Find element on screen using image matching
Wait for element to disappear from screen
Find element using AI embeddings (RAG)
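To make the find and wait-for-disappearance behaviors above concrete, here is a toy sketch that treats the screen as a 2D list of pixel values. It uses exact matching only, whereas real image matching normally tolerates small differences through a similarity threshold. `find_template`, `wait_until_vanished`, and the `capture` callable are hypothetical names, not the product's API.

```python
import time


def find_template(screen, template):
    """Return (row, col) of the first exact match of template, or None."""
    rows, cols = len(screen), len(screen[0])
    t_rows, t_cols = len(template), len(template[0])
    for r in range(rows - t_rows + 1):
        for c in range(cols - t_cols + 1):
            if all(
                screen[r + dr][c:c + t_cols] == template[dr]
                for dr in range(t_rows)
            ):
                return (r, c)
    return None


def wait_until_vanished(capture, template, timeout=5.0, interval=0.1):
    """Poll capture() until template is no longer found.

    Returns True once the template has vanished, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if find_template(capture(), template) is None:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Illustrative data: a 3x4 "screen" containing a 2x2 "element".
screen = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
]
blank = [[0, 0, 0, 0] for _ in range(3)]
template = [[1, 2], [3, 4]]

location = find_template(screen, template)

# Simulate an element that disappears on the third capture.
frames = iter([screen, screen, blank])
vanished = wait_until_vanished(lambda: next(frames), template, interval=0)
```

Polling against a deadline keeps the wait bounded, so a workflow can branch on the result instead of hanging when the element never disappears.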
Mouse: mouse interactions and movements.
Click on element (left, right, middle, or double)
Move mouse cursor to position
Press mouse button down
Release mouse button
Drag element from one position to another
Scroll page or element
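One way to see how the mouse actions above relate is to express a drag in terms of the lower-level ones: button down, a series of moves, then button up. The sketch below generates that event sequence with linearly interpolated positions. The event names and the `drag_events` helper are assumptions for illustration; the actual drag action may work differently.

```python
def drag_events(start, end, steps=4):
    """Expand a drag into mouse_down, intermediate moves, and mouse_up."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    events = [("mouse_down", x0, y0)]
    for i in range(1, steps + 1):
        t = i / steps  # fraction of the path covered so far
        x = round(x0 + (x1 - x0) * t)
        y = round(y0 + (y1 - y0) * t)
        events.append(("mouse_move", x, y))
    events.append(("mouse_up", x1, y1))
    return events


events = drag_events((0, 0), (100, 40), steps=4)
```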
Keyboard: keyboard input and shortcuts.
Type text into an element
Press a single key
Press key down (hold)
Release key
Execute keyboard shortcut combination
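A keyboard shortcut can be thought of as a composition of the key-down and key-up actions listed above: hold the modifiers, tap the main key, then release the modifiers in reverse order. The sketch below expands a combination string into that sequence. The `ctrl+shift+s` string format, the modifier set, and `shortcut_events` are assumptions made for this example.

```python
# Assumed modifier names; a real tool may use a different set.
MODIFIERS = {"ctrl", "shift", "alt", "meta"}


def shortcut_events(combo):
    """Expand a combo such as 'ctrl+shift+s' into key_down/key_up events."""
    keys = [part.strip().lower() for part in combo.split("+") if part.strip()]
    if not keys:
        raise ValueError("empty shortcut")
    modifiers = [k for k in keys if k in MODIFIERS]
    others = [k for k in keys if k not in MODIFIERS]

    events = [("key_down", k) for k in modifiers]    # hold modifiers
    for k in others:                                 # tap each main key
        events.append(("key_down", k))
        events.append(("key_up", k))
    events.extend(("key_up", k) for k in reversed(modifiers))  # release
    return events


save_as = shortcut_events("Ctrl+Shift+S")
```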
Control Flow: conditional logic and loops.
Conditional branching based on condition
Repeat actions multiple times
Exit from loop early
Skip to next loop iteration
Multi-way branching based on value
Error handling and recovery
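The control-flow actions above correspond closely to familiar programming constructs. As a loose analogy only (not how the workflow engine implements them), the Python sketch below uses each one once: a loop, a conditional, an early exit, a skip, a multi-way branch, and error handling. The `process` function and its inputs are invented for the example.

```python
def process(items, limit=3):
    """Walk items, using each control-flow construct once."""
    labels = {0: "zero", 1: "one"}         # table for the multi-way branch
    results = []
    for item in items:                     # loop: repeat for each item
        if item is None:                   # conditional branch
            continue                       # skip to next iteration
        if len(results) >= limit:
            break                          # exit the loop early
        try:
            number = int(item)             # may raise ValueError
        except ValueError:                 # error handling and recovery
            results.append("invalid")
            continue
        # multi-way branch on the value, with a default case
        results.append(labels.get(number, "many"))
    return results


outcome = process(["0", None, "x", "1", "7", "2"])
```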
Data: variables and data operations.
Store a value in a variable
Retrieve a variable value
Sort array or list of items
Filter array based on condition
Transform each item in array
Reduce array to single value
Manipulate text strings
Perform mathematical calculations
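For readers who think in code, the data operations above map onto standard Python building blocks. The sketch below chains them on a small list. The `variables` dict stands in for the workflow's variable store and is an assumption, as are the sample values.

```python
import math
from functools import reduce

variables = {}                                   # stand-in variable store
variables["scores"] = [7, 2, 9, 4]               # store a value
scores = variables["scores"]                     # retrieve it

ordered = sorted(scores)                         # sort
passing = [s for s in scores if s >= 4]          # filter by condition
squared = [s * s for s in passing]               # transform each item
total = reduce(lambda acc, s: acc + s, squared, 0)  # reduce to one value
label = f"total={total}".upper()                 # string manipulation
average = math.floor(total / len(squared))       # math calculation
```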
State: state management and process control.
Navigate to a different workflow state
Execute another workflow
Capture screen or region
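Since actions run during state transitions, navigating to another state can be pictured as looking up the transition from the current state and running its actions in order. The sketch below is a minimal model of that idea. The `Workflow` class, its methods, and the state names are hypothetical and only illustrate the concept.

```python
class Workflow:
    """Toy state machine: each transition carries a list of actions."""

    def __init__(self, initial_state):
        self.state = initial_state
        self.transitions = {}   # (source, target) -> list of callables
        self.log = []           # results of executed actions

    def add_transition(self, source, target, actions):
        self.transitions[(source, target)] = actions

    def go_to(self, target):
        """Run the transition's actions, then update the current state."""
        key = (self.state, target)
        if key not in self.transitions:
            raise ValueError(f"no transition from {self.state!r} to {target!r}")
        for action in self.transitions[key]:
            self.log.append(action())
        self.state = target


wf = Workflow("login")
wf.add_transition(
    "login",
    "dashboard",
    [lambda: "type credentials", lambda: "click submit"],
)
wf.go_to("dashboard")
```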
Code: Python code execution, shell commands, and AI automation.
Execute inline Python code with access to workflow context
Execute pre-registered custom Python function
Execute a shell command and capture output
Execute a multi-line shell script
Execute an AI prompt with context isolation
Learn more: See the AI Actions documentation for detailed information on AI-powered automation.
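To give a feel for the Python and command execution actions in the Code category, the sketch below runs a command and captures its output with the standard `subprocess` module, and runs an inline snippet with a context dictionary in scope. `run_command`, `run_inline`, and the `context` variable name are assumptions; the product's actual interface is not shown in this section. The example passes an argument list rather than a shell string to stay portable across platforms.

```python
import subprocess
import sys


def run_command(args, timeout=10):
    """Run a command (no shell) and return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout, completed.stderr


def run_inline(code, context):
    """Execute inline Python with `context` in scope; return the context.

    exec() runs arbitrary code, so this is only suitable for trusted input.
    """
    namespace = {"context": context}
    exec(code, namespace)
    return context


# Use the current interpreter so the example works on any platform.
exit_code, out, err = run_command([sys.executable, "-c", "print('hello')"])

result = run_inline("context['count'] = context['count'] + 1", {"count": 1})
```

Capturing the exit code alongside the output lets a later control-flow action decide whether to continue or recover.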