Media Handling
Polyclaw supports sending and receiving files, images, audio, and video through its messaging channels.
Media Classification
The media/classify.py module maintains a MIME type registry that classifies files into categories:
| Category | Examples |
|---|---|
| Image | JPEG, PNG, GIF, WebP, SVG, BMP |
| Audio | MP3, WAV, OGG, FLAC, M4A, AAC |
| Video | MP4, WebM, MOV |
| File | PDF, DOCX, XLSX, ZIP, etc. |
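
The classification above can be sketched as a lookup on the top-level MIME type. This is a minimal illustration, not the contents of media/classify.py: the function names `classify_mime` and `classify_path` are hypothetical, and the real registry may list individual MIME types instead of relying on the type prefix.

```python
import mimetypes
from typing import Optional

# Categories from the table above; anything else falls back to "file".
MEDIA_CATEGORIES = ("image", "audio", "video")


def classify_mime(mime: Optional[str]) -> str:
    """Map a MIME type such as 'image/png' to a media category."""
    if not mime:
        return "file"
    top_level = mime.split("/", 1)[0].lower()
    return top_level if top_level in MEDIA_CATEGORIES else "file"


def classify_path(filename: str) -> str:
    """Guess the MIME type from the filename, then classify it."""
    mime, _encoding = mimetypes.guess_type(filename)
    return classify_mime(mime)
```

With this approach, unknown or missing MIME types degrade to the generic File category instead of raising an error.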
Directory Structure
    ~/.polyclaw/media/
        incoming/   # Downloaded from channels
        outgoing/   # Generated by the agent
        pending/    # Awaiting delivery
        sent/       # Successfully delivered
        error/      # Failed delivery
Incoming Media
When a user sends a file through a messaging channel:
- The Bot Framework SDK provides the attachment metadata
- incoming.py downloads the file from the Bot Framework CDN
- The file is saved to media/incoming/ with its original filename
- A media-aware prompt is built that describes the file to the agent
- For images, the agent can analyze the visual content
- For documents, the content is extracted when possible
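
The save and prompt-building steps above might look roughly like the following. The download itself is omitted, and `save_incoming`, `build_media_prompt`, and the prompt wording are hypothetical names and text chosen for illustration; the actual code in incoming.py may be structured differently.

```python
from pathlib import Path


def save_incoming(data: bytes, original_name: str, media_root: Path) -> Path:
    """Write downloaded bytes to <media_root>/incoming/ under the original name."""
    incoming = media_root / "incoming"
    incoming.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so the name cannot escape incoming/.
    safe_name = Path(original_name).name or "attachment"
    target = incoming / safe_name
    target.write_bytes(data)
    return target


def build_media_prompt(user_text: str, saved: Path, category: str) -> str:
    """Append a short description of the saved file to the user's message."""
    size = saved.stat().st_size
    note = f"[Attached {category}: {saved.name}, {size} bytes, saved at {saved}]"
    return f"{user_text.strip()}\n\n{note}".strip()
```

Stripping directory components from the filename is a defensive choice added in this sketch; the source only states that the original filename is kept.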
Outgoing Media
Outgoing media is handled by two complementary mechanisms.
Inline response attachments (incoming.py::extract_outgoing_attachments): file paths referenced in the agent response text are detected via regex, read from disk, and base64-encoded as inline Attachment objects sent directly with the reply.
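
As a rough sketch of the inline mechanism, the snippet below scans response text for file paths, keeps those that exist on disk, and base64-encodes them into data URIs. The regex, the function name `extract_attachments`, and the use of plain dictionaries in place of Bot Framework Attachment objects are all assumptions made for illustration; the pattern used by extract_outgoing_attachments is not shown in this document.

```python
import base64
import mimetypes
import re
from pathlib import Path
from typing import Dict, List

# Assumed pattern: paths starting with "/" or "~" and ending in a short extension.
PATH_PATTERN = re.compile(r"(?:~|/)[\w./-]*\.\w{1,5}")


def extract_attachments(response_text: str) -> List[Dict[str, str]]:
    """Return one inline attachment dict per existing file named in the text."""
    attachments = []
    seen = set()
    for match in PATH_PATTERN.finditer(response_text):
        path = Path(match.group(0)).expanduser()
        if path in seen or not path.is_file():
            continue  # skip duplicates and strings that only look like paths
        seen.add(path)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        attachments.append({
            "name": path.name,
            "contentType": mime,
            "contentUrl": f"data:{mime};base64,{encoded}",
        })
    return attachments
```

Checking `is_file()` before reading means URLs and other path-like strings in the response are ignored without raising.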
Pending directory pipeline (outgoing.py::collect_pending_outgoing): files written to media/pending/ by agent tools or skills are collected and attached on the next message delivery. The pipeline enforces a 190 KB per-file limit. Images that exceed this limit are automatically downscaled using Pillow (up to six progressive attempts at 75% scale each). Files that cannot be reduced to fit are moved to media/error/ with a .error.txt sidecar explaining the reason. Successfully sent files are moved to media/sent/.
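
The size check, downscale loop, and error sidecar described above can be approximated as follows. This is a sketch under stated assumptions: 190 KB is read as 190 * 1024 bytes, the sidecar is named `<filename>.error.txt`, and the function names `shrink_image`, `collect_pending`, and `mark_sent` are hypothetical. Only `shrink_image` needs Pillow.

```python
import io
from pathlib import Path
from typing import Callable, List, Optional, Tuple

MAX_BYTES = 190 * 1024  # assumption: "190 KB" read as 190 * 1024 bytes
SCALE = 0.75            # each attempt shrinks both dimensions to 75%
MAX_ATTEMPTS = 6
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}


def shrink_image(data: bytes) -> Optional[bytes]:
    """Downscale with Pillow until the encoded image fits, or return None."""
    from PIL import Image  # imported lazily so the rest works without Pillow

    image = Image.open(io.BytesIO(data))
    fmt = image.format or "PNG"  # resize() returns an image without .format
    for _ in range(MAX_ATTEMPTS):
        size = (max(1, int(image.width * SCALE)), max(1, int(image.height * SCALE)))
        image = image.resize(size)
        buffer = io.BytesIO()
        image.save(buffer, format=fmt)
        if len(buffer.getvalue()) <= MAX_BYTES:
            return buffer.getvalue()
    return None


def collect_pending(
    media_root: Path,
    shrink: Callable[[bytes], Optional[bytes]] = shrink_image,
) -> List[Tuple[Path, bytes]]:
    """Return (path, payload) pairs that fit the limit; quarantine the rest."""
    pending = media_root / "pending"
    error = media_root / "error"
    pending.mkdir(parents=True, exist_ok=True)
    error.mkdir(parents=True, exist_ok=True)

    ready = []
    for path in sorted(pending.iterdir()):
        if not path.is_file():
            continue
        data: Optional[bytes] = path.read_bytes()
        if len(data) > MAX_BYTES and path.suffix.lower() in IMAGE_SUFFIXES:
            data = shrink(data)  # None means the image could not be reduced
        if data is None or len(data) > MAX_BYTES:
            path.rename(error / path.name)
            sidecar = error / (path.name + ".error.txt")
            sidecar.write_text(
                f"{path.name} exceeds the {MAX_BYTES}-byte limit "
                "and could not be reduced\n"
            )
            continue
        ready.append((path, data))
    return ready


def mark_sent(media_root: Path, path: Path) -> Path:
    """Move a delivered file from pending/ to sent/."""
    sent = media_root / "sent"
    sent.mkdir(parents=True, exist_ok=True)
    return path.rename(sent / path.name)
```

Passing the shrink function as a parameter keeps the triage logic testable without Pillow installed; a caller would invoke `mark_sent` only after the channel confirms delivery.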
Both mechanisms are invoked together in message_processor.py on every agent response.
Media in Web Chat
The web dashboard serves media files via the /api/media/{filename} endpoint. Incoming and outgoing files are accessible for viewing and downloading through the chat interface.
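
One way an endpoint like this could map a requested filename to a file on disk is sketched below. The lookup order (incoming/ before outgoing/), the bare-filename check, and the function name `resolve_media` are assumptions for illustration; the dashboard's actual handler is not described in this document.

```python
from pathlib import Path
from typing import Optional

SEARCH_DIRS = ("incoming", "outgoing")  # assumed lookup order


def resolve_media(media_root: Path, filename: str) -> Optional[Path]:
    """Return the file to serve for /api/media/{filename}, or None if absent."""
    # Accept bare filenames only, so a request cannot climb out of media/.
    if not filename or filename in {".", ".."} or Path(filename).name != filename:
        return None
    for subdir in SEARCH_DIRS:
        candidate = media_root / subdir / filename
        if candidate.is_file():
            return candidate
    return None
```

A handler would typically translate a `None` result into a 404 response.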
Supported Operations
| Operation | Description |
|---|---|
| Receive images | View and analyze images sent by users |
| Receive documents | Process uploaded documents |
| Send files (inline) | Attach response-referenced files directly to replies |
| Send files (pending) | Attach pre-generated files from media/pending/ |
| Image auto-resize | Downscale oversized images before sending (requires Pillow) |
| Image analysis | Describe image contents (via LLM vision) |