Most "AI agent" tutorials start with theory. This one starts with a working agent.
I'm going to walk you through building a simple agent that monitors a folder, generates content, and posts it — the same pattern I use to run three brands on autopilot. You don't need a CS degree. You need Python, a browser, and about 45 minutes.
What You're Building
A Python script that:
- Watches a folder for new content (images, videos, text files)
- Generates a caption using an LLM
- Posts it to a social platform via browser automation
That's it. No frameworks, no SaaS platforms, no $149/month subscriptions. Just a script that does work for you.
Step 1: Install the Tools (5 minutes)
pip install playwright openai
playwright install chromium
Playwright gives you programmatic browser control. The OpenAI package handles caption generation. You can swap in any LLM — I use GPT-4.1 through an Azure proxy, but the free tier of most providers works fine for getting started.
Step 2: Build the Watcher (10 minutes)
import time
from pathlib import Path
WATCH_DIR = Path("./content-queue")
WATCH_DIR.mkdir(exist_ok=True)
def get_new_files(seen: set) -> tuple[list, set]:
    """Return (files not seen before, the full current listing)."""
    current = set(WATCH_DIR.iterdir())
    new = current - seen
    return list(new), current
# Anything already in the folder at startup counts as seen.
seen = set(WATCH_DIR.iterdir())
while True:
    new_files, seen = get_new_files(seen)
    for f in new_files:
        print(f"New file: {f.name}")
        # This is where the agent logic goes
    time.sleep(30)
Drop a file in the folder, the agent picks it up. Simple.
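One thing the watcher does not do is decide what counts as content: it reports every path that appears, including hidden files and half-finished downloads. Here is a minimal sketch of a filter. The allow-list `ALLOWED_SUFFIXES` and the name `is_content_file` are my own placeholders, not part of the script above, so adjust them to whatever you actually post.

```python
from pathlib import Path

# Hypothetical allow-list; extend it to match the content you actually post.
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".mp4", ".mov", ".txt"}

def is_content_file(path: Path) -> bool:
    """Return True for regular, non-hidden files with an allowed extension."""
    if path.name.startswith("."):
        return False  # skip dotfiles such as .DS_Store
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        return False  # skip partial downloads (.crdownload, .part) and anything else
    return path.is_file()  # skip directories that happen to have a matching name
```

In the loop you would iterate over `filter(is_content_file, new_files)` instead of `new_files`.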
Step 3: Add the Brain (15 minutes)
from openai import OpenAI
# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()
def generate_caption(filename: str, context: str = "") -> str:
    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{
            "role": "user",
            "content": f"Write a short, punchy social media caption for this content: {filename}. Context: {context}. Keep it under 200 characters. No hashtags."
        }]
    )
    # content can be None, so fall back to an empty string before stripping
    return (response.choices[0].message.content or "").strip()
Cost per caption: fractions of a cent. I generate hundreds of these daily across my brands and the API bill is negligible.
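The prompt asks for under 200 characters and no hashtags, but a model will not always comply. A small post-processing step is one way to enforce both. This is a sketch under my own assumptions: `clean_caption` is a hypothetical helper, and the trimming rule (cut at a word boundary, append "...") is just one reasonable choice.

```python
import re

def clean_caption(text: str, limit: int = 200) -> str:
    """Strip hashtags and wrapping quotes, then trim to at most `limit` characters."""
    text = re.sub(r"#\w+", "", text)            # drop any hashtags the model added
    text = " ".join(text.split()).strip("\"' ")  # collapse whitespace, remove wrapping quotes
    if len(text) <= limit:
        return text
    # Cut at the last word boundary that still leaves room for the "..." suffix.
    cut = text[: limit - 3].rsplit(" ", 1)[0]
    return cut + "..."
```

You would call it as `caption = clean_caption(generate_caption(f.name))` before posting.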
Step 4: Add the Hands (15 minutes)
This is where most people get stuck. They try to use platform APIs and hit walls — rate limits, business verification requirements, missing features. Skip all that. Use Playwright to control the browser directly.
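from playwright.sync_api import sync_playwright
def post_to_platform(file_path: str, caption: str):
    with sync_playwright() as p:
        # launch_persistent_context returns a browser context, not a Browser
        context = p.chromium.launch_persistent_context(
            user_data_dir="./browser-session",
            headless=False
        )
        # Reuse the tab the context opens with, or open one if there is none
        page = context.pages[0] if context.pages else context.new_page()
        page.goto("https://your-platform.com")
        # Platform-specific posting logic here
        context.close()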
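The launch_persistent_context call with a user_data_dir is the key move. Log in once manually, and the session is saved in that directory. Your agent reuses your cookies until the platform expires the session or logs you out, at which point you log in by hand again. No OAuth, no API keys, no token refresh headaches.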
I use this exact pattern for Instagram, Pinterest, X, LinkedIn, Threads, and Facebook. Each platform has its own quirks — Instagram needs you to walk up the DOM to find the right anchor tag, X needs specific CDP port routing — but the core pattern is identical.
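Since each platform needs its own login, one way to keep sessions from stepping on each other is a separate profile directory per platform and account. The sketch below is an assumption about layout, not a description of my actual setup: `SESSIONS_ROOT` and `session_dir` are hypothetical names.

```python
import re
from pathlib import Path

SESSIONS_ROOT = Path("./browser-sessions")  # hypothetical root folder for all profiles

def session_dir(platform: str, account: str = "default") -> Path:
    """Build a filesystem-safe profile directory for one platform/account pair."""
    def safe(part: str) -> str:
        # Keep letters, digits, dash and underscore; replace every other run with "_".
        return re.sub(r"[^A-Za-z0-9_-]+", "_", part.strip().lower()) or "unnamed"
    return SESSIONS_ROOT / safe(platform) / safe(account)
```

Pass `user_data_dir=str(session_dir("instagram", "brand_a"))` to launch_persistent_context, and log in once per directory.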
Step 5: Wire It Together
seen = set(WATCH_DIR.iterdir())
while True:
    new_files, seen = get_new_files(seen)
    for f in new_files:
        caption = generate_caption(f.name)
        post_to_platform(str(f), caption)
        print(f"Posted: {f.name} -> {caption}")
    time.sleep(30)
That's your first agent. Drop a file in a folder, it generates a caption and posts it. Under an hour, with nothing beyond Python, the two packages from Step 1, and a browser.
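One caveat with the loop: seen lives in memory and is seeded with whatever is already in the folder, so files dropped in while the script is stopped are never posted. A sketch of one fix is to record posted filenames on disk. The state file and the three helper names here are hypothetical, and the state file should live outside the watched folder.

```python
import json
from pathlib import Path

def load_posted(state_file: Path) -> set[str]:
    """Return the set of filenames already posted (empty if there is no state yet)."""
    if not state_file.exists():
        return set()
    return set(json.loads(state_file.read_text()))

def mark_posted(state_file: Path, name: str) -> None:
    """Add one filename to the state file."""
    posted = load_posted(state_file)
    posted.add(name)
    tmp = state_file.with_suffix(".tmp")
    tmp.write_text(json.dumps(sorted(posted)))
    tmp.replace(state_file)  # write-then-rename so a crash mid-write keeps the old state

def pending_files(watch_dir: Path, state_file: Path) -> list[Path]:
    """List files in watch_dir whose names are not in the state file yet."""
    posted = load_posted(state_file)
    return sorted(p for p in watch_dir.iterdir() if p.is_file() and p.name not in posted)
```

The loop body then becomes: iterate over `pending_files(...)`, post, and call `mark_posted(...)` only after the post succeeds.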
Making It Real
The jump from "toy script" to "production agent" is smaller than you think:
- Add a cron trigger instead of the while loop: run crontab -e and add the line 0 9 * * * python3 /path/to/agent.py
- Add error handling: wrap the post function in try/except, log failures, retry with exponential backoff
- Add multiple platforms: loop through a list of posting functions, one per platform
- Add a content queue: instead of watching a folder, pull from a JSON file or database
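The error-handling and multi-platform items can be sketched together. This is an illustration, not my production code: `post_with_retry` and `post_everywhere` are hypothetical names, the posting functions are assumed to take (file_path, caption) like post_to_platform, and the attempt count and delays are arbitrary defaults.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("agent")
Poster = Callable[[str, str], None]

def post_with_retry(post: Poster, file_path: str, caption: str,
                    attempts: int = 3, base_delay: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call post(file_path, caption); on failure wait base_delay * 2**n, then retry."""
    name = getattr(post, "__name__", repr(post))
    for n in range(attempts):
        try:
            post(file_path, caption)
            return True
        except Exception:
            log.exception("%s failed (attempt %d/%d)", name, n + 1, attempts)
            if n < attempts - 1:
                sleep(base_delay * 2 ** n)  # exponential backoff between attempts
    return False

def post_everywhere(posters: list[Poster], file_path: str, caption: str) -> dict[str, bool]:
    """Try each platform's posting function in turn; report success per function name."""
    return {getattr(p, "__name__", repr(p)): post_with_retry(p, file_path, caption)
            for p in posters}
```

The `sleep` parameter exists only so the delay can be swapped out when testing; in normal use leave it at the default.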
My production system (brand_cron.py) is fundamentally this same pattern, just with more platforms, more error handling, and a ThreadPoolExecutor so multiple brands post concurrently.
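To give a feel for the concurrent part, here is a rough sketch of running one job per brand with ThreadPoolExecutor. It is not the contents of brand_cron.py: `run_brands` and the `run_one` callback are hypothetical, and I am assuming `run_one(brand)` returns how many posts it made.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional

def run_brands(brands: list[str], run_one: Callable[[str], int],
               max_workers: int = 3) -> dict[str, Optional[int]]:
    """Run run_one(brand) for every brand in parallel; None marks a brand that raised."""
    results: dict[str, Optional[int]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, b): b for b in brands}
        for fut in as_completed(futures):
            brand = futures[fut]
            try:
                results[brand] = fut.result()
            except Exception as exc:
                # One brand failing should not stop the others.
                print(f"{brand} failed: {exc}")
                results[brand] = None
    return results
```

Two things to keep in mind if you try this with Playwright: its sync API is not thread-safe, so each thread should open its own sync_playwright() as post_to_platform does, and each brand should get its own user_data_dir, since sharing one browser profile between two running browsers is likely to fail.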
What Not to Do
Don't start with a framework. Don't sign up for an automation platform. Don't spend three days evaluating tools. Write a script that does one thing, run it, and iterate.
I've seen people spend weeks comparing n8n vs Zapier vs Make vs whatever, and they still haven't automated a single post. Meanwhile, a 50-line Python script could have been posting for them the entire time.
Next Steps
Once your first agent is running, you'll immediately see what to automate next. For me it was: thumbnail extraction (ffmpeg), multi-account support (session management per account), and scheduling (cron). Each addition was another 30-minute build.
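As an example of how small these additions are, here is a sketch of thumbnail extraction by shelling out to ffmpeg. It assumes ffmpeg is installed and on your PATH. The function names and the default one-second timestamp are my own placeholders.

```python
import subprocess
from pathlib import Path

def thumbnail_command(video: Path, out: Path, at_seconds: float = 1.0) -> list[str]:
    """Build an ffmpeg command that saves a single frame from `video` as `out`."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-ss", str(at_seconds),   # seek to this timestamp
        "-i", str(video),
        "-frames:v", "1",         # write exactly one video frame
        str(out),
    ]

def extract_thumbnail(video: Path, out: Path, at_seconds: float = 1.0) -> None:
    """Run the command; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(thumbnail_command(video, out, at_seconds),
                   check=True, capture_output=True)
```

Building the command in its own function keeps it easy to inspect or log before anything runs.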
The full toolkit I use — social posting, content generation, browser automation, the whole stack — is available at axon.nepa-ai.com. But honestly, start with the 50-line script above. You'll learn more in an hour of building than in a week of reading.
