Tutorial · February 2, 2026 · 14 min

Puppeteer + AI: Build Smart Browser Automation

Combine Puppeteer with Claude or GPT to create intelligent browser automation. Extract data, fill forms, navigate sites.

puppeteer · browser automation · ai · web scraping

Molted Team

Molted.cloud

Browser automation has been around for years. Selenium, Playwright, Puppeteer. Scripts that click buttons, fill forms, extract data. They work great until the page structure changes, a new captcha appears, or the workflow requires human-like judgment. That is where AI changes the game.

Combining Puppeteer with Claude or GPT-4 creates automation that understands content, adapts to changes, and makes decisions based on context rather than hardcoded selectors. A scraper that actually reads and comprehends the page. A form filler that figures out which field is which, even when IDs are obfuscated. Navigation driven by natural language instead of CSS selectors.

This approach is not theoretical. Companies are using AI-augmented automation for lead generation, competitive intelligence, and workflow automation that would break constantly with traditional methods.

Setting up the environment

You need Node.js 18+, Puppeteer, and an API key from Anthropic or OpenAI. The setup takes about five minutes.

mkdir puppeteer-ai
cd puppeteer-ai
npm init -y
npm install puppeteer @anthropic-ai/sdk dotenv

Create your .env file:

ANTHROPIC_API_KEY=sk-ant-your-key-here

And a basic .gitignore:

.env
node_modules/
screenshots/

The base structure for all examples:

// base.js
require('dotenv').config();
const puppeteer = require('puppeteer');
const Anthropic = require('@anthropic-ai/sdk');

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function createBrowser() {
  return puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });
}

async function askAI(prompt, context = '') {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2048,
    messages: [
      {
        role: 'user',
        content: context ? `${context}\n\n${prompt}` : prompt,
      },
    ],
  });
  return response.content[0].text;
}

module.exports = { createBrowser, askAI, anthropic };

Intelligent content scraping

Traditional scraping relies on stable selectors. Find the div.product-price, extract the text, parse the number. When the class changes to span.price-value, your script breaks.

AI-powered scraping works differently. You give the model the page content and ask it to extract what you need. It understands that "$29.99" and "29,99 EUR" are both prices, regardless of how they are marked up.

// smart-scraper.js
const { createBrowser, askAI } = require('./base');

async function scrapeProductInfo(url) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: 'networkidle2' });

  // Get the visible text content
  const pageText = await page.evaluate(() => {
    // Remove scripts, styles, and hidden elements
    const scripts = document.querySelectorAll('script, style, noscript');
    scripts.forEach(s => s.remove());

    return document.body.innerText;
  });

  const prompt = `Extract product information from this webpage.

Return a JSON object with these fields:
- name: product name
- price: current price as a number (no currency symbol)
- currency: currency code (USD, EUR, etc.)
- description: short product description (max 200 chars)
- availability: "in_stock", "out_of_stock", or "unknown"
- rating: numerical rating if present, null otherwise
- review_count: number of reviews if present, null otherwise

Page content:
${pageText.substring(0, 10000)}

Return only valid JSON, no markdown formatting.`;

  const result = await askAI(prompt);

  await browser.close();

  try {
    return JSON.parse(result);
  } catch {
    // If JSON parsing fails, return the raw result
    return { raw: result, error: 'Failed to parse JSON' };
  }
}

// Example usage
async function main() {
  const products = [
    'https://example-store.com/product-1',
    'https://example-store.com/product-2',
  ];

  for (const url of products) {
    console.log(`Scraping: ${url}`);
    const info = await scrapeProductInfo(url);
    console.log(JSON.stringify(info, null, 2));
  }
}

main().catch(console.error);

This approach handles edge cases that would require dozens of conditional statements in traditional scrapers. The AI recognizes that "Out of Stock" and "Currently Unavailable" and "Sold Out" all mean the same thing. It extracts prices whether they are formatted as "$1,299.00" or "1299 USD" or "From $1,299".
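One practical wrinkle worth handling: despite the "return only valid JSON" instruction, models occasionally wrap their answer in markdown fences. A small helper (a sketch, not part of the examples above; the name parseAIJson is ours) strips those fences before parsing, and could replace any of the bare JSON.parse calls in this article:

```javascript
// parse-ai-json.js
// Strip markdown code fences that models sometimes add despite
// "return only valid JSON" instructions, then parse the result.
function parseAIJson(text) {
  let cleaned = text.trim();

  // Remove a leading ```json (or bare ```) fence and a trailing ``` fence
  cleaned = cleaned.replace(/^```(?:json)?\s*/i, '').replace(/\s*```$/, '');

  return JSON.parse(cleaned);
}

module.exports = { parseAIJson };
```

Plain JSON passes through untouched, so the helper is safe to apply unconditionally.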

Extracting structured data from articles

The same pattern works for content extraction. Grab an article page and ask the AI to extract the title, author, publication date, main content, and key topics.

async function extractArticle(url) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: 'networkidle2' });

  const pageText = await page.evaluate(() => document.body.innerText);

  const prompt = `Extract article information from this webpage.

Return a JSON object with:
- title: article title
- author: author name if present
- published_date: publication date in ISO format if present
- content: the main article text (full content, not truncated)
- topics: array of 3-5 key topics/keywords
- summary: 2-3 sentence summary

Page content:
${pageText.substring(0, 30000)}

Return only valid JSON.`;

  const result = await askAI(prompt);
  await browser.close();

  return JSON.parse(result);
}

Dynamic form automation

Forms are where traditional automation gets painful. Field names change. Layouts vary. Required fields differ between sites. The AI can look at a form and figure out what goes where.

// smart-form.js
const { createBrowser, askAI } = require('./base');

async function analyzeForm(page) {
  // Extract form structure
  const formData = await page.evaluate(() => {
    const forms = document.querySelectorAll('form');
    const formInfo = [];

    forms.forEach((form, index) => {
      const fields = [];
      const inputs = form.querySelectorAll('input, textarea, select');

      inputs.forEach(input => {
        const label = document.querySelector(`label[for="${input.id}"]`);
        fields.push({
          type: input.type || input.tagName.toLowerCase(),
          name: input.name,
          id: input.id,
          placeholder: input.placeholder,
          label: label?.innerText || '',
          required: input.required,
          options: input.tagName === 'SELECT'
            ? Array.from(input.options).map(o => o.value)
            : null,
        });
      });

      formInfo.push({ index, fields });
    });

    return formInfo;
  });

  return formData;
}

async function fillFormWithAI(page, formData, userData) {
  const prompt = `Given this form structure and user data, determine how to fill each field.

Form fields:
${JSON.stringify(formData, null, 2)}

User data:
${JSON.stringify(userData, null, 2)}

Return a JSON array of actions:
[
  { "selector": "#field-id", "action": "type", "value": "text to type" },
  { "selector": "#dropdown", "action": "select", "value": "option-value" },
  { "selector": "#checkbox", "action": "click" }
]

Match user data to form fields by understanding the purpose of each field.
If a required field has no matching user data, use a reasonable placeholder.
Return only valid JSON.`;

  const actionsJson = await askAI(prompt);
  const actions = JSON.parse(actionsJson);

  // Execute the actions
  for (const action of actions) {
    try {
      await page.waitForSelector(action.selector, { timeout: 5000 });

      if (action.action === 'type') {
        await page.type(action.selector, action.value);
      } else if (action.action === 'select') {
        await page.select(action.selector, action.value);
      } else if (action.action === 'click') {
        await page.click(action.selector);
      }
    } catch (error) {
      console.log(`Could not perform action on ${action.selector}: ${error.message}`);
    }
  }
}

async function submitContactForm(url, contactInfo) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: 'networkidle2' });

  const formStructure = await analyzeForm(page);

  if (formStructure.length === 0) {
    console.log('No forms found on page');
    await browser.close();
    return;
  }

  await fillFormWithAI(page, formStructure[0], contactInfo);

  // Take a screenshot before submitting
  await page.screenshot({ path: 'form-filled.png' });

  // Find and click submit button
  const submitButton = await page.$('button[type="submit"], input[type="submit"]');
  if (submitButton) {
    // Start waiting for navigation before clicking so a fast redirect is not missed
    await Promise.all([
      page.waitForNavigation({ timeout: 10000 }).catch(() => {}),
      submitButton.click(),
    ]);
  }

  await browser.close();
}

// Example usage
const userData = {
  firstName: 'John',
  lastName: 'Smith',
  email: 'john.smith@example.com',
  company: 'Acme Corp',
  phone: '+1-555-123-4567',
  message: 'Interested in learning more about your services.',
  country: 'United States',
};

submitContactForm('https://example.com/contact', userData).catch(console.error);

The AI understands that "First Name" and "Given Name" and "Prénom" are the same field. It maps "Phone" to "Telephone" to "Mobile Number". This semantic understanding is what makes AI-powered automation resilient.

Natural language navigation

The most powerful application: telling the browser what to do in plain English and letting the AI figure out the implementation. Instead of writing selectors and click handlers, you describe the goal.

// natural-nav.js
const { createBrowser, askAI } = require('./base');

async function getPageContext(page) {
  return page.evaluate(() => {
    const clickable = [];

    // Get all clickable elements
    const elements = document.querySelectorAll('a, button, [role="button"], [onclick]');
    elements.forEach((el, index) => {
      const rect = el.getBoundingClientRect();
      if (rect.width > 0 && rect.height > 0) {
        clickable.push({
          index,
          tag: el.tagName,
          text: el.innerText.substring(0, 100).trim(),
          href: el.href || null,
          id: el.id || null,
          class: el.className || null,
        });
      }
    });

    // Get form inputs
    const inputs = [];
    document.querySelectorAll('input, textarea, select').forEach((el, index) => {
      const label = document.querySelector(`label[for="${el.id}"]`);
      inputs.push({
        index,
        type: el.type || el.tagName.toLowerCase(),
        name: el.name,
        id: el.id,
        label: label?.innerText || el.placeholder || null,
      });
    });

    return {
      url: window.location.href,
      title: document.title,
      clickable: clickable.slice(0, 50),
      inputs: inputs.slice(0, 30),
    };
  });
}

async function executeInstruction(page, instruction) {
  const context = await getPageContext(page);

  const prompt = `You are controlling a web browser. Based on the current page state and the user instruction, determine the next action.

Current page:
URL: ${context.url}
Title: ${context.title}

Clickable elements:
${JSON.stringify(context.clickable, null, 2)}

Form inputs:
${JSON.stringify(context.inputs, null, 2)}

User instruction: "${instruction}"

Return a JSON object with one action:
- { "action": "click", "element_index": <number from clickable list> }
- { "action": "type", "input_index": <number from inputs list>, "text": "<text to type>" }
- { "action": "navigate", "url": "<url to go to>" }
- { "action": "wait", "ms": <milliseconds> }
- { "action": "done", "result": "<summary of what was accomplished>" }

Choose the action that best accomplishes the instruction.
Return only valid JSON.`;

  const actionJson = await askAI(prompt);
  const action = JSON.parse(actionJson);

  console.log('AI decided:', action);

  if (action.action === 'click') {
    const elements = await page.$$('a, button, [role="button"], [onclick]');
    if (elements[action.element_index]) {
      // Wait for any navigation the click triggers, started before the click
      await Promise.all([
        page.waitForNavigation({ timeout: 5000 }).catch(() => {}),
        elements[action.element_index].click(),
      ]);
    }
  } else if (action.action === 'type') {
    const inputs = await page.$$('input, textarea, select');
    if (inputs[action.input_index]) {
      await inputs[action.input_index].type(action.text);
    }
  } else if (action.action === 'navigate') {
    await page.goto(action.url, { waitUntil: 'networkidle2' });
  } else if (action.action === 'wait') {
    await new Promise(r => setTimeout(r, action.ms));
  } else if (action.action === 'done') {
    return { done: true, result: action.result };
  }

  return { done: false };
}

async function executeWorkflow(startUrl, instructions) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(startUrl, { waitUntil: 'networkidle2' });

  for (const instruction of instructions) {
    console.log(`\nExecuting: "${instruction}"`);

    let attempts = 0;
    const maxAttempts = 5;

    while (attempts < maxAttempts) {
      const result = await executeInstruction(page, instruction);

      if (result.done) {
        console.log('Completed:', result.result);
        break;
      }

      attempts++;
      await new Promise(r => setTimeout(r, 1000));
    }
  }

  await browser.close();
}

// Example: natural language workflow
executeWorkflow('https://news.ycombinator.com', [
  'Click on the first article link',
  'Find and click any link to leave a comment or discuss',
  'Report what page we ended up on',
]).catch(console.error);

This pattern enables workflows that would be fragile with traditional automation. "Find the contact page and fill out an inquiry form" works across thousands of different websites without writing site-specific code.

Multi-step workflows

// Continues natural-nav.js, reusing executeInstruction from above
async function complexWorkflow() {
  const browser = await createBrowser();
  const page = await browser.newPage();

  // Start at a search engine
  await page.goto('https://duckduckgo.com', { waitUntil: 'networkidle2' });

  const workflow = [
    'Type "best project management software 2026" in the search box',
    'Click the search button',
    'Click on the first organic result (not an ad)',
    'Find pricing information on the page',
    'Summarize the pricing tiers found',
  ];

  for (const step of workflow) {
    console.log(`Step: ${step}`);

    let result = { done: false };
    let retries = 0;

    while (!result.done && retries < 3) {
      result = await executeInstruction(page, step);
      retries++;

      if (!result.done) {
        await new Promise(r => setTimeout(r, 2000));
      }
    }

    if (result.result) {
      console.log('Result:', result.result);
    }
  }

  await browser.close();
}

complexWorkflow().catch(console.error);

Error handling and retry logic

Production automation needs robust error handling. Networks fail. Pages load slowly. Elements disappear. The AI itself can return malformed responses.

// robust-automation.js
const { createBrowser, anthropic } = require('./base');

class AutomationError extends Error {
  constructor(message, code, recoverable = false) {
    super(message);
    this.code = code;
    this.recoverable = recoverable;
  }
}

async function askAIWithRetry(prompt, retries = 3) {
  let lastError;

  for (let i = 0; i < retries; i++) {
    try {
      const response = await anthropic.messages.create({
        model: 'claude-sonnet-4-20250514',
        max_tokens: 2048,
        messages: [{ role: 'user', content: prompt }],
      });

      const text = response.content[0].text;

      // Validate JSON if expected
      if (prompt.includes('Return only valid JSON')) {
        JSON.parse(text); // Will throw if invalid
      }

      return text;
    } catch (error) {
      lastError = error;

      // Rate limit - wait and retry
      if (error.status === 429) {
        const waitTime = Math.pow(2, i) * 1000;
        console.log(`Rate limited, waiting ${waitTime}ms`);
        await new Promise(r => setTimeout(r, waitTime));
        continue;
      }

      // Server error - retry
      if (error.status >= 500) {
        await new Promise(r => setTimeout(r, 1000 * (i + 1)));
        continue;
      }

      // JSON parse error - retry with stricter prompt
      if (error instanceof SyntaxError) {
        console.log('Invalid JSON response, retrying...');
        continue;
      }

      throw error;
    }
  }

  throw new AutomationError(
    `AI request failed after ${retries} retries: ${lastError.message}`,
    'AI_ERROR',
    false
  );
}

async function safePageAction(page, action, timeout = 10000) {
  try {
    const result = await Promise.race([
      action(),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('Action timeout')), timeout)
      ),
    ]);
    return { success: true, result };
  } catch (error) {
    return { success: false, error: error.message };
  }
}

async function navigateWithRetry(page, url, retries = 3) {
  for (let i = 0; i < retries; i++) {
    const result = await safePageAction(page, async () => {
      await page.goto(url, {
        waitUntil: 'networkidle2',
        timeout: 30000,
      });
      return page.url();
    });

    if (result.success) {
      return result.result;
    }

    console.log(`Navigation attempt ${i + 1} failed: ${result.error}`);
    await new Promise(r => setTimeout(r, 2000));
  }

  throw new AutomationError(
    `Failed to navigate to ${url}`,
    'NAVIGATION_ERROR',
    true
  );
}

async function clickWithRetry(page, selector, retries = 3) {
  for (let i = 0; i < retries; i++) {
    const result = await safePageAction(page, async () => {
      await page.waitForSelector(selector, { timeout: 5000 });
      await page.click(selector);
    });

    if (result.success) {
      return true;
    }

    // Try alternative selectors or scrolling
    if (i === 1) {
      await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight / 2));
      await new Promise(r => setTimeout(r, 500));
    }
  }

  throw new AutomationError(
    `Failed to click ${selector}`,
    'CLICK_ERROR',
    true
  );
}

// Main automation with error recovery
async function robustAutomation(task) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  try {
    // Set up error handling
    page.on('error', error => {
      console.error('Page error:', error);
    });

    page.on('pageerror', error => {
      console.error('Page script error:', error);
    });

    // Execute task with recovery
    let result;
    let attempts = 0;

    while (attempts < 3) {
      try {
        result = await task(page);
        break;
      } catch (error) {
        if (error.recoverable) {
          console.log(`Recoverable error, retrying: ${error.message}`);
          attempts++;
          await page.reload({ waitUntil: 'networkidle2' }).catch(() => {});
        } else {
          throw error;
        }
      }
    }

    return result;
  } finally {
    await browser.close();
  }
}

module.exports = {
  askAIWithRetry,
  safePageAction,
  navigateWithRetry,
  clickWithRetry,
  robustAutomation,
  AutomationError,
};
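Viewed in isolation, the rate-limit backoff in askAIWithRetry is just powers of two over the attempt index. A tiny runnable sketch of the same schedule (backoffSchedule is our name, not part of the module above):

```javascript
// Exponential backoff schedule: 2^i * base milliseconds for attempt i,
// matching the Math.pow(2, i) * 1000 used in askAIWithRetry.
function backoffSchedule(retries, baseMs = 1000) {
  return Array.from({ length: retries }, (_, i) => Math.pow(2, i) * baseMs);
}

// Three retries wait 1s, 2s, then 4s before giving up.
console.log(backoffSchedule(3)); // [ 1000, 2000, 4000 ]
```

Doubling the wait each attempt keeps total retry time bounded while giving the API room to recover.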

Rate limiting and resource management

Running AI-powered automation at scale requires careful resource management. Both browser instances and API calls have limits.

// rate-limiter.js

class RateLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.requests = [];
  }

  async acquire() {
    const now = Date.now();

    // Remove old requests
    this.requests = this.requests.filter(t => now - t < this.windowMs);

    if (this.requests.length >= this.maxRequests) {
      // Wait until we can make a request
      const oldestRequest = this.requests[0];
      const waitTime = this.windowMs - (now - oldestRequest) + 100;
      await new Promise(r => setTimeout(r, waitTime));
      return this.acquire();
    }

    this.requests.push(now);
  }
}

class BrowserPool {
  constructor(maxBrowsers = 5) {
    this.maxBrowsers = maxBrowsers;
    this.available = [];
    this.inUse = new Set();
  }

  async acquire() {
    if (this.available.length > 0) {
      const browser = this.available.pop();
      this.inUse.add(browser);
      return browser;
    }

    if (this.inUse.size < this.maxBrowsers) {
      const browser = await require('puppeteer').launch({
        headless: 'new',
        args: ['--no-sandbox'],
      });
      this.inUse.add(browser);
      return browser;
    }

    // Wait for a browser to become available
    await new Promise(r => setTimeout(r, 1000));
    return this.acquire();
  }

  release(browser) {
    this.inUse.delete(browser);
    this.available.push(browser);
  }

  async closeAll() {
    for (const browser of [...this.available, ...this.inUse]) {
      await browser.close().catch(() => {});
    }
    this.available = [];
    this.inUse.clear();
  }
}

// Usage example (askAIWithRetry comes from robust-automation.js above)
const { askAIWithRetry } = require('./robust-automation');

async function scrapeManyPages(urls) {
  const aiLimiter = new RateLimiter(50, 60000); // 50 requests per minute
  const browserPool = new BrowserPool(3);

  const results = [];

  for (const url of urls) {
    const browser = await browserPool.acquire();
    const page = await browser.newPage();

    try {
      await page.goto(url, { waitUntil: 'networkidle2' });

      const content = await page.evaluate(() => document.body.innerText);

      await aiLimiter.acquire();

      const analysis = await askAIWithRetry(`
        Analyze this webpage content and extract key information.
        Content: ${content.substring(0, 10000)}
        Return JSON with: title, summary, key_points (array)
      `);

      results.push({ url, data: JSON.parse(analysis) });
    } catch (error) {
      results.push({ url, error: error.message });
    } finally {
      await page.close();
      browserPool.release(browser);
    }
  }

  await browserPool.closeAll();
  return results;
}

module.exports = { RateLimiter, BrowserPool };

Memory management

async function memoryEfficientScrape(urls) {
  const puppeteer = require('puppeteer');

  // Launch with memory limits
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      '--no-sandbox',
      '--disable-dev-shm-usage',
      '--disable-gpu',
      '--js-flags=--max-old-space-size=512',
    ],
  });

  const results = [];

  for (const url of urls) {
    const page = await browser.newPage();

    // Disable images and fonts for faster loading
    await page.setRequestInterception(true);
    page.on('request', request => {
      const resourceType = request.resourceType();
      if (['image', 'font', 'media'].includes(resourceType)) {
        request.abort();
      } else {
        request.continue();
      }
    });

    try {
      await page.goto(url, {
        waitUntil: 'domcontentloaded',
        timeout: 15000,
      });

      // extractData is a placeholder for your own page-parsing logic
      const data = await extractData(page);
      results.push(data);
    } catch (error) {
      console.error(`Error scraping ${url}:`, error.message);
    } finally {
      await page.close();
    }

    // Force garbage collection if available
    if (global.gc) {
      global.gc();
    }
  }

  await browser.close();
  return results;
}

Real-world use cases

Lead generation

Scrape business directories, extract contact information, and qualify leads based on company descriptions. The AI can read an "About Us" page and determine if a company fits your ideal customer profile.

async function qualifyLead(companyUrl) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(companyUrl, { waitUntil: 'networkidle2' });

  // Find an about page link (case-insensitive href match; :contains() is not valid CSS)
  const aboutLink = await page.$('a[href*="about" i]');
  if (aboutLink) {
    // Start waiting for navigation before clicking so it is not missed
    await Promise.all([
      page.waitForNavigation().catch(() => {}),
      aboutLink.click(),
    ]);
  }

  const content = await page.evaluate(() => document.body.innerText);

  const analysis = await askAI(`
    Analyze this company based on their website content.

    Content: ${content.substring(0, 15000)}

    Return JSON:
    {
      "company_name": "",
      "industry": "",
      "company_size": "startup|small|medium|enterprise|unknown",
      "uses_saas": boolean,
      "tech_stack_mentioned": [],
      "decision_maker_email_pattern": "likely pattern or null",
      "icp_fit_score": 1-10,
      "icp_fit_reasoning": ""
    }

    Our ICP: B2B SaaS companies, 50-500 employees, using modern tech stack.
  `);

  await browser.close();
  return JSON.parse(analysis);
}

Competitive intelligence

Monitor competitor websites for pricing changes, new features, and messaging updates. The AI can compare two versions of a page and summarize what changed.

async function compareVersions(url, previousContent) {
  const browser = await createBrowser();
  const page = await browser.newPage();

  await page.goto(url, { waitUntil: 'networkidle2' });

  const currentContent = await page.evaluate(() => document.body.innerText);

  const analysis = await askAI(`
    Compare these two versions of a webpage and identify meaningful changes.
    Ignore minor text changes, focus on:
    - Pricing changes
    - Feature additions or removals
    - Messaging changes
    - New CTAs or offers

    Previous version:
    ${previousContent.substring(0, 10000)}

    Current version:
    ${currentContent.substring(0, 10000)}

    Return JSON:
    {
      "has_significant_changes": boolean,
      "changes": [
        { "type": "pricing|feature|messaging|offer", "description": "" }
      ],
      "summary": ""
    }
  `);

  await browser.close();

  return {
    content: currentContent,
    analysis: JSON.parse(analysis),
  };
}

Automated testing with AI verification

Traditional E2E tests check for specific elements. AI-powered tests can verify that a page "looks right" or that a user flow "makes sense".

async function aiVerifyCheckout(page) {
  // Take screenshot
  await page.screenshot({ path: 'checkout.png', fullPage: true });

  // Get page content
  const content = await page.evaluate(() => ({
    text: document.body.innerText,
    forms: Array.from(document.forms).map(f => ({
      action: f.action,
      inputs: Array.from(f.elements).map(e => e.name).filter(Boolean),
    })),
  }));

  const verification = await askAI(`
    Verify this checkout page meets requirements.

    Page content:
    ${JSON.stringify(content, null, 2)}

    Requirements:
    1. Has a clear order summary with item names and prices
    2. Shows subtotal, tax, and total
    3. Has shipping address form fields
    4. Has payment method selection
    5. Has a clear submit/purchase button
    6. Shows security indicators (lock icon, SSL mention)

    Return JSON:
    {
      "passes": boolean,
      "requirements_met": ["1", "2", ...],
      "requirements_failed": ["3", ...],
      "issues": ["description of each issue"],
      "suggestions": ["improvement suggestions"]
    }
  `);

  return JSON.parse(verification);
}

When this approach works and when it does not

AI-augmented browser automation shines in scenarios with variety and ambiguity. Scraping hundreds of different e-commerce sites. Filling forms across different platforms. Navigating workflows that change frequently.

It adds overhead for simple, stable tasks. If you are scraping the same page structure daily and it never changes, a traditional selector-based approach is faster and cheaper. The AI call adds latency and cost.

The sweet spot: tasks where maintaining traditional automation costs more than the AI overhead. If you spend hours updating selectors when sites change their markup, AI-powered extraction pays for itself.

Cost considerations

Claude Sonnet costs roughly $3 per million input tokens and $15 per million output tokens. A typical page extraction uses maybe 5,000 input tokens and 500 output tokens. That is about $0.02 per page. At scale, this adds up, but it is often cheaper than developer time spent maintaining brittle selectors.
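That arithmetic is easy to wrap in a helper so you can budget a batch before running it. A sketch using the rates quoted above (which may change):

```javascript
// Rough cost estimate per page. Rates are dollars per million tokens,
// defaulting to the Claude Sonnet figures quoted above; check current pricing.
function estimatePageCost(inputTokens, outputTokens, inputRate = 3, outputRate = 15) {
  return (inputTokens / 1e6) * inputRate + (outputTokens / 1e6) * outputRate;
}

// 5,000 input + 500 output tokens per page
const perPage = estimatePageCost(5000, 500);
console.log(perPage.toFixed(4)); // "0.0225"

// A 10,000-page batch at that rate
console.log((perPage * 10000).toFixed(2)); // "225.00"
```

Running the numbers up front makes the trade-off against selector-maintenance time concrete.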

Performance tips

  • Cache AI responses for identical inputs
  • Use smaller models (Claude Haiku) for simple extractions
  • Batch similar requests when possible
  • Pre-filter content before sending to the AI
  • Run browsers headless with minimal resources

Where to go from here

The combination of browser automation and AI opens up workflows that were impractical before. Research assistants that can browse the web and synthesize information. Data pipelines that adapt to source changes. Testing frameworks that understand intent, not just elements.

Start with a specific problem. Maybe you need to extract data from sites that keep changing their markup. Maybe you want to automate a workflow that spans multiple platforms. Build a proof of concept, measure the results against traditional approaches, and iterate from there.

The code in this article is a starting point. Production implementations will need better error handling, persistent storage for conversation context, and probably some caching layer for repeated operations. But the core pattern is solid: give the AI the page context, ask for structured output, execute the result.
