antigravity-skills-reference/skills/web-scraper/references/output-templates.md
ProgramadorBrasil 61ec71c5c7 feat: add 52 specialized AI agent skills (#217)
New skills covering 10 categories:

**Security & Audit**: 007 (STRIDE/PASTA/OWASP), cred-omega (secrets management)
**AI Personas**: Karpathy, Hinton, Sutskever, LeCun (4 sub-skills), Altman, Musk, Gates, Jobs, Buffett
**Multi-agent Orchestration**: agent-orchestrator, task-intelligence, multi-advisor
**Code Analysis**: matematico-tao (Terence Tao-inspired mathematical code analysis)
**Social & Messaging**: Instagram Graph API, Telegram Bot, WhatsApp Cloud API, social-orchestrator
**Image Generation**: AI Studio (Gemini), Stability AI, ComfyUI Gateway, image-studio router
**Brazilian Domain**: 6 auction specialist modules, 2 legal advisors, auctioneers data scraper
**Product & Growth**: design, invention, monetization, analytics, growth engine
**DevOps & LLM Ops**: Docker/CI-CD/AWS, RAG/embeddings/fine-tuning
**Skill Governance**: installer, sentinel auditor, context management

Each skill includes:
- Standardized YAML frontmatter (name, description, risk, source, tags, tools)
- Structured sections (Overview, When to Use, How it Works, Best Practices)
- Python scripts and reference documentation where applicable
- Cross-platform compatibility (Claude Code, Antigravity, Cursor, Gemini CLI, Codex CLI)

Co-authored-by: ProgramadorBrasil <214873561+ProgramadorBrasil@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:04:07 +01:00


Output Templates Reference

Complete formatting templates for all supported output formats. Every output must be wrapped in a delivery envelope with metadata.


Delivery Envelope (Required)

Every extraction result MUST include this metadata wrapper, regardless of output format:

## Extraction Results

**Source:** [Page Title](https://example.com/page)
**Date:** 2026-02-25 14:30 UTC
**Items:** 47 records
**Confidence:** HIGH
**Format:** Markdown Table

---

[DATA GOES HERE]

---

**Notes:**
- Any gaps, anomalies, or observations
- Filters or sorts applied
- Pages scraped (if paginated)
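
The envelope above can be produced mechanically. The following is a minimal sketch, not the skill's actual implementation: the function name `wrap_in_envelope` and its parameters are hypothetical, and it assumes the data has already been rendered to a string in the chosen format.

```python
from datetime import datetime, timezone


def wrap_in_envelope(data, title, url, item_count, confidence, fmt, notes, when=None):
    """Wrap already-formatted data in the Markdown delivery envelope.

    All parameter names are illustrative assumptions; `when` should be a
    timezone-aware datetime and defaults to the current UTC time.
    """
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    lines = [
        "## Extraction Results",
        "",
        f"**Source:** [{title}]({url})",
        f"**Date:** {when.strftime('%Y-%m-%d %H:%M')} UTC",
        f"**Items:** {item_count} records",
        f"**Confidence:** {confidence}",
        f"**Format:** {fmt}",
        "",
        "---",
        "",
        data,
        "",
        "---",
        "",
        "**Notes:**",
    ]
    # One bullet per note: gaps, filters applied, pages scraped, and so on.
    lines += [f"- {note}" for note in notes]
    return "\n".join(lines)
```

Passing `when` explicitly keeps the output reproducible, which helps when two runs are compared later.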

Markdown Table Format

Standard Table

| Name           | Price    | Rating | Availability |
|:---------------|---------:|:------:|:-------------|
| Product Alpha  |   $29.99 |  4.5   | In Stock     |
| Product Beta   |   $49.99 |  4.2   | In Stock     |
| Product Gamma  |  $119.00 |  4.8   | Pre-order    |
| Product Delta  |   $15.50 |  3.9   | Out of Stock |

Alignment Rules

| Data Type    | Alignment | Markdown Syntax |
|:-------------|:----------|:----------------|
| Text         | Left      | `:---`          |
| Numbers      | Right     | `---:`          |
| Centered     | Center    | `:---:`         |
| Mixed/Status | Left      | `:---`          |
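
As an illustration of the alignment rules, here is a small sketch of a table renderer. The name `render_table` and the `numeric_columns` parameter are assumptions made for this example; it handles only left and right alignment and leaves centered columns out for brevity.

```python
def render_table(headers, rows, numeric_columns=()):
    """Render a Markdown table.

    Columns named in numeric_columns are right-aligned (---:);
    every other column is left-aligned (:---).
    """
    cells = [[str(value) for value in row] for row in rows]
    # Each column is as wide as its longest cell, with a minimum of 3.
    widths = [
        max(len(header), 3, *(len(row[i]) for row in cells))
        for i, header in enumerate(headers)
    ]

    def is_numeric(i):
        return headers[i] in numeric_columns

    def pad(value, i):
        return value.rjust(widths[i]) if is_numeric(i) else value.ljust(widths[i])

    def rule(i):
        dashes = "-" * (widths[i] - 1)
        return dashes + ":" if is_numeric(i) else ":" + dashes

    count = range(len(headers))
    lines = [
        "| " + " | ".join(headers[i].ljust(widths[i]) for i in count) + " |",
        "| " + " | ".join(rule(i) for i in count) + " |",
    ]
    for row in cells:
        lines.append("| " + " | ".join(pad(row[i], i) for i in count) + " |")
    return "\n".join(lines)
```

The padding is cosmetic: Markdown renderers only read the colons in the delimiter row, but aligned source is easier to review as raw text.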

Table with Summary Row

| Product        | Units Sold | Revenue    |
|:---------------|----------:|-----------:|
| Widget A       |     1,234 |  $12,340   |
| Widget B       |       567 |   $8,505   |
| Widget C       |     2,890 |  $57,800   |
| **Total**      | **4,691** | **$78,645**|
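
The summary row is a plain column sum, formatted in bold. A minimal sketch, assuming each record is a dict with hypothetical `units` and `revenue` keys holding whole numbers:

```python
def total_row(rows, label="**Total**"):
    """Build the bold summary row for a units/revenue table."""
    units = sum(row["units"] for row in rows)
    revenue = sum(row["revenue"] for row in rows)
    # The ',' format spec inserts thousands separators, as in the table above.
    return [label, f"**{units:,}**", f"**${revenue:,}**"]


rows = [
    {"units": 1234, "revenue": 12340},
    {"units": 567, "revenue": 8505},
    {"units": 2890, "revenue": 57800},
]
print(total_row(rows))
```

With the three widgets from the table above, this reproduces the totals shown there (4,691 units and $78,645).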

Wide Data (Split Tables)

When data has more than 10 columns, split into logical groups:

### Basic Information

| Name    | Category | Brand   | SKU      |
|:--------|:---------|:--------|:---------|
| Item A  | Tools    | Acme    | ACM-001  |

### Pricing and Availability

| Name    | Price   | Sale Price | Stock | Ships In |
|:--------|--------:|-----------:|:------|:---------|
| Item A  | $49.99  |    $39.99  | 142   | 2 days   |
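
One possible way to decide the split is sketched below. It is an assumption, not the documented behavior: it chunks columns purely by count and repeats a key column (here `Name`) at the start of every group, as the two tables above do. Choosing groups that are actually "logical" still needs human or model judgment.

```python
def split_columns(headers, key, max_columns=10):
    """Split headers into groups of at most max_columns.

    The key column is repeated as the first column of every group so that
    rows can be matched across the resulting tables.
    """
    if len(headers) <= max_columns:
        return [list(headers)]
    others = [header for header in headers if header != key]
    step = max_columns - 1  # leave room for the repeated key column
    return [[key] + others[i:i + step] for i in range(0, len(others), step)]
```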

Multi-URL Comparison Table

| Source       | Product    | Price   | Rating |
|:-------------|:-----------|--------:|:------:|
| store-a.com  | Laptop X   | $999    |  4.3   |
| store-b.com  | Laptop X   | $949    |  4.5   |
| store-c.com  | Laptop X   | $1,029  |  4.1   |

Truncation Rules

Truncate values exceeding 60 characters and end them with `...`:

| Title                                                       | Author  |
|:------------------------------------------------------------|:--------|
| Introduction to Advanced Machine Learning Techni...         | J. Smith|
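
A truncation helper could look like the sketch below. The function name and the choice to count the `...` marker toward the limit are assumptions; the source only states the 60-character threshold.

```python
def truncate(value, limit=60, marker="..."):
    """Shorten text longer than `limit` characters, ending it with `marker`.

    The marker counts toward the limit, so the result never exceeds it.
    """
    text = str(value)
    if len(text) <= limit:
        return text
    return text[: limit - len(marker)] + marker
```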

JSON Format

Standard JSON Output

{
  "metadata": {
    "source": "https://example.com/products",
    "title": "Product Catalog - Example Store",
    "extractedAt": "2026-02-25T14:30:00Z",
    "itemCount": 3,
    "confidence": "HIGH",
    "fields": ["name", "price", "rating", "availability"],
    "notes": []
  },
  "data": [
    {
      "name": "Product Alpha",
      "price": 29.99,
      "currency": "USD",
      "rating": 4.5,
      "availability": "In Stock"
    },
    {
      "name": "Product Beta",
      "price": 49.99,
      "currency": "USD",
      "rating": 4.2,
      "availability": "In Stock"
    },
    {
      "name": "Product Gamma",
      "price": 119.00,
      "currency": "USD",
      "rating": 4.8,
      "availability": "Pre-order"
    }
  ]
}
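
A document of this shape can be assembled with the standard `json` module. This is a sketch under assumptions: `build_json_output` and its parameters are hypothetical names, and `fields` is derived here from the record keys in order of first appearance, which the source does not specify.

```python
import json
from datetime import datetime, timezone


def build_json_output(records, source, title, confidence, notes=(), when=None):
    """Serialize records plus the metadata block shown above."""
    # `when` should be timezone-aware; it is converted to UTC for the Z suffix.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    fields = []
    for record in records:
        for key in record:
            if key not in fields:
                fields.append(key)
    document = {
        "metadata": {
            "source": source,
            "title": title,
            "extractedAt": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "itemCount": len(records),
            "confidence": confidence,
            "fields": fields,
            "notes": list(notes),
        },
        "data": list(records),
    }
    # ensure_ascii=False keeps non-ASCII text readable in the output.
    return json.dumps(document, indent=2, ensure_ascii=False)
```

Deriving `itemCount` from the data rather than passing it in avoids the metadata drifting out of sync with the records.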

JSON Key Naming

| Rule                 | Example                      |
|:---------------------|:-----------------------------|
| camelCase            | `productName`, `unitPrice`   |
| Numbers stay numeric | `29.99` not `"29.99"`        |
| Booleans stay boolean| `true` not `"true"`          |
| Missing = null       | `null` not `""` or `"N/A"`   |
| Arrays for multiples | `"tags": ["sale", "new"]`    |
| ISO-8601 for dates   | `"2026-02-25T14:30:00Z"`     |
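
These rules can be applied as a normalization pass over scraped records before serialization. The sketch below is illustrative only: the helper names are hypothetical, and the numeric conversion is deliberately simple (it would also convert identifiers such as ZIP codes, so real use would need a per-field allowlist).

```python
import re

MISSING_MARKERS = {"", "N/A"}
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def to_camel_case(key):
    """Convert 'Unit Price' or 'unit_price' to 'unitPrice'."""
    parts = [part for part in re.split(r"[^0-9A-Za-z]+", key) if part]
    if not parts:
        return key
    # Lowercase all-caps words such as 'SKU'; keep existing camelCase intact.
    words = [part.lower() if part.isupper() else part for part in parts]
    head = words[0][0].lower() + words[0][1:]
    return head + "".join(word[0].upper() + word[1:] for word in words[1:])


def normalize_value(value):
    """Map scraped strings to JSON-native values (null, bool, number)."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    if text in MISSING_MARKERS:
        return None
    if text in ("true", "false"):
        return text == "true"
    if NUMBER.fullmatch(text):
        return float(text) if "." in text else int(text)
    return text


def normalize_record(record):
    return {to_camel_case(key): normalize_value(value) for key, value in record.items()}
```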

Nested JSON (Product with Details)

{
  "metadata": { "..." : "..." },
  "data": [
    {
      "name": "Laptop Pro X",
      "brand": "TechCo",
      "pricing": {
        "current": 999.99,
        "original": 1299.99,
        "currency": "USD",
        "discount": "23%"
      },
      "rating": {
        "score": 4.5,
        "count": 1234
      },
      "specifications": {
        "processor": "M3 Pro",
        "ram": "16 GB",
        "storage": "512 GB SSD",
        "display": "14.2 inch Retina"
      },
      "availability": {
        "inStock": true,
        "shipsIn": "2-3 business days"
      }
    }
  ]
}

Multi-URL JSON

{
  "metadata": {
    "sources": [
      "https://store-a.com/laptop-x",
      "https://store-b.com/laptop-x"
    ],
    "extractedAt": "2026-02-25T14:30:00Z",
    "itemCount": 2,
    "confidence": "HIGH"
  },
  "data": [
    {
      "source": "store-a.com",
      "name": "Laptop X",
      "price": 999,
      "currency": "USD",
      "rating": 4.3
    },
    {
      "source": "store-b.com",
      "name": "Laptop X",
      "price": 949,
      "currency": "USD",
      "rating": 4.5
    }
  ]
}

CSV Format

Standard CSV

# Source: https://example.com/products
# Extracted: 2026-02-25 14:30 UTC
# Items: 3 | Confidence: HIGH
name,price,currency,rating,availability
"Product Alpha",29.99,USD,4.5,"In Stock"
"Product Beta",49.99,USD,4.2,"In Stock"
"Product Gamma",119.00,USD,4.8,"Pre-order"

CSV Rules

| Rule                              | Example                 |
|:----------------------------------|:------------------------|
| Always include header row         | `name,price,rating`     |
| Quote fields with commas          | `"Smith, John"`         |
| Quote fields with quotes (escape) | `"He said ""hello"""`   |
| Quote fields with newlines        | `"Line 1\nLine 2"`      |
| UTF-8 encoding with BOM           | `\xEF\xBB\xBF` prefix   |
| Comma delimiter (standard)        | `,`                     |
| Metadata as comments (# prefix)   | `# Source: URL`         |
| null/missing as empty field       | `field1,,field3`        |
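
Python's standard `csv` module already implements the quoting and escaping rules, so a writer mostly needs to add the comment lines and the BOM. The following is a sketch with a hypothetical `to_csv` helper. Note that `csv.QUOTE_MINIMAL` quotes a field only when it contains a comma, quote, or newline, whereas the examples in this section also quote plain text fields.

```python
import csv
import io


def to_csv(rows, fields, comments=()):
    """Return CSV bytes: BOM, '#' comment lines, header row, then data rows."""
    buffer = io.StringIO()
    for comment in comments:
        buffer.write(f"# {comment}\n")
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    for row in rows:
        # Missing or None values become empty fields (field1,,field3).
        writer.writerow(["" if row.get(field) is None else row[field] for field in fields])
    # The 'utf-8-sig' codec prepends the UTF-8 byte order mark.
    return buffer.getvalue().encode("utf-8-sig")
```

Many CSV parsers do not treat `#` lines as comments, so a consumer may need to skip them explicitly (for example with the `comment="#"` option of `pandas.read_csv`).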

Multi-URL CSV

# Sources: store-a.com, store-b.com
# Extracted: 2026-02-25 14:30 UTC
source,name,price,currency,rating
"store-a.com","Laptop X",999,USD,4.3
"store-b.com","Laptop X",949,USD,4.5

Summary Statistics Template

When extracted data contains numeric fields, include a summary block:

### Summary Statistics

| Metric    | Price     | Rating |
|:----------|----------:|-------:|
| Count     |        47 |     47 |
| Min       |    $12.99 |    2.1 |
| Max       |   $299.99 |    5.0 |
| Average   |    $67.42 |    4.1 |
| Median    |    $54.99 |    4.3 |

Include only when:

- Data has numeric columns
- More than 5 items extracted
- User would likely benefit from aggregate view (prices, ratings, quantities)
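
The five metrics in the table can be computed with the standard `statistics` module. A minimal sketch follows; the function name and the `min_items` parameter are assumptions, with the default of 6 encoding the "more than 5 items" condition.

```python
from statistics import mean, median


def summarize(values, min_items=6):
    """Return count/min/max/average/median, or None if there is too little data."""
    # Keep real numbers only; bool is excluded because it subclasses int.
    numbers = [
        value for value in values
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    if len(numbers) < min_items:
        return None
    return {
        "count": len(numbers),
        "min": min(numbers),
        "max": max(numbers),
        "average": round(mean(numbers), 2),
        "median": median(numbers),
    }
```

Whether the user would benefit from an aggregate view is a judgment call that this function does not attempt to make.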

Contact Data Template

| Name           | Title              | Email                | Phone          |
|:---------------|:-------------------|:---------------------|:---------------|
| Jane Smith     | CEO                | jane@example.com     | +1-555-0101    |
| John Doe       | CTO                | john@example.com     | +1-555-0102    |
| Alice Johnson  | VP Engineering     | alice@example.com    | N/A            |

Article Extraction Template

## Article: [Title]

**Author:** Author Name
**Published:** YYYY-MM-DD
**Source:** [Site Name](URL)

### Summary
[2-3 sentence summary of the article content]

### Key Data Points
- [Factual data point 1]
- [Factual data point 2]
- [Statistical finding]

### Tags
`tag1` `tag2` `tag3`

Note: Summarize article content. Do not reproduce full article text due to copyright.


FAQ Extraction Template

### FAQ: [Page Title]

**Source:** [Site Name](URL)
**Items:** 12 questions

| # | Question | Answer (excerpt) |
|--:|:---------|:-----------------|
| 1 | How do I reset my password? | Navigate to Settings > Security and click "Reset..." |
| 2 | What payment methods do you accept? | We accept Visa, Mastercard, PayPal, and bank transfer... |

Or as JSON (default for FAQ mode):

{
  "metadata": { "source": "URL", "itemCount": 12, "confidence": "HIGH" },
  "data": [
    { "question": "How do I reset my password?", "answer": "Navigate to...", "category": "Account" },
    { "question": "What payment methods?", "answer": "We accept...", "category": "Billing" }
  ]
}

Pricing Plans Template

### Pricing: [Product Name]

**Source:** [Site Name](URL)
**Plans:** 3 tiers

| Plan        | Monthly   | Annual    | Highlighted |
|:------------|----------:|----------:|:-----------:|
| Starter     |    $9/mo  |   $7/mo   |             |
| Pro         |   $29/mo  |  $24/mo   |     *       |
| Enterprise  |  Custom   |  Custom   |             |

#### Feature Comparison

| Feature               | Starter | Pro | Enterprise |
|:----------------------|:-------:|:---:|:----------:|
| Users                 | 1       | 10  | Unlimited  |
| Storage               | 5 GB    | 50 GB | Unlimited |
| API Access            | N/A     | Yes | Yes        |
| Priority Support      | N/A     | N/A | Yes        |

Job Listings Template

| Title              | Company     | Location       | Salary          | Type      | Posted     |
|:-------------------|:------------|:---------------|:----------------|:----------|:-----------|
| Senior Engineer    | TechCo      | Remote, US     | $150k - $200k   | Full-time | 2026-02-20 |
| Product Manager    | StartupXYZ  | San Francisco  | $130k - $160k   | Full-time | 2026-02-18 |
| Data Analyst       | DataCorp    | London, UK     | GBP 55k - 70k   | Contract  | 2026-02-22 |

Events Template

| Event                  | Date       | Time    | Location          | Speakers       |
|:-----------------------|:-----------|:--------|:------------------|:---------------|
| Opening Keynote        | 2026-03-15 | 09:00   | Main Hall         | J. Smith       |
| Workshop: AI Basics    | 2026-03-15 | 14:00   | Room 201          | A. Johnson     |
| Networking Reception   | 2026-03-15 | 18:00   | Rooftop Lounge    | N/A            |

Differential (Diff) Output Template

When comparing current extraction with a previous run:

## Extraction Results (Diff)

**Source:** [Page Title](URL)
**Date:** 2026-02-25 14:30 UTC
**Compared to:** 2026-02-20 10:00 UTC
**Changes:** +5 new, -2 removed, 3 modified

---

### New Items (+5)

| Name           | Price    | Rating |
|:---------------|--------:|:------:|
| Product Eta    |  $39.99 |  4.6   |
| Product Theta  |  $24.99 |  4.1   |
| ...            |         |        |

### Removed Items (-2)

| Name           | Price    | Rating |
|:---------------|--------:|:------:|
| ~~Product Alpha~~ | ~~$29.99~~ | ~~4.5~~ |
| ~~Product Beta~~  | ~~$49.99~~ | ~~4.2~~ |

### Modified Items (3)

| Name           | Field   | Was        | Now        |
|:---------------|:--------|:-----------|:-----------|
| Product Gamma  | Price   | $119.00    | $109.00    |
| Product Gamma  | Rating  | 4.8        | 4.9        |
| Product Delta  | Stock   | Out of Stock | In Stock |

---

**Summary:**
- 5 new products added since last extraction
- 2 products removed (possibly discontinued)
- Product Gamma had a price drop of $10 and rating increase
- Product Delta is back in stock
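
The three sections of the diff can be derived by keying both runs on a stable identifier. The sketch below assumes records are dicts and that `name` is unique within a run; the function name `diff_runs` is hypothetical.

```python
def diff_runs(previous, current, key="name"):
    """Compare two extraction runs and return (added, removed, modified)."""
    old = {item[key]: item for item in previous}
    new = {item[key]: item for item in current}
    added = [new[k] for k in new if k not in old]
    removed = [old[k] for k in old if k not in new]
    modified = []
    for k, item in new.items():
        if k not in old:
            continue
        # One entry per changed field, matching the Field/Was/Now table.
        for field, now in item.items():
            was = old[k].get(field)
            if field != key and was != now:
                modified.append({"name": k, "field": field, "was": was, "now": now})
    return added, removed, modified
```

If names can change between runs, a more stable key such as a SKU or URL would be a safer choice.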

Error / Partial Result Template

When extraction partially fails:

## Extraction Results (Partial)

**Source:** [Page Title](URL)
**Date:** 2026-02-25 14:30 UTC
**Items:** 23 of ~50 expected records
**Confidence:** LOW
**Strategy:** A (WebFetch) -> escalated to B (Browser)

---

[PARTIAL DATA]

---

**Issues:**
- 27 items could not be extracted (content behind JS rendering)
- Price field missing for 5 items (marked N/A)
- Auto-escalation from WebFetch to Browser recovered 15 additional items

**Suggestions:**
- Re-run with explicit Browser automation for complete results
- Check if site has an API endpoint for direct data access
- Try at a different time if rate-limited