Ahmed Rehan 2f55f046b9 feat: add 12 official Apify agent-skills for web scraping & data extraction (#165)
* feat: add 12 official Apify skills for web scraping and data extraction

Add the complete Apify agent-skills collection as official vendor skills,
bringing the total skill count from 954 to 966.

New skills:
- apify-actor-development: Develop, debug, and deploy Apify Actors
- apify-actorization: Convert existing projects into Apify Actors
- apify-audience-analysis: Audience demographics across social platforms
- apify-brand-reputation-monitoring: Track reviews, ratings, and sentiment
- apify-competitor-intelligence: Analyze competitor strategies and pricing
- apify-content-analytics: Track engagement metrics and campaign ROI
- apify-ecommerce: E-commerce data scraping for pricing intelligence
- apify-influencer-discovery: Find and evaluate influencers
- apify-lead-generation: B2B/B2C lead generation from multiple platforms
- apify-market-research: Market conditions and geographic opportunities
- apify-trend-analysis: Discover emerging trends across platforms
- apify-ultimate-scraper: Universal AI-powered web scraper

Existing skill fixes:
- design-orchestration: Add missing description, fix markdown list spacing
- multi-agent-brainstorming: Add missing description, fix markdown list spacing

Registry and documentation updates:
- Update skill count to 966+ across README.md, README.vi.md
- Add Apify to official sources in SOURCES.md and all README variants
- Register new skills in catalog.json, skills_index.json, bundles.json, aliases.json
- Update CATALOG.md category counts (data-ai: 152, infrastructure: 95)

Validation script improvements:
- Raise description length limit from 200 to 1024 characters
- Add empty description validation check
- Apply PEP 8 formatting (line length, spacing, trailing whitespace)

* refactor: truncate skill descriptions in SKILL.md files and revert  description length validation to 200 characters.

* feat: Add `apify-ultimate-scraper` to data-ai and move `apify-lead-generation` from business to general categories.
2026-03-01 10:02:50 +01:00

JavaScript/TypeScript Actorization

Install the Apify SDK

npm install apify

Wrap Main Code with Actor Lifecycle

import { Actor } from 'apify';

// Initialize connection to Apify platform
await Actor.init();

// ============================================
// Your existing code goes here
// ============================================

// Example: Get input from Apify Console or API
const input = await Actor.getInput();
console.log('Input:', input);

// Example: Your crawler or processing logic
// const crawler = new PlaywrightCrawler({ ... });
// await crawler.run([input.startUrl]);

// Example: Push results to dataset
// await Actor.pushData({ result: 'data' });

// ============================================
// End of your code
// ============================================

// Graceful shutdown
await Actor.exit();

Key Points

  • Actor.init() configures storage to use the Apify API when running on the platform
  • Actor.exit() handles graceful shutdown and cleanup
  • Both calls must be awaited
  • Local execution remains unchanged: the SDK automatically detects the environment
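
The points above can be combined with explicit failure handling. The following is a minimal sketch, not a pattern prescribed by the SDK: `runWithLifecycle` is a hypothetical helper, and its `actor` parameter is assumed to be the `Actor` object imported from `apify` (only `init`, `getInput`, `exit`, and `fail` are used).

```javascript
// Hypothetical helper: wraps existing logic in the Actor lifecycle.
// `actor` is assumed to be the `Actor` export from the `apify` package;
// it is passed in so the helper can also be exercised with a stub.
async function runWithLifecycle(actor, main) {
    await actor.init();
    try {
        const input = await actor.getInput();
        await main(input);
    } catch (err) {
        // Mark the run as failed with a readable status message.
        await actor.fail(`Run failed: ${err.message}`);
        return;
    }
    // Graceful shutdown after a successful run.
    await actor.exit();
}
```

In an Actor this could be called as `await runWithLifecycle(Actor, async (input) => { /* existing code */ })`, so that every awaited lifecycle call lives in one place.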

Crawlee Projects

Crawlee projects require minimal changes. Wrap the existing crawler in the Actor lifecycle:

import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';

await Actor.init();

// Get input and apply defaults (input is null when none is provided)
const input = await Actor.getInput();
const {
    startUrl = 'https://example.com',
    maxItems = 100,
} = input ?? {};

let itemCount = 0;

const crawler = new PlaywrightCrawler({
    requestHandler: async ({ page, request, pushData }) => {
        if (itemCount >= maxItems) return;

        const title = await page.title();
        await pushData({ url: request.url, title });
        itemCount++;
    },
});

await crawler.run([startUrl]);

await Actor.exit();
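
The destructuring in the example applies defaults but does not check the values it receives. If stricter checking is wanted, one possible approach is sketched below. `normalizeInput` is a hypothetical helper, and the rules it enforces (http(s) URLs only, positive integer `maxItems`) are assumptions rather than requirements of the SDK.

```javascript
// Hypothetical helper: applies defaults and checks types for the two
// input fields used in the crawler example.
function normalizeInput(input) {
    const { startUrl = 'https://example.com', maxItems = 100 } = input ?? {};

    // `new URL()` throws a TypeError on strings that are not absolute URLs.
    const url = new URL(startUrl);
    if (url.protocol !== 'http:' && url.protocol !== 'https:') {
        throw new Error(`startUrl must be http(s), got ${url.protocol}`);
    }
    if (!Number.isInteger(maxItems) || maxItems < 1) {
        throw new Error(`maxItems must be a positive integer, got ${maxItems}`);
    }
    return { startUrl: url.href, maxItems };
}
```

The result can replace the inline destructuring: `const { startUrl, maxItems } = normalizeInput(await Actor.getInput());`.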

Express/HTTP Servers

For web servers, enable standby mode in actor.json:

{
    "actorSpecification": 1,
    "name": "my-api",
    "usesStandbyMode": true
}

Then implement a readiness probe. See standby-mode.md.
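
As a rough illustration of what a readiness probe handler can look like, here is a sketch written against Node's plain request/response objects. The header name is an assumption based on the platform's documented probe behaviour and should be confirmed in standby-mode.md; `handleRequest` and the response bodies are hypothetical.

```javascript
// Assumption: the platform marks readiness probes with this request header.
// Confirm the exact name in standby-mode.md before relying on it.
const READINESS_HEADER = 'x-apify-container-server-readiness-probe';

// Request handler compatible with Node's `http.createServer(handler)`.
function handleRequest(req, res) {
    if (req.headers[READINESS_HEADER] !== undefined) {
        // Answer probes quickly, without running any application logic.
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end('ready');
        return;
    }
    // Regular traffic: placeholder response for the real application logic.
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ message: 'Hello from my-api' }));
}
```

The handler can be passed to `http.createServer(handleRequest)` from `node:http`, or the same header check can be placed in an Express middleware ahead of the application routes.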

Batch Processing Scripts

import { Actor } from 'apify';

await Actor.init();

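const input = await Actor.getInput();
// Input is null when the run is started without any input
const items = input?.items ?? [];

for (const item of items) {
    // processItem is your existing function; await also works if it is synchronous
    const result = await processItem(item);
    await Actor.pushData(result);
}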

await Actor.exit();
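
In the loop above, a single throwing item aborts the whole batch. If partial results are preferable, a variant along the following lines records failures and continues. This is only a sketch: `processBatch` is a hypothetical helper, and both `processItem` and `pushData` are supplied by the caller.

```javascript
// Hypothetical helper: processes items one by one and records failures
// instead of aborting the whole batch on the first error.
// `processItem` and `pushData` are supplied by the caller
// (for example, your own function and a wrapper around Actor.pushData).
async function processBatch(items, processItem, pushData) {
    let succeeded = 0;
    const failed = [];
    for (const item of items) {
        try {
            const result = await processItem(item);
            await pushData(result);
            succeeded++;
        } catch (err) {
            // Keep the failing item and the reason for later inspection.
            failed.push({ item, error: err.message });
        }
    }
    return { succeeded, failed };
}
```

Inside an Actor it could be called as `await processBatch(items, processItem, (data) => Actor.pushData(data))`, with the returned summary logged before `Actor.exit()`.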