* feat: add 12 official Apify skills for web scraping and data extraction

  Add the complete Apify agent-skills collection as official vendor skills, bringing the total skill count from 954 to 966.

  New skills:

  - apify-actor-development: Develop, debug, and deploy Apify Actors
  - apify-actorization: Convert existing projects into Apify Actors
  - apify-audience-analysis: Audience demographics across social platforms
  - apify-brand-reputation-monitoring: Track reviews, ratings, and sentiment
  - apify-competitor-intelligence: Analyze competitor strategies and pricing
  - apify-content-analytics: Track engagement metrics and campaign ROI
  - apify-ecommerce: E-commerce data scraping for pricing intelligence
  - apify-influencer-discovery: Find and evaluate influencers
  - apify-lead-generation: B2B/B2C lead generation from multiple platforms
  - apify-market-research: Market conditions and geographic opportunities
  - apify-trend-analysis: Discover emerging trends across platforms
  - apify-ultimate-scraper: Universal AI-powered web scraper

  Existing skill fixes:

  - design-orchestration: Add missing description, fix markdown list spacing
  - multi-agent-brainstorming: Add missing description, fix markdown list spacing

  Registry and documentation updates:

  - Update skill count to 966+ across README.md, README.vi.md
  - Add Apify to official sources in SOURCES.md and all README variants
  - Register new skills in catalog.json, skills_index.json, bundles.json, aliases.json
  - Update CATALOG.md category counts (data-ai: 152, infrastructure: 95)

  Validation script improvements:

  - Raise description length limit from 200 to 1024 characters
  - Add empty description validation check
  - Apply PEP 8 formatting (line length, spacing, trailing whitespace)

* refactor: truncate skill descriptions in SKILL.md files and revert description length validation to 200 characters.

* feat: Add `apify-ultimate-scraper` to data-ai and move `apify-lead-generation` from business to general categories.
# Python Actorization

## Install the Apify SDK

```shell
pip install apify
```
## Wrap the Main Function with the Actor Context Manager

```python
import asyncio

from apify import Actor


async def main() -> None:
    async with Actor:
        # ============================================
        # Your existing code goes here
        # ============================================

        # Example: Get input from Apify Console or API
        actor_input = await Actor.get_input()
        print(f'Input: {actor_input}')

        # Example: Your crawler or processing logic
        # crawler = PlaywrightCrawler(...)
        # await crawler.run([actor_input.get('startUrl')])

        # Example: Push results to dataset
        # await Actor.push_data({'result': 'data'})

        # ============================================
        # End of your code
        # ============================================


if __name__ == '__main__':
    asyncio.run(main())
```
## Key Points

- `async with Actor:` handles both initialization and cleanup
- Automatically manages platform event listeners and graceful shutdown
- Local execution remains unchanged; the SDK automatically detects the environment
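The init-then-cleanup guarantee above comes from Python's async context manager protocol. A minimal sketch of that mechanism, using a hypothetical `ActorLike` class (not the real Apify SDK), shows why cleanup runs even when the wrapped code fails:

```python
import asyncio


class ActorLike:
    """Hypothetical stand-in illustrating the async context manager
    pattern that `async with Actor:` relies on; not the real SDK."""

    def __init__(self):
        self.events = []

    async def __aenter__(self):
        # Initialization: e.g. connect to the platform, register listeners
        self.events.append('init')
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Cleanup runs even if the body raised, enabling graceful shutdown
        self.events.append('cleanup')
        return False  # do not swallow exceptions


async def demo() -> list:
    actor = ActorLike()
    try:
        async with actor:
            actor.events.append('work')
            raise RuntimeError('simulated failure')
    except RuntimeError:
        pass
    return actor.events


print(asyncio.run(demo()))  # → ['init', 'work', 'cleanup']
```

The real `Actor` does far more in `__aexit__` (flushing storages, reporting exit status), but the ordering guarantee is the same.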
## Crawlee Python Projects

```python
import asyncio

from apify import Actor
from crawlee.playwright_crawler import PlaywrightCrawler


async def main() -> None:
    async with Actor:
        # Get and validate input
        actor_input = await Actor.get_input() or {}
        start_url = actor_input.get('startUrl', 'https://example.com')
        max_items = actor_input.get('maxItems', 100)

        item_count = 0

        async def request_handler(context):
            nonlocal item_count
            if item_count >= max_items:
                return
            title = await context.page.title()
            await context.push_data({'url': context.request.url, 'title': title})
            item_count += 1

        crawler = PlaywrightCrawler(request_handler=request_handler)
        await crawler.run([start_url])


if __name__ == '__main__':
    asyncio.run(main())
```
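The `nonlocal item_count` guard in the handler above is a plain closure pattern, independent of Crawlee. A self-contained sketch (hypothetical `make_capped_handler` helper, not part of any SDK) shows how the shared counter caps the number of recorded items:

```python
def make_capped_handler(max_items: int):
    """Return a handler that stops recording once max_items is reached,
    mirroring the nonlocal-counter guard used in the crawler above."""
    item_count = 0
    results = []

    def handler(url: str, title: str) -> None:
        nonlocal item_count
        if item_count >= max_items:
            return  # cap reached; ignore further items
        results.append({'url': url, 'title': title})
        item_count += 1

    return handler, results


handler, results = make_capped_handler(max_items=2)
for i in range(5):
    handler(f'https://example.com/{i}', f'Page {i}')

print(len(results))  # → 2
```

In the real crawler the counter is only approximate under concurrency, since several handlers may pass the check before any of them increments it; an exact cap would need a lock or the crawler's own limit options.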
## Batch Processing Scripts

```python
import asyncio

from apify import Actor


def process_item(item: dict) -> dict:
    # Placeholder: replace with your own per-item logic
    return {'input': item, 'processed': True}


async def main() -> None:
    async with Actor:
        actor_input = await Actor.get_input() or {}
        items = actor_input.get('items', [])

        for item in items:
            result = process_item(item)
            await Actor.push_data(result)


if __name__ == '__main__':
    asyncio.run(main())
```
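A batch loop like this usually wants per-item error handling so one bad record does not abort the whole run. A minimal pure-Python sketch (with the dataset push replaced by a list append, and a hypothetical `run_batch` helper) might look like:

```python
def run_batch(items, handler):
    """Process items one by one, isolating per-item failures.

    `handler` stands in for your per-item logic; in the real Actor the
    successful results would go to Actor.push_data instead of a list.
    """
    ok, failed = [], []
    for item in items:
        try:
            ok.append(handler(item))
        except Exception as exc:  # one bad item must not kill the batch
            failed.append({'item': item, 'error': str(exc)})
    return ok, failed


def double(item):
    return item * 2  # raises TypeError for None


ok, failed = run_batch([1, 2, None, 4], double)
print(ok, len(failed))  # → [2, 4, 8] 1
```

Recording failures alongside successes (rather than silently skipping) lets you push both to the dataset and inspect bad records after the run.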