Harvesting vs Scraping: Building Both Sides in Rust with Ares and Ceres

Two Rust projects, one conceptual divide. Ares fetches arbitrary web pages and uses LLMs to extract structured data; Ceres harvests metadata from CKAN portals and indexes it semantically. Together they show what it looks like to move from scraping scripts to production data pipelines.

February 20, 2026 · 14 min · 2907 words · Andrea Bozzo

Ceres: Semantic Search for Open Data

Ceres is a semantic search engine for CKAN portals. Built in Rust with Tokio and PostgreSQL+pgvector, it bridges the gap between how people search and how public administrations name their datasets.

December 20, 2025 · 7 min · 1454 words · Andrea Bozzo