
Pipet

Pipet is a developer-focused command-line web scraping and data extraction tool supporting HTML, JSON and Playwright query modes.

Detailed Introduction

Pipet, created by bjesus, is a command-line web scraping and data extraction tool designed as a lightweight "Swiss Army knife" for scraping. It supports three query modes: HTML (CSS selectors), JSON (GJSON syntax), and Playwright (client-side JavaScript execution). Resources are retrieved with curl or, in Playwright mode, with a Playwright-driven browser. Scraping workflows are described in .pipet query scripts, and Pipet combines Unix-style pipes with template rendering to produce plain text, JSON, or templated output that can be consumed directly in terminals, automation scripts, or CI pipelines.
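To make the HTML query mode more concrete, here is a minimal Python sketch of the underlying idea: pick elements out of a page and return their text. It is not Pipet's implementation (Pipet is written in Go and accepts full CSS selectors); it only matches elements by a single class name using the standard library. The names ClassTextExtractor and extract_by_class, and the sample markup, are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser
from typing import List

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given class name."""

    def __init__(self, class_name: str) -> None:
        super().__init__()
        self._class_name = class_name
        self._depth = 0  # > 0 while the parser is inside a matching element
        self._buffer: List[str] = []
        self.results: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            # Nested element inside a match: track depth so we know when the match ends.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._class_name in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_by_class(html: str, class_name: str) -> List[str]:
    """Return the text of each element whose class attribute contains class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up sample markup, used only to exercise the sketch.
SAMPLE = (
    '<ul>'
    '<li class="title"><a href="#">First</a></li>'
    '<li class="other">Skip</li>'
    '<li class="title big">Second</li>'
    '</ul>'
)
print(extract_by_class(SAMPLE, "title"))  # prints ['First', 'Second']
```

A real CSS selector engine also handles combinators, attribute selectors, and pseudo-classes, which this sketch deliberately leaves out.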

Main Features

  • Multi-mode scraping: native support for HTML, JSON, and Playwright query types.
  • CLI-first design: works alongside curl, pipes, and common Unix tools.
  • Flexible outputs: supports plain text, JSON export, and template rendering.
  • Multiple distribution channels: available via releases, Homebrew, AUR, and Nix packages.

Use Cases

  • Monitor web pages for changes and trigger notifications or commands on updates.
  • Extract structured data from complex pages or APIs and export to CSV/JSON for analysis.
  • Rapidly prototype and validate scraping rules during development and testing.
  • Integrate lightweight scraping into CI/automation pipelines to fetch live metrics or statuses.
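The change-monitoring use case above can be sketched as follows. This is a minimal Python illustration of the general technique (fingerprint each scrape result and react when the fingerprint differs from the previous one), not a description of how Pipet itself detects changes. ChangeWatcher, fingerprint, and the sample strings are hypothetical names and data introduced for this example.

```python
import hashlib
from typing import Callable, Optional


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 digest of the extracted text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class ChangeWatcher:
    """Remember the last fingerprint and fire a callback when it changes."""

    def __init__(self, on_change: Callable[[str], None]) -> None:
        self._last: Optional[str] = None
        self._on_change = on_change

    def observe(self, extracted: str) -> bool:
        """Record one scrape result; return True if it differs from the previous one.

        The very first observation only sets the baseline and never counts as a change.
        """
        digest = fingerprint(extracted)
        changed = self._last is not None and digest != self._last
        self._last = digest
        if changed:
            self._on_change(extracted)
        return changed
```

In practice, each value passed to observe would be the output of one scraping run, and the callback would send a notification or run a command.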

Technical Features

  • Implemented in Go, producing standalone, low-overhead binaries with fast startup.
  • Uses curl for plain HTTP retrieval and Playwright for pages that need client-side JavaScript execution.
  • Uses GJSON for efficient JSON path queries, simplifying processing of nested API responses.
  • Well-documented repository with examples for installation and common usage patterns.
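To illustrate what a JSON path query does, here is a small Python sketch of dot-separated path lookup. It mimics only the simplest form of a GJSON-style path (object keys and numeric array indexes joined by dots) and is not GJSON itself, which is a Go library with a much richer syntax. The function get_path and the sample payload are assumptions made for this example.

```python
import json
from typing import Any


def get_path(document: Any, path: str) -> Any:
    """Walk document along a dot-separated path; return None if any step is missing."""
    current = document
    for segment in path.split("."):
        if isinstance(current, dict):
            if segment not in current:
                return None
            current = current[segment]
        elif isinstance(current, list):
            # Numeric segments index into arrays.
            if not segment.isdigit() or int(segment) >= len(current):
                return None
            current = current[int(segment)]
        else:
            # Reached a scalar while path segments remain.
            return None
    return current


# Made-up API response, used only to exercise the sketch.
payload = json.loads('{"current_condition": [{"FeelsLikeC": "7", "FeelsLikeF": "45"}]}')
print(get_path(payload, "current_condition.0.FeelsLikeC"))  # prints 7
```

Returning None for a missing step keeps the lookup safe on nested API responses whose shape varies between calls.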