What is document chunking and why do I need it?
Document chunking splits large documents into smaller, semantically meaningful pieces for AI processing. LLMs have context window limits (typically 8K-128K tokens), so a 500-page PDF must be chunked before it can be searched or analyzed. Poor chunking loses context, and lost context produces bad AI responses.
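To make the idea concrete, here is a minimal sketch of the simplest possible strategy: fixed-size character splitting with a small overlap. This is an illustration only, not this service's implementation; the function name and parameters are hypothetical.

```python
# Illustrative only: the simplest chunking strategy, a fixed-size
# character splitter with overlap (hypothetical helper, not this
# service's actual algorithm).
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of roughly chunk_size characters.

    Overlapping the boundaries preserves some context that a hard
    cut would otherwise lose mid-sentence.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```

Even this naive version shows why chunking matters: each piece fits a model's context window, and the overlap keeps boundary sentences from being severed entirely.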
How does payment work?
We accept USDC on Solana with two payment methods:
1. Manual (Web UI): Connect your Phantom wallet on this page, select a paid tier, and approve the USDC transfer. The transaction signature is automatically sent with your file.
2. Programmatic (x402 API): For AI agents and developers. Call /estimate to get pricing, execute a USDC transfer on Solana, then include the TX signature in the X-PAYMENT header when calling the chunking endpoint.
Payment is per page, based on document size (~500 characters = 1 page). The Demo tier is free, with a limit of 100 pages per day.
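The per-page arithmetic can be sketched as follows. The 500-characters-per-page rule and the tier rates come from this FAQ; the function itself is a hypothetical helper (a zero-length document simply yields zero pages here).

```python
import math

# Per-page rates in USD, taken from the tier list on this page.
RATES_USD = {"demo": 0.0, "standard": 0.001, "professional": 0.008}

def estimate_cost(char_count: int, tier: str) -> tuple[int, float]:
    """Return (pages, cost_usd) for a document of char_count characters,
    at ~500 characters per page."""
    pages = math.ceil(char_count / 500)
    return pages, pages * RATES_USD[tier]
```

For example, a 1,200-character document rounds up to 3 pages, so Standard-tier processing costs about $0.003.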
What's the difference between the tiers?
Demo (Free): Basic character splitting. Good for testing.
Standard ($0.001/page): Sentence-aware chunking with semantic boundaries. Best for most use cases.
Professional ($0.008/page): Adds context injection and entity extraction. When a chunk mentions "He," we prepend the name "He" refers to.
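The Standard tier's sentence-aware behavior can be approximated like this. This is a sketch under simple assumptions (a naive punctuation-based sentence boundary), not the service's actual algorithm:

```python
import re

def sentence_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Pack whole sentences into chunks of up to max_chars characters.

    Unlike plain character splitting, a sentence is never cut in half,
    so each chunk stays semantically coherent.
    """
    # Naive boundary rule: sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

The difference from the Demo tier's character splitting is that chunk boundaries land between sentences rather than in the middle of one.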
What file formats are supported?
Currently: PDF and TXT files. PDFs must be text-based (not scanned images). We extract text and calculate pages at ~500 characters per page.
Can AI agents use this API autonomously?
Yes! That's what we're built for. Agents can call /estimate to get pricing, execute a USDC transfer on Solana, then call the chunking endpoint with the TX signature in the X-PAYMENT header. Fully autonomous, no human intervention needed.
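A minimal agent-side sketch of that flow is below. The /estimate path and the X-PAYMENT header come from this FAQ; the chunking endpoint path, the response field names, and the injected `post`/`pay_usdc` callables are assumptions, stubbed so the request assembly is visible end to end.

```python
from typing import Any, Callable

def chunk_document(
    post: Callable[..., Any],          # any requests.post-like callable (injected)
    pay_usdc: Callable[[float], str],  # agent's own Solana USDC transfer, returns TX signature
    file_bytes: bytes,
    tier: str = "standard",
) -> Any:
    """Autonomous flow: estimate -> pay -> chunk.

    Endpoint paths other than /estimate and all field names are
    hypothetical; adapt them to the service's actual API.
    """
    # 1. Ask the service what this document will cost.
    estimate = post("/estimate", files={"file": file_bytes}, data={"tier": tier}).json()
    # 2. Pay the quoted amount in USDC on Solana; keep the TX signature.
    tx_signature = pay_usdc(estimate["cost_usd"])
    # 3. Call the chunking endpoint with the signature in the X-PAYMENT header.
    response = post(
        "/chunk",
        files={"file": file_bytes},
        data={"tier": tier},
        headers={"X-PAYMENT": tx_signature},
    )
    return response.json()
```

Injecting the HTTP client and the payment routine keeps the sketch testable and lets an agent swap in its own wallet logic.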
Is my data stored or logged?
No. Documents are processed in memory and immediately discarded. We don't store your files, chunks, or content. Only basic request logs (IP, timestamp, file size) are kept for rate limiting and abuse prevention.