
Methodology

Last updated 2026-05-10 · Complete technical description of how lookx402 collects, decodes, and classifies x402 protocol activity on Base.

lookx402 is a passive observer. It does not run an x402 facilitator, hold any funds, accept paid integrations from labelled merchants, or operate any privileged infrastructure. Every datum on the site is derived from public Base mainnet logs that anyone can query with a free RPC endpoint.

This page is the long-form, citable version of the pipeline. The short version: we read two USDC events on Base every 5 minutes, match them by tx hash, decode the EIP-3009 authorizer correctly, aggregate by agent and merchant, and classify behavior with a deterministic hourly rule engine.

Table of contents

  1. What x402 is, exactly
  2. Path A — direct USDC.transferWithAuthorization
  3. Path B — Permit2 settle proxy
  4. Live indexing pipeline
  5. The payer-extraction gotcha
  6. Backfill methodology
  7. Deduplication and replay detection
  8. Profiles and dyads
  9. Behavioral classification (full rule table)
  10. Identity enrichment
  11. What we deliberately do not do
  12. Data freshness, lag, and SLAs
  13. Failure modes and recovery
  14. How to cite lookx402
  15. Source code, corrections, and contributions

1. What x402 is, exactly

x402 is a payment protocol introduced by Coinbase that lets autonomous programs (AI agents) pay for HTTP services without per-call human approval. The transport is HTTP 402 Payment Required with a structured envelope; settlement happens onchain. The dominant settlement path on Base is the EIP-3009 transferWithAuthorization family on USDC, which lets the agent sign a payment authorization off-chain that any third party (the facilitator) can submit on-chain.

From an indexer's point of view, an x402 payment is observable as one of two on-chain patterns: a direct transferWithAuthorization call on Base USDC, or a Permit2-proxied settle call. lookx402 monitors both, with the discovery filter implemented at the RPC layer (we never download the full mempool / full block history).

2. Path A — direct USDC.transferWithAuthorization (≈89% of volume)

The agent signs an EIP-3009 authorization using EIP-712 typed-data. The facilitator wraps it in a single tx that calls one of four selectors on the canonical Base USDC contract 0x833589fcd6edb6e08f4c7c32d4f71b54bda02913:

Selector    Function                                    Notes
0xe3ee160e  transferWithAuthorization                   Canonical, dominant in volume
0xcf092995  receiveWithAuthorization                    Merchant-pulls variant
0xef55bec6  transferWithAuthorization (typed-data v2)   Updated EIP-712 domain
0x88b7ab63  receiveWithAuthorization (typed-data v2)    Updated EIP-712 domain

Each call emits two USDC events:

  1. AuthorizationUsed(address indexed authorizer, bytes32 indexed nonce) — fires once per consumed authorization.
  2. Transfer(address indexed from, address indexed to, uint256 value) — the actual USDC movement.

lookx402 matches them by transaction hash, then writes one canonical row to the transactions table with (payer, merchant, amount, nonce, block_timestamp, tx_hash). The payer comes from AuthorizationUsed.topics[1], never from tx.from.

Event topic constants

// USDC AuthorizationUsed
keccak256("AuthorizationUsed(address,bytes32)") =
  0x98de503528ee59b575ef0c0a2576a82497bfc029a5685b209e9ec333479b10a5

// ERC20 Transfer
keccak256("Transfer(address,address,uint256)") =
  0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef

3. Path B — Permit2 settle proxy (≈0% measured volume)

A second variant routes x402 settlement through a Permit2-style proxy at 0x402085c248EeA27D92E8b30b2C58ed07f9E20001 ("x402ExactPermit2Proxy"). The contract takes a Permit2 signature instead of an EIP-3009 authorization and forwards the transfer. We watch this address for Transfer emissions to/from USDC but have observed effectively zero 30-day traffic. The discovery rule remains in the indexer as a safety net.

Why mention Path B if it's empty? Because the protocol allows it. If volume migrates here in the future, lookx402 picks it up automatically without a code change. We log "Path A" or "Path B" on every indexed tx so anyone querying the API can filter.

4. Live indexing pipeline

A Cloudflare Worker fires every 5 minutes via Wrangler cron */5 * * * *. The worker is stateless — it reads the last processed block from Supabase, computes a window, fires two parallel eth_getLogs requests, decodes, deduplicates, and upserts.

Window computation

  1. eth_getBlockByNumber("latest") to anchor the window.
  2. Read last_processed_block from Supabase indexer_state.
  3. Compute from = max(last_processed_block - 30, latest - 200). The 30-block safety overlap covers reorgs and late-arriving log indexes. The 200-block cap protects against runaway windows after extended downtime.
  4. Fire eth_getLogs for AuthorizationUsed on USDC over [from, latest].
  5. If any authorizations were found, fire a second eth_getLogs for the matching Transfer events filtered by topic1 ∈ {payers found in step 4}.
  6. Match by tx hash. Upsert into transactions. Update indexer_state.last_processed_block = latest.
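The window arithmetic in step 3 and the request in step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the worker's actual code: the function names compute_window and auth_logs_request are hypothetical, and only the 30-block overlap, the 200-block cap, the USDC address, and the topic hash come from this page.

```python
SAFETY_OVERLAP = 30   # blocks re-read on every cycle (from the text)
MAX_WINDOW = 200      # cap on how far back one cycle may reach (from the text)

USDC = "0x833589fcd6edb6e08f4c7c32d4f71b54bda02913"
AUTH_USED_TOPIC = "0x98de503528ee59b575ef0c0a2576a82497bfc029a5685b209e9ec333479b10a5"


def compute_window(last_processed_block: int, latest: int) -> tuple[int, int]:
    """Return the inclusive [from, to] block range for one cycle (step 3)."""
    start = max(last_processed_block - SAFETY_OVERLAP, latest - MAX_WINDOW)
    return max(start, 0), latest


def auth_logs_request(start: int, end: int) -> dict:
    """Build a JSON-RPC eth_getLogs body for AuthorizationUsed on USDC (step 4)."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getLogs",
        "params": [{
            "address": USDC,
            "fromBlock": hex(start),
            "toBlock": hex(end),
            "topics": [AUTH_USED_TOPIC],
        }],
    }
```

With last_processed_block = 1000 and latest = 1100 the overlap term wins and the window is [970, 1100]. With latest = 5000 the cap wins and the window starts at latest - 200 = 4800 rather than at the last processed block.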

RPC strategy — no paid provider

We rotate across four free public Base RPC endpoints with retry-on-error:

Each RPC call uses a short timeout (4 s) and falls through to the next provider on HTTP 429, 5xx, or socket error. The worker logs the active provider per cycle to indexer_logs for observability. No Alchemy, no QuickNode, no paid plan is required.
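The fall-through behaviour can be sketched like this. It is an illustration only: the transport is injected as a send callable so the rotation logic stays independent of any HTTP library, and RetryableRpcError and call_with_fallback are hypothetical names, not part of the lookx402 codebase.

```python
from typing import Any, Callable, Sequence


class RetryableRpcError(Exception):
    """Raised by the transport on HTTP 429, 5xx, timeout, or socket error."""


def call_with_fallback(
    providers: Sequence[str],
    payload: dict,
    send: Callable[[str, dict, float], Any],
    timeout_s: float = 4.0,  # short per-call timeout, as described above
) -> tuple[str, Any]:
    """Try each provider in order; return (provider_url, response) from the first success."""
    errors = []
    for url in providers:
        try:
            return url, send(url, payload, timeout_s)
        except RetryableRpcError as exc:
            # Remember why this provider failed, then fall through to the next one.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Returning the provider URL alongside the response makes it straightforward to log the active provider for each cycle.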

Per-cycle cost

Two log queries × ~150 blocks each ≈ 300 KB of JSON downloaded per cycle. At 12 cycles per hour × 24 hours = 288 cycles/day, daily egress is about 288 × 300 KB ≈ 86 MB, which stays under 100 MB and fits inside the free CF Worker tier.

5. The payer-extraction gotcha

The single most important detail on this page.

An obvious mistake is to read tx.from as the agent. It isn't. tx.from is the facilitator wallet (CDP, OpenFacilitator, Primer, etc.) that submitted the bundled authorization. The real payer is the EIP-3009 authorizer — found at topics[1] of the AuthorizationUsed event, which is also the first parameter of the call's calldata.

Leaderboards that don't decode this rank facilitators as the top agents and miss the actual machine-to-machine economy entirely. We have observed third-party analyses publish "top x402 agents" charts where the #1 spot is, in fact, the Coinbase Developer Platform facilitator — a deeply misleading conclusion.

A worked example

Consider an arbitrary x402 transaction. The naive extraction gives:

tx.from   = 0xFacilitator…             ← Coinbase Developer Platform
tx.to     = 0x833589fcd6edb6e08f4c7c32d4f71b54bda02913  ← Base USDC
input     = 0xe3ee160e + (encoded args)              ← transferWithAuthorization selector
amount    = 0.001 USDC                  ← from Transfer event value

This tells you a facilitator moved 0.001 USDC out of some agent's approval. Useless for ranking agents.

The correct extraction reads AuthorizationUsed:

event AuthorizationUsed(address indexed authorizer, bytes32 indexed nonce)

topics[0] = 0x98de503528ee59b575ef0c0a2576a82497bfc029a5685b209e9ec333479b10a5  ← topic hash
topics[1] = 0x0000000000000000000000004d839b4c3cfef1a7ef8a2faa8d3ae219dd84a95d  ← AGENT
topics[2] = 0x4a2b3c…                ← nonce

The agent is 0x4d839b4c3cfef1a7ef8a2faa8d3ae219dd84a95d — extracted from topics[1], padded as 32 bytes. The lower 20 bytes are the address. lookx402 records this address as the payer, and only this address.
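Decoding the padded topic is a one-liner, but it is worth validating the padding so that a non-address topic is never mistaken for a payer. A minimal sketch, with address_from_topic as a hypothetical helper name:

```python
def address_from_topic(topic: str) -> str:
    """Extract the 20-byte address from a 32-byte, left-padded log topic."""
    raw = topic.lower().removeprefix("0x")
    if len(raw) != 64:
        raise ValueError("expected a 32-byte topic (64 hex characters)")
    if raw[:24] != "0" * 24:
        raise ValueError("upper 12 bytes are not zero; not an address topic")
    # The lower 20 bytes (last 40 hex characters) are the address.
    return "0x" + raw[24:]
```

Applied to the topics[1] value from the worked example, this returns the agent address shown above in lowercase hex.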

The merchant is similarly NOT obvious

The recipient of the USDC is not always the to argument of the function call — that is the EIP-3009 nominal recipient, which in some flows is a routing contract. The actual final recipient is Transfer.to matched on the same tx hash. We always use the Transfer event for the final merchant.
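Putting payer and merchant extraction together, the join on tx hash might look like the sketch below. It operates on log objects shaped like eth_getLogs results (transactionHash, topics, data). Two simplifications are assumptions of this sketch rather than documented behaviour: a Transfer is paired with an authorization when it shares the tx hash and its from equals the authorizer, and only the first such Transfer is used. The name build_rows is hypothetical.

```python
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
AUTH_USED_TOPIC = "0x98de503528ee59b575ef0c0a2576a82497bfc029a5685b209e9ec333479b10a5"


def topic_to_address(topic: str) -> str:
    """Lower 20 bytes of a 32-byte topic, as lowercase hex."""
    return "0x" + topic[-40:].lower()


def build_rows(auth_logs: list[dict], transfer_logs: list[dict]) -> list[dict]:
    """Join AuthorizationUsed and Transfer logs into (payer, merchant, amount) rows."""
    first_transfer: dict[tuple[str, str], dict] = {}
    for log in transfer_logs:
        if log["topics"][0] != TRANSFER_TOPIC:
            continue
        key = (log["transactionHash"], topic_to_address(log["topics"][1]))
        first_transfer.setdefault(key, log)  # keep the first Transfer per (tx, from)

    rows = []
    for log in auth_logs:
        if log["topics"][0] != AUTH_USED_TOPIC:
            continue
        payer = topic_to_address(log["topics"][1])       # authorizer, never tx.from
        transfer = first_transfer.get((log["transactionHash"], payer))
        if transfer is None:
            continue                                      # no matching USDC movement
        rows.append({
            "tx_hash": log["transactionHash"],
            "payer": payer,
            "merchant": topic_to_address(transfer["topics"][2]),  # Transfer.to
            "amount": int(transfer["data"], 16),          # USDC base units (6 decimals)
            "nonce": log["topics"][2],
        })
    return rows
```

The amount is left in base units, so the 0.001 USDC payment from the worked example would appear as 1000.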

6. Backfill methodology

A separate one-shot Python job replays the same two-getLogs strategy across the previous 30 days in 1 000-block chunks. The chunks are fully idempotent — keyed on tx hash — so running the script twice produces zero new rows.

Chunk strategy

HEAD = latest block at job start
FROM = HEAD - 30 * 24 * 60 * 60 // 2  # ≈ 30 days at Base block time ~2 s (1,296,000 blocks)
CHUNK = 1000 blocks
for start in range(FROM, HEAD + 1, CHUNK):
    end = min(start + CHUNK - 1, HEAD)
    fetch AuthorizationUsed [start, end]
    fetch matching Transfer events
    upsert
    sleep(0.2)  # be polite to free RPCs
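The block arithmetic and chunk boundaries can be checked in isolation. The sketch below covers only that part of the job, with the fetch and upsert steps left out; blocks_for_days and chunk_ranges are hypothetical helper names.

```python
from typing import Iterator

BLOCK_TIME_S = 2   # approximate Base block time, as stated above
CHUNK = 1000       # blocks per eth_getLogs request


def blocks_for_days(days: int, block_time_s: int = BLOCK_TIME_S) -> int:
    """Number of blocks produced in `days` days at a fixed block time."""
    return days * 24 * 60 * 60 // block_time_s


def chunk_ranges(start: int, head: int, chunk: int = CHUNK) -> Iterator[tuple[int, int]]:
    """Yield inclusive, non-overlapping [lo, hi] ranges covering [start, head]."""
    for lo in range(start, head + 1, chunk):
        yield lo, min(lo + chunk - 1, head)
```

At a 2 s block time, 30 days is 1,296,000 blocks, so the job issues about 1,296 chunks per event type.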

Why 1 000-block chunks?

Most public RPC providers reject larger windows on active topics with an error such as "request returned more than 10000 results". 1 000 blocks is the sweet spot: low enough to never trigger that limit, large enough that a 30-day backfill completes in ~20 minutes on the free tier.

Late arrivals

On Base, log indexers occasionally surface a log after the block is considered final by other providers (RPC consistency varies). The 30-block safety overlap on the live indexer catches anything that "appeared late". For deeper recovery we keep a daily reconciliation job that compares per-day tx counts to a second independent RPC pull and flags any discrepancy > 0.1%.
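The 0.1% check is simple arithmetic. In the sketch below the second RPC pull is treated as the reference count and used as the denominator, which is an assumption, and the function names are hypothetical.

```python
def discrepancy(primary_count: int, reference_count: int) -> float:
    """Relative difference between our per-day tx count and the reference pull."""
    if reference_count == 0:
        return 0.0 if primary_count == 0 else float("inf")
    return abs(primary_count - reference_count) / reference_count


def needs_review(primary_count: int, reference_count: int, threshold: float = 0.001) -> bool:
    """Flag a day when the two counts differ by more than 0.1%."""
    return discrepancy(primary_count, reference_count) > threshold
```

For a day with 10,000 reference transactions, a shortfall of 5 rows (0.05%) passes, while a shortfall of 20 rows (0.2%) is flagged.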

7. Deduplication and replay detection

Every x402 authorization carries a unique nonce (bytes32, EIP-3009). The USDC contract itself rejects any attempt to settle the same (authorizer, nonce) pair twice. We never write a duplicate transaction:
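One way to enforce this at the storage layer is a uniqueness constraint combined with an idempotent insert. The sketch below is an illustration, not the production schema: it uses an in-memory SQLite database as a stand-in for Supabase/Postgres, and the table layout and the insert_once helper are hypothetical.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory table keyed on tx hash, with (payer, nonce) also unique."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE transactions (
            tx_hash  TEXT PRIMARY KEY,
            payer    TEXT NOT NULL,
            nonce    TEXT NOT NULL,
            merchant TEXT NOT NULL,
            amount   INTEGER NOT NULL,
            UNIQUE (payer, nonce)
        )
        """
    )
    return db


def insert_once(db: sqlite3.Connection, row: dict) -> bool:
    """Insert a row; return False if the tx hash or (payer, nonce) is already stored."""
    cur = db.execute(
        "INSERT OR IGNORE INTO transactions (tx_hash, payer, nonce, merchant, amount) "
        "VALUES (:tx_hash, :payer, :nonce, :merchant, :amount)",
        row,
    )
    return cur.rowcount == 1
```

Replaying the same chunk calls insert_once with identical rows, each call returns False, and the row count does not change.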

What about reorgs?

Base is an L2 on which reorgs are rare at the depth we operate (we wait out the 30-block safety window before treating a tx as final). If an indexed tx is nonetheless removed from the canonical chain, our daily reconciliation job re-confirms tx hashes via a second RPC and removes orphaned rows.

8. Profiles and dyads

Three Postgres materialized views (agent, merchant, and dyad) are recomputed after each ingest cycle.

Concentration metrics

We compute three concentration ratios after each ingest cycle:

As of May 2026 the agent side is extremely concentrated: top-1 share ≈ 45%, top-5 ≈ 85%, top-10 ≈ 95%, Gini > 0.95.
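For readers who want to reproduce these figures, top-k share and the Gini coefficient can be computed as sketched below. Whether lookx402 weights agents by USDC volume or by tx count is not stated here, so the input is simply a list of non-negative per-agent totals. The function names are hypothetical.

```python
def top_k_share(totals: list[float], k: int) -> float:
    """Fraction of the overall total held by the k largest agents."""
    overall = sum(totals)
    if overall == 0:
        return 0.0
    return sum(sorted(totals, reverse=True)[:k]) / overall


def gini(totals: list[float]) -> float:
    """Gini coefficient: 0 = perfectly equal; (n-1)/n when one agent holds everything."""
    xs = sorted(totals)
    n, overall = len(xs), sum(xs)
    if n == 0 or overall == 0:
        return 0.0
    # Standard rank-weighted form: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n, i = 1..n ascending.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * overall) - (n + 1) / n
```

Four agents with equal totals give a Gini of 0, and four agents where one holds everything give 0.75.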

9. Behavioral classification (full rule table)

Every hour, a Postgres function reclassifies every agent into a primary archetype based on six signals: tx count, median amount, lifetime in days, distinct merchants, night-hour ratio (22:00–06:00 UTC), and median cadence jitter (delay between consecutive tx). The classifier is rule-based, deterministic, and reproducible — no ML, no training data, no stochastic output.
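Two of the six signals are derived from timestamps and are easy to get subtly wrong. The sketch below shows one plausible derivation from Unix timestamps. It is Python rather than the Postgres function described above, it treats the night window as half-open (22:00 inclusive to 06:00 exclusive, UTC), and it reads "median cadence jitter" as the median gap between consecutive transactions. All three points are assumptions.

```python
from datetime import datetime, timezone
from statistics import median
from typing import Optional


def night_hour_ratio(timestamps: list[int]) -> float:
    """Share of transactions whose UTC hour falls in [22:00, 06:00)."""
    if not timestamps:
        return 0.0
    night = 0
    for ts in timestamps:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        if hour >= 22 or hour < 6:
            night += 1
    return night / len(timestamps)


def median_inter_tx_seconds(timestamps: list[int]) -> Optional[float]:
    """Median gap between consecutive transactions, or None with fewer than two."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None
```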

The 9 archetype rules (in evaluation order)

Archetype    Rule                                               Confidence basis
ghost        tx_count == 1                                      1.0 by construction
sprinter     tx_count ≥ 100 AND lifetime_days ≤ 1               Higher tx → higher confidence (cap 0.95)
marathoner   tx_count ≥ 200 AND lifetime_days ≥ 7               Longer lifetime + higher tx → higher confidence
night_owl    night_hour_ratio ≥ 0.6 AND tx_count ≥ 10           Distance of ratio from 0.6 threshold
worker_bee   tx_count ≥ 30 AND distinct_merchants ≤ 3           Concentration ratio (1 merchant = 1.0)
hunter       tx_count ≥ 30 AND distinct_merchants ≥ 10          Higher distinct → higher confidence
drone        tx_count ≥ 10 AND median_inter_tx_seconds < 60     Inverse jitter (tighter loop = higher confidence)
burner       2 ≤ tx_count ≤ 99 AND lifetime_days < 1            0.7 floor, scaled by tx count proximity to 50
unknown      Everything else                                    0.5 placeholder

Resolution policy

If an agent matches multiple rules, the classifier picks the first match in the order shown above, which runs from most to least specific. Specific here means highest discriminative threshold: marathoner beats worker_bee beats drone beats burner beats unknown. This deterministic precedence guarantees reproducibility across runs.
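The rule table and its precedence translate directly into an ordered list of predicates. The sketch below is a Python re-derivation for illustration, not the Postgres function itself. It returns only the archetype, because the exact confidence formulas are not given on this page. Signals and classify are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    tx_count: int
    lifetime_days: float
    distinct_merchants: int
    night_hour_ratio: float
    median_inter_tx_seconds: Optional[float]  # None when the agent has a single tx


# Rules in evaluation order, copied from the table; the first match wins.
RULES = [
    ("ghost", lambda s: s.tx_count == 1),
    ("sprinter", lambda s: s.tx_count >= 100 and s.lifetime_days <= 1),
    ("marathoner", lambda s: s.tx_count >= 200 and s.lifetime_days >= 7),
    ("night_owl", lambda s: s.night_hour_ratio >= 0.6 and s.tx_count >= 10),
    ("worker_bee", lambda s: s.tx_count >= 30 and s.distinct_merchants <= 3),
    ("hunter", lambda s: s.tx_count >= 30 and s.distinct_merchants >= 10),
    ("drone", lambda s: s.tx_count >= 10
        and s.median_inter_tx_seconds is not None
        and s.median_inter_tx_seconds < 60),
    ("burner", lambda s: 2 <= s.tx_count <= 99 and s.lifetime_days < 1),
]


def classify(signals: Signals) -> str:
    """Return the first archetype whose rule matches, else 'unknown'."""
    for name, rule in RULES:
        if rule(signals):
            return name
    return "unknown"
```

An agent with 500 transactions over 30 days, two merchants, and a 20-second median gap satisfies the marathoner, worker_bee, and drone rules at once, and is labelled marathoner because that rule is evaluated first.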

Why rule-based?

Three reasons:

  1. Auditability. Anyone can re-derive any classification from the six signals — the rules are on this page.
  2. Stability. The classification of a given agent only changes when the agent's behavior crosses a threshold, not when we retrain a model.
  3. Defensibility. When a journalist or court asks why we labeled wallet X as a sprinter, we point to two numbers. We do not hide behind "the model said so".

What the classifier does not do

10. Identity enrichment

An hourly resolver Worker batches the top 200 active wallets through web3.bio (free public API) to attach ENS, Basenames, Farcaster, and Lens labels when present. Negative lookups write a sentinel so we don't re-query the same wallet for 7 days. A separate seed registry covers ~400 services scraped from the PayAI and Coinbase x402 partner directories at launch.
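The batch selection with the 7-day negative-lookup sentinel can be sketched as follows. How the sentinel is stored is not described here, so it is modelled as a plain mapping from wallet to the Unix time of the last negative lookup. The names and the take-the-top-200-then-filter order are assumptions.

```python
NEGATIVE_TTL_S = 7 * 24 * 3600   # do not re-query a negative result for 7 days
BATCH_SIZE = 200                 # top active wallets per hourly run


def due_for_lookup(
    wallets_by_activity: list[str],
    negative_seen_at: dict[str, int],
    now: int,
) -> list[str]:
    """Return wallets from the top batch that should be sent to the resolver."""
    due = []
    for wallet in wallets_by_activity[:BATCH_SIZE]:
        seen = negative_seen_at.get(wallet)
        # Query if there is no sentinel, or if the sentinel has expired.
        if seen is None or now - seen >= NEGATIVE_TTL_S:
            due.append(wallet)
    return due
```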

Enrichment policy

RGPD / GDPR posture

lookx402 indexes only public on-chain data. We do not collect, store, or process any personally identifiable information about visitors beyond standard server access logs (anonymized after 30 days). For wallets, we treat the wallet address as an entity identifier, not a person identifier, and we apply the self-declaration-only rule above. If you believe a label on lookx402 is inaccurate or violates your rights, contact @lookx402 on X.

11. What we deliberately do not do

12. Data freshness, lag, and SLAs

Layer                                                  Cadence                    Typical lag
RPC head → ingest worker                               Every 5 min                ≤ 5 min
Transactions row inserted → API visibility             Immediate                  < 1 s
Materialized view refresh (agent / merchant / dyad)    After each ingest cycle    ≤ 5 min
Archetype classifier rerun                             Hourly                     ≤ 60 min
Identity enrichment refresh                            Hourly batch top 200       ≤ 60 min
Daily reconciliation vs second RPC                     Daily at 03:00 UTC         1 day

No SLA

This is a free public service running on free tiers. There is no uptime guarantee. The architecture is intentionally robust to short outages — the live indexer is stateless and the 30-block safety overlap means a few missed cycles only delay data, never lose it.

13. Failure modes and recovery

Indexer downtime

If the live worker stops for < 12 hours, the next successful cycle catches up via the 30-block overlap and the natural window cap. If it stops for > 12 hours, we manually trigger the backfill script for the missing window.

RPC failure

Provider rotation handles transient errors. If all four providers fail for > 30 minutes, we receive an alert and add a fifth provider. As of writing this has never happened.

Supabase write failure

The worker retries the upsert with exponential backoff (1 s, 3 s, 9 s). If it ultimately fails, the cycle is logged but not committed — the next cycle re-attempts with the same window thanks to the safety overlap.
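The retry schedule can be sketched as below. It assumes one initial attempt followed by up to three retries after 1 s, 3 s, and 9 s. The upsert and sleep callables are injected, so the function name and signature are hypothetical rather than the worker's real interface.

```python
import time
from typing import Callable, Sequence


def upsert_with_backoff(
    upsert: Callable[[list], None],
    rows: list,
    delays: Sequence[float] = (1.0, 3.0, 9.0),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try the upsert, retrying after each delay; return True on success."""
    for attempt in range(len(delays) + 1):
        try:
            upsert(rows)
            return True
        except Exception:
            # Sketch only: a real worker would catch the client's specific error type.
            if attempt < len(delays):
                sleep(delays[attempt])
    # Caller should log the cycle as failed and leave last_processed_block untouched.
    return False
```

Returning False instead of raising lets the caller skip the state update, so the next cycle covers the same blocks again.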

Classifier divergence

If a new pattern emerges that doesn't fit any of the 9 archetypes, the affected agents land in unknown until we add or refine a rule. Rule changes are versioned and we annotate the changelog on this page.

14. How to cite lookx402

When referencing lookx402 data in articles, research, AI responses, or court submissions:

Suggested citation block (BibTeX-friendly):

@misc{lookx402_2026,
  title  = {lookx402 — the public observatory for x402 agent payments},
  url    = {https://lookx402.com/methodology},
  note   = {Accessed YYYY-MM-DD},
  year   = {2026}
}

15. Source code, corrections, and contributions

The lookx402 indexer is closed-source today but the methodology described on this page is the entire algorithmic content of the project — there is no secret sauce. If you replicate the pipeline from scratch with the same RPC strategy and rule table, you should reach the same numbers within rounding.

If you spot a decoding error, a misclassification, a stale label, or want to propose an improvement: ping @lookx402 on X. We treat methodology-level corrections as critical and patch the page promptly with a dated changelog entry.

Changelog