Ratings by agents, for agents. Here's how the sausage is made.
MPPrimo automatically discovers, tests, and rates every service in the MPP (Machine Payments Protocol) ecosystem. We make real paid requests using USDC on the Tempo network — the same way an AI agent would use these services in production. No synthetic benchmarks, no self-reported metrics.
We monitor the official MPP service directory for new registrations. When a new service appears, we verify its endpoint is reachable and MPP-enabled before adding it to our testing pipeline.
For each service, we generate a suite of test cases tailored to its API. Tests cover typical usage, edge cases, and adversarial inputs. Test cases are regenerated periodically to prevent gaming and keep evaluations fresh. Each test includes the correct API path and a realistic request body.
Tests are executed as real MPP transactions — we pay each service with USDC via the Tempo network, exactly as an agent would in production. Request timing is randomized to prevent services from detecting and optimizing for our test traffic. We record the full response, latency, HTTP status, and payment cost for every request.
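As a rough illustration of randomized request timing, the sketch below draws each send time uniformly at random inside a testing window, so the gaps between requests follow no fixed interval. The function name `jittered_offsets`, the one-hour window, and the uniform distribution are all assumptions made for this example, not a description of MPPrimo's actual scheduler.

```python
import random

def jittered_offsets(n_requests, window_seconds, rng=None):
    """Return n_requests send times, in seconds from the window start, sorted ascending.

    Hypothetical helper: each offset is drawn uniformly from the window,
    so consecutive requests are not evenly spaced.
    """
    rng = rng or random.Random()
    return sorted(rng.uniform(0, window_seconds) for _ in range(n_requests))

# Example: spread 5 test requests across a one-hour window (assumed values).
offsets = jittered_offsets(5, window_seconds=3600, rng=random.Random(42))
```

Passing a seeded `random.Random` makes a schedule reproducible for debugging, while the default unseeded generator keeps production timing unpredictable.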
Each response is evaluated for quality on a scale from 0 to 1. We assess whether the service returned useful, accurate, and complete output for the given input. Responses that return errors, empty content, or irrelevant data score low. Services that fail to process payment are excluded from scoring entirely.
All scores are out of 100.
A service only receives a public rating if at least one test returns a successful response. Services where every test fails (payment errors, endpoint not found, etc.) are listed as “not yet tested” — we never publish a score based on failed test data.
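One way the scoring and publication rules could fit together is sketched below. The exact aggregation is not stated here, so this example makes several assumptions: per-test quality values between 0 and 1 are averaged and scaled to 100, tests whose payment failed are dropped before averaging, and the names `ProbeResult` and `public_score` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ProbeResult:
    payment_ok: bool   # False if the payment itself failed
    succeeded: bool    # True if the service returned a successful response
    quality: float     # evaluator's quality score between 0 and 1

def public_score(results: List[ProbeResult]) -> Optional[float]:
    """Return a score out of 100, or None when no rating should be published."""
    # Assumption: tests with failed payments do not count toward the score.
    paid = [r for r in results if r.payment_ok]
    # Publication rule: at least one test must return a successful response.
    if not any(r.succeeded for r in paid):
        return None
    # Assumption: the aggregate is a plain mean of per-test quality.
    mean_quality = sum(r.quality for r in paid) / len(paid)
    return round(mean_quality * 100, 1)

results = [
    ProbeResult(payment_ok=True, succeeded=True, quality=0.9),
    ProbeResult(payment_ok=True, succeeded=False, quality=0.0),
    ProbeResult(payment_ok=False, succeeded=False, quality=0.0),
]
score = public_score(results)
```

Returning `None` instead of 0 keeps "not yet tested" distinct from a genuinely poor score, which matches the rule that failed test data never produces a published rating.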
MPPrimo is not affiliated with Stripe, Tempo, or any service we rate. We don't accept payment from services for better ratings. Our testing infrastructure and methodology are the same for every service on the network.
We check for new services every 6 hours and test each one as soon as it is discovered. Existing services are retested weekly. Scores reflect the most recent test run.
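The cadence above can be sketched as a small scheduling check. This is only an illustration under assumptions: the names `services_due` and `next_discovery` are hypothetical, and treating a never-tested service as immediately due is one reading of "tested immediately upon discovery".

```python
from datetime import datetime, timedelta, timezone

DISCOVERY_INTERVAL = timedelta(hours=6)  # how often the directory is scanned
RETEST_INTERVAL = timedelta(weeks=1)     # how often existing services are retested

def next_discovery(last_scan):
    """Return the time of the next directory scan."""
    return last_scan + DISCOVERY_INTERVAL

def services_due(last_tested, now):
    """Return the sorted ids of services that should be tested now.

    last_tested maps a service id to the datetime of its last run,
    or to None if the service was just discovered and never tested.
    """
    due = []
    for service_id, tested_at in last_tested.items():
        if tested_at is None or now - tested_at >= RETEST_INTERVAL:
            due.append(service_id)
    return sorted(due)

now = datetime(2025, 1, 8, tzinfo=timezone.utc)
last_tested = {
    "new-service": None,                       # just discovered
    "stale-service": now - timedelta(days=8),  # past the weekly window
    "fresh-service": now - timedelta(days=1),  # tested recently
}
due = services_due(last_tested, now)
```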