Every price movement, every trade, every market resolution on Polymarket is permanently inscribed on the Polygon blockchain. That complete record — stretching back to the platform’s 2020 launch — is not locked behind a paywall or restricted to institutional clients. It is open, queryable, and waiting to be turned into trading edge by anyone willing to learn the toolchain. This guide walks through five concrete methods to access Polymarket’s historical data and shows what useful analysis actually looks like in practice.
If you are new to the underlying data architecture, start with the Polymarket API and on-chain data guide for a foundation in how the system is structured. This article focuses specifically on historical data: how far back it goes, how to pull it, how to clean it, and how to extract signal from the noise.
Why Historical Data Matters
Historical prediction market data serves three distinct purposes, each with real profit implications.
Backtesting. Before committing capital to any systematic trading strategy, you want to know whether it would have worked on past data. A strategy that identifies markets priced below 20% that subsequently resolve YES needs at minimum 500 historical examples to distinguish genuine edge from variance. Polymarket’s history, now spanning over five years and millions of trades, is large enough to run meaningful backtests on most strategy types.
Calibration analysis. A well-calibrated prediction market should resolve its 70% markets YES roughly 70% of the time. If Polymarket markets in a specific category — say, crypto price targets — consistently overestimate the probability of the bullish outcome, that systematic bias represents exploitable edge. Historical data lets you measure this directly rather than speculate about it. Our analysis of whether Polymarket is accurate covers the broader calibration question; historical data is how you go deeper.
Pattern recognition. Time-of-day volume patterns, resolution-week price compression, category-level accuracy divergences — these are all empirical questions answerable only with historical data. The traders consistently outperforming the market are not guessing at these patterns; they are measuring them.
Method 1: Polymarket REST API — The /prices-history Endpoint
The fastest route to historical price data for any individual market is the Polymarket CLOB API’s /prices-history endpoint. No API key is required. The base URL is https://clob.polymarket.com.
The key parameters are:
- `market_slug`: the URL slug of the market (e.g., `will-btc-reach-100k-by-dec-2025`)
- `fidelity`: data-point granularity in minutes: `1` (minute), `60` (hourly), `1440` (daily)
- `startTs` and `endTs`: Unix timestamps bounding the query window
A typical response returns an array of objects, each with a Unix timestamp and a price expressed as a decimal between 0 and 1 (representing 0% to 100% probability). This is effectively the “close” price equivalent for prediction markets — there is no native OHLCV, but at minute-level granularity you can construct approximate open/high/low/close values from consecutive data points.
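As a sketch of that construction: the helper below groups raw `{"t": …, "p": …}` points (the field names match the response shape described above) into fixed-width OHLC bars. Everything beyond those two field names is illustrative.

```python
from collections import OrderedDict

def to_ohlc(points, bucket_seconds=3600):
    """Group {"t": unix_ts, "p": price} points into OHLC bars.

    Input is the list returned by /prices-history; output maps each
    bucket's start timestamp to its open/high/low/close prices.
    """
    bars = OrderedDict()
    for pt in sorted(points, key=lambda x: x["t"]):
        bucket = pt["t"] - pt["t"] % bucket_seconds
        p = float(pt["p"])
        if bucket not in bars:
            bars[bucket] = {"open": p, "high": p, "low": p, "close": p}
        else:
            bar = bars[bucket]
            bar["high"] = max(bar["high"], p)
            bar["low"] = min(bar["low"], p)
            bar["close"] = p  # last point in the bucket wins
    return bars
```

With minute-level fidelity this yields hourly bars; pass `bucket_seconds=86400` for daily bars instead.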
Here is a minimal Python example to pull daily price history for a market and output it as a CSV:
```python
import csv

import requests

CLOB_BASE = "https://clob.polymarket.com"
MARKET_SLUG = "will-the-fed-cut-rates-in-march-2026"

def fetch_price_history(slug, fidelity=1440):
    """Fetch the full price series for one market at the given granularity."""
    url = f"{CLOB_BASE}/prices-history"
    params = {
        "market_slug": slug,
        "fidelity": fidelity,
    }
    r = requests.get(url, params=params, timeout=30)
    r.raise_for_status()
    return r.json().get("history", [])

history = fetch_price_history(MARKET_SLUG)
with open("price_history.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "price"])
    for point in history:
        writer.writerow([point["t"], point["p"]])
print(f"Exported {len(history)} data points.")
```
For bulk historical research across many markets, combine this with the /markets endpoint (on the Gamma API at https://gamma-api.polymarket.com) to retrieve a list of all resolved markets, then loop through them pulling price history. Expect rate limits to kick in above roughly 300 requests per minute — add a short sleep between calls to stay well under this threshold.
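A minimal sketch of that loop is below. The Gamma parameter names (`closed`, `limit`, `offset`) and the `slug` field in each market object are assumptions based on common usage; verify them against the current Gamma API docs before relying on them.

```python
import time

import requests

GAMMA_BASE = "https://gamma-api.polymarket.com"
CLOB_BASE = "https://clob.polymarket.com"

def page_params(page, limit=100):
    """Build query params for one page of resolved markets (assumed names)."""
    return {"closed": "true", "limit": limit, "offset": page * limit}

def fetch_resolved_slugs(max_pages=5, limit=100):
    """Page through Gamma /markets, collecting slugs of resolved markets."""
    slugs = []
    for page in range(max_pages):
        r = requests.get(f"{GAMMA_BASE}/markets",
                         params=page_params(page, limit), timeout=30)
        r.raise_for_status()
        batch = r.json()
        if not batch:
            break  # ran out of markets before max_pages
        slugs.extend(m["slug"] for m in batch if "slug" in m)
    return slugs

def bulk_price_history(slugs, pause=0.5):
    """Pull daily history per slug, sleeping between calls to respect limits."""
    out = {}
    for slug in slugs:
        r = requests.get(f"{CLOB_BASE}/prices-history",
                         params={"market_slug": slug, "fidelity": 1440},
                         timeout=30)
        if r.ok:
            out[slug] = r.json().get("history", [])
        time.sleep(pause)  # 0.5 s gap is roughly 120 req/min, well under ~300
    return out
```

The half-second pause keeps the loop at roughly 120 requests per minute, comfortably below the threshold mentioned above.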
Method 2: The Graph Subgraph — GraphQL for Historical Trades
The Graph Protocol is a decentralised indexing layer for blockchain data. Polymarket has a public subgraph deployed on The Graph that indexes all contract events — including every trade — and makes them queryable via GraphQL. Unlike the REST API, The Graph gives you wallet-level trade history, market creation events, and resolution data in a single query interface.
The Polymarket subgraph endpoint (search for “Polymarket” on thegraph.com/explorer) accepts standard GraphQL queries. A query to retrieve the 1,000 most recent trades across all markets looks like this:
```graphql
query RecentTrades {
  trades(
    first: 1000
    orderBy: timestamp
    orderDirection: desc
  ) {
    id
    timestamp
    maker { id }
    taker { id }
    outcome
    price
    amount
    market { id question }
  }
}
```
For historical research, the most useful query pattern is filtering by market ID and ordering by timestamp to reconstruct the complete trade sequence for any resolved market. This gives you individual trade records with wallet addresses, prices, and sizes — the raw material for calibration studies and strategy backtests. The Graph’s skip/first pagination lets you page through the entire trade history of a market without hitting API limits.
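A sketch of that pagination pattern in Python follows. The subgraph URL is a hypothetical placeholder (find the live endpoint on thegraph.com/explorer), and the entity and field names follow the query shown above; the deployed schema may differ.

```python
import requests

# Hypothetical placeholder -- locate the live Polymarket subgraph endpoint
SUBGRAPH_URL = "https://api.thegraph.com/subgraphs/name/..."

TRADES_QUERY = """
query MarketTrades($market: String!, $first: Int!, $skip: Int!) {
  trades(
    where: { market: $market }
    orderBy: timestamp
    orderDirection: asc
    first: $first
    skip: $skip
  ) { id timestamp price amount }
}
"""

def fetch_all_trades(market_id, page_size=1000, url=SUBGRAPH_URL):
    """Page through a market's trade history with first/skip pagination."""
    trades, skip = [], 0
    while True:
        r = requests.post(url, json={
            "query": TRADES_QUERY,
            "variables": {"market": market_id, "first": page_size, "skip": skip},
        }, timeout=30)
        r.raise_for_status()
        batch = r.json()["data"]["trades"]
        trades.extend(batch)
        if len(batch) < page_size:
            break  # short page means we have reached the end
        skip += page_size
    return trades
```

One caveat: many Graph endpoints cap `skip` (commonly around 5,000), so for very deep trade histories a cursor on `timestamp` or `id` (e.g., a `timestamp_gt` filter) is more robust than skip-based paging.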
Cross-referencing The Graph data with the REST API price history creates a complete picture: continuous price series from the REST API, individual trade records from The Graph. For whale-tracking workflows that extend this data to wallet performance analysis, see the Polymarket whale tracking guide.
Method 3: Dune Analytics — SQL Over the Entire History
For analysts comfortable with SQL but not Python or GraphQL, Dune Analytics provides the most powerful and accessible interface for Polymarket historical research. Dune indexes the Polygon blockchain and exposes it as a queryable database with decoded contract events, making it possible to run arbitrary SQL queries across Polymarket’s entire history without setting up any local infrastructure.
Several community-built Dune dashboards aggregate Polymarket data into ready-to-use visualisations:
- Total volume by month and category going back to 2020
- Top traders by cumulative P&L over rolling windows
- Market resolution accuracy broken down by category
- USDC deposit and withdrawal flows at the protocol level
- New wallet creation rates and user retention cohorts
For custom analysis, the relevant Dune tables are in the polymarket_polygon schema (the exact schema names may differ — search “polymarket” in Dune’s table browser to find current names). A calibration query that checks whether 70% markets resolve YES 70% of the time might look like:
```sql
SELECT
  ROUND(price_at_t_minus_24h, 1) AS price_bucket,
  COUNT(*) AS num_markets,
  AVG(CASE WHEN resolved_outcome = 'YES' THEN 1.0 ELSE 0.0 END) AS actual_yes_rate
FROM polymarket_resolved_markets
WHERE price_at_t_minus_24h BETWEEN 0.05 AND 0.95
GROUP BY 1
ORDER BY 1;
```
Dune queries can be forked from existing public dashboards — start with popular Polymarket dashboards and modify them for your specific research question. Results can be exported as CSV or visualised directly in Dune’s charting interface.
Method 4: Polygon Blockchain Direct — Event Logs via PolygonScan
For the most granular historical access — or when you need data that predates The Graph’s indexing — you can query the Polygon blockchain directly. PolygonScan provides a free API that returns transaction history and event logs for any smart contract address.
The Polymarket conditional token framework (CTF) contract and the CLOB exchange contract are the two primary contracts to query. Their addresses are documented in the Polymarket GitHub repositories. Using PolygonScan’s getLogs API endpoint with the appropriate event signature filters, you can retrieve every trade event emitted since the contracts were deployed.
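A minimal sketch of a getLogs call is below, using the Etherscan-style parameters PolygonScan exposes. The contract address and `topic0` event-signature hash are placeholders you must look up in the Polymarket repositories; the `status`-field error convention is how Etherscan-family APIs typically signal empty or failed queries.

```python
import requests

POLYGONSCAN_API = "https://api.polygonscan.com/api"

def getlogs_params(address, topic0, from_block, to_block, api_key):
    """Build the query parameters for PolygonScan's getLogs action."""
    return {
        "module": "logs",
        "action": "getLogs",
        "address": address,      # e.g., the CLOB exchange contract (look up)
        "topic0": topic0,        # keccak256 hash of the event signature
        "fromBlock": from_block,
        "toBlock": to_block,
        "apikey": api_key,
    }

def fetch_logs(address, topic0, from_block, to_block, api_key):
    """Fetch raw event logs; returns [] when PolygonScan reports no results."""
    r = requests.get(
        POLYGONSCAN_API,
        params=getlogs_params(address, topic0, from_block, to_block, api_key),
        timeout=30,
    )
    r.raise_for_status()
    payload = r.json()
    # Etherscan-family APIs signal errors and empty result sets with status "0"
    if payload.get("status") != "1":
        return []
    return payload["result"]
```

Each returned log carries ABI-encoded `data` and `topics` fields that still need decoding, as the next paragraph explains.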
This approach is more involved than the other methods — you will need to decode ABI-encoded event data to interpret the raw logs — but it is the only method that is guaranteed to have complete data with no gaps, since it reads directly from the chain rather than through an indexing layer. It is also the best approach for verifying that data from other sources is accurate.
The PolygonScan API’s free tier allows 5 requests per second, which is sufficient for most research but slow for bulk historical pulls. For large-scale data collection, consider running a local Polygon archive node — though this requires significant storage (several terabytes for a full archive).
Method 5: Third-Party Datasets — Kaggle, GitHub, and Community Sources
Several researchers have already done the heavy lifting of pulling and cleaning Polymarket’s historical data and shared it publicly. Searching Kaggle for “Polymarket” surfaces datasets with hundreds of thousands of resolved markets, pre-cleaned into CSV format with price histories, resolution outcomes, volumes, and market metadata.
GitHub repositories from the prediction market research community offer similar resources, often with accompanying analysis notebooks in Python or R. These pre-built datasets are particularly useful for:
- Getting started quickly without building a data pipeline
- Cross-validating results from your own data pulls
- Historical data from the platform’s early days (2020–2021), where API coverage can be sparse
- Benchmark datasets for comparing trading strategies
The main caveat with third-party datasets is staleness — a dataset last updated in 2024 misses all of 2025’s market activity. Use them as a historical baseline and supplement with fresh API pulls for recent data.
What Data Is Actually Available
Understanding the shape of available data sets realistic expectations for what analysis is possible. Polymarket’s historical record includes:
Price series. Continuous probability estimates at up to minute-level granularity for the duration of each market. This is the equivalent of OHLCV data in traditional markets — you can derive open, high, low, and close probabilities over any time window.
Trade records. Individual trades with taker and maker wallet addresses, trade price, trade size in USDC, and timestamp. This is the raw order flow data that powers calibration studies and smart-money tracking.
Market metadata. Question text, category, creation timestamp, resolution timestamp, resolution outcome (YES/NO), total volume, liquidity at various points, and market creator.
Wallet-level data. Complete trade history for any wallet address, allowing you to reconstruct portfolio performance, position sizing patterns, and market selection behaviour for any historical participant. The Polymarket charts guide covers how to visualise some of this data directly in the platform interface.
What is not natively available: order book snapshots at specific historical moments (the order book is not stored on-chain; only executed trades are), true tick data below minute granularity from the REST API, and pre-2020 data (the platform did not exist).
Data Access Methods Compared
Each method for accessing Polymarket historical data involves different trade-offs between ease of use, data completeness, and setup time. The table below summarises the key differences to help you choose the right tool for your research goal.
| Method | Interface | Granularity | Wallet-Level Data | Skill Required | Best For |
|---|---|---|---|---|---|
| Polymarket REST API | HTTP / JSON | Minute-level prices | Yes (trades endpoint) | Beginner–Intermediate | Price series, recent trade flow |
| The Graph Protocol | GraphQL | Individual trades | Yes (full history) | Intermediate | Per-wallet trade reconstruction |
| Dune Analytics | SQL | Daily / custom | Yes (aggregated) | Beginner (SQL) | Calibration studies, dashboards |
| Polygon Direct (PolygonScan) | Event logs / ABI | Tick-level (raw) | Yes (complete) | Advanced | Verification, full archive |
| Community Datasets (Kaggle / GitHub) | CSV / Parquet | Daily or trade-level | Varies by dataset | Beginner | Quick-start backtesting |
For most traders, the REST API covers 80% of use cases. Dune Analytics fills the gap for SQL-native researchers who want historical aggregations without writing Python. Direct Polygon queries are worth learning only if data integrity verification or very early (2020–2021) market coverage is critical to your research.
Practical Analysis Examples
Calibration Analysis
Take all Polymarket markets that resolved between January 2022 and December 2025. For each market, record the price 24 hours before resolution. Bin markets into 10-percentage-point buckets (0–10%, 10–20%, ..., 90–100%). Then calculate the actual YES resolution rate within each bucket.
A perfectly calibrated market produces a diagonal line: 10% markets resolve YES 10% of the time, 50% markets resolve YES 50% of the time, and so on. In practice, Polymarket shows slight overconfidence at the extremes — markets priced above 85% resolve YES slightly less often than 85%, and markets priced below 15% resolve YES slightly more often than 15%. This compression toward the mean at extremes is a documented pattern in prediction markets and a potential source of edge for contrarian traders.
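The binning step can be sketched as a small helper, assuming you have already assembled (price 24 hours before resolution, resolved-YES flag) pairs from one of the data sources above; the function and bucket layout are illustrative, not part of any Polymarket API.

```python
from collections import defaultdict

def calibration_table(markets, bucket_width=0.10):
    """Bucket markets by pre-resolution price vs. their actual YES rate.

    `markets` is a list of (price_24h_before, resolved_yes) pairs such as
    (0.72, True). Returns {bucket_lower_bound: (count, actual_yes_rate)}.
    """
    buckets = defaultdict(lambda: [0, 0])  # bucket -> [n_markets, n_yes]
    for price, resolved_yes in markets:
        b = min(int(price / bucket_width) * bucket_width, 1 - bucket_width)
        b = round(b, 2)  # e.g., 0.72 falls in the 0.7 (70-80%) bucket
        buckets[b][0] += 1
        buckets[b][1] += int(resolved_yes)
    return {b: (n, n_yes / n) for b, (n, n_yes) in sorted(buckets.items())}
```

Plotting each bucket's lower bound against its actual YES rate gives the calibration diagonal described above; deviations from that diagonal are the candidate edges.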
Category Performance Comparison
Run the same calibration analysis separately for each of Polymarket’s major categories: politics, crypto, sports, science, and economics. Historical data reveals that category-level calibration varies significantly. Crypto price markets have historically shown stronger bullish bias than the eventual outcomes justify — likely reflecting the speculative psychology of the participants who trade those markets. Sports markets, by contrast, tend toward good calibration given the large community of sports bettors who bring external reference odds into the platform. For a practical overview of which categories offer the best trading opportunities, see the guide to the best Polymarket markets.
Time-of-Day Volume Patterns
Aggregate trade volume by hour of day (UTC) across all markets over a six-month window. The result is a clear daily rhythm: volume concentrates heavily during US East Coast trading hours (13:00–21:00 UTC), with a secondary spike during European morning hours (07:00–10:00 UTC). Overnight Asian hours show substantially thinner liquidity — which means wider bid-ask spreads and potentially larger mispricings for well-researched traders willing to act when the market is less efficient. For automated execution during off-peak hours, see the Polymarket trading bot guide.
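The aggregation itself is a one-pass group-by; the sketch below assumes you have reduced trade records to (Unix timestamp, USDC size) pairs from any of the sources above.

```python
from collections import Counter
from datetime import datetime, timezone

def volume_by_hour(trades):
    """Sum USDC trade volume per UTC hour of day.

    `trades` is an iterable of (unix_timestamp, size_usdc) pairs; the
    result maps hour-of-day (0-23, UTC) to total volume in that hour.
    """
    hourly = Counter()
    for ts, size in trades:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        hourly[hour] += size
    return dict(sorted(hourly.items()))
```

Running this over a six-month trade window and charting the 24 resulting values surfaces the US- and Europe-hours concentration described above.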
Backtesting a Simple Strategy
To illustrate the backtesting process concretely, consider a mean-reversion strategy: buy any market that drops more than 20 percentage points in a single 24-hour period, hold until resolution, and sell if the position recovers to within 5 points of the pre-drop price.
Using three years of historical data:
- Filter the price history dataset for markets showing a single-day drop of 20+ points
- Record the entry price (close of the drop day) and the resolution outcome
- Calculate P&L per trade: (resolution price − entry price) for YES positions
- Aggregate across all trades to get total return, win rate, and Sharpe ratio
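The steps above can be sketched as a simplified backtest. This version enters at the close of the first qualifying drop day and holds to resolution, omitting the early-exit rule for brevity; the input shape (daily price list plus a 0/1 resolution price) is an assumption about how you have prepared the data.

```python
def backtest_dip_buys(markets, drop_threshold=0.20):
    """Backtest buying YES after large one-day drops, holding to resolution.

    `markets` is a list of dicts with a daily `prices` list (floats in
    [0, 1]) and a `resolution` price (1.0 for YES, 0.0 for NO).
    """
    trades = []
    for m in markets:
        prices = m["prices"]
        for i in range(1, len(prices)):
            if prices[i - 1] - prices[i] >= drop_threshold:
                entry = prices[i]  # close of the drop day
                trades.append(m["resolution"] - entry)
                break  # one entry per market
    n = len(trades)
    wins = sum(1 for pnl in trades if pnl > 0)
    return {
        "trades": n,
        "total_pnl": sum(trades),
        "win_rate": wins / n if n else 0.0,
    }
```

Extending this with the 5-point recovery exit, per-trade sizing, and a Sharpe calculation over daily marks follows the same structure; the key design point is that entries and exits are derived only from data available at the time of the trade.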
A backtest of this specific strategy on Polymarket historical data shows a positive expected value — large single-day drops are frequently caused by news events that the market overreacts to, and mean reversion is common. However, the variance is high and there are periods of significant drawdown. This illustrates both the value of backtesting (confirming positive expected value exists) and its limitation (historical performance does not guarantee future results, especially as more traders run similar strategies).
Data Quality Issues to Know About
Sparse early data. Polymarket’s first year (mid-2020 through 2021) had substantially lower volume and fewer markets than today. Statistical conclusions drawn from this period can be unstable given small sample sizes. For calibration analysis, 2022 onwards is a more reliable dataset.
Market splits and token migrations. When Polymarket migrates to a new smart contract version, historical markets on the old contract may not be automatically included in newer API endpoints. The Graph and direct Polygon queries are more reliable for cross-version historical coverage than the REST API alone.
Resolution disputes. A small percentage of markets — roughly 1–2% historically — are resolved via UMA’s dispute mechanism rather than the primary resolution source. These markets often show unusual price patterns in the days before resolution (erratic moves, wide spreads) that can distort analysis. Filtering them out of backtests typically produces cleaner results.
Liquidity and volume filters. Including very low-volume markets (under $500 total volume) in analysis tends to add noise rather than signal. These markets are often created by individual users testing the platform and their prices may reflect a single trader’s opinion rather than genuine crowd wisdom. Apply a minimum volume filter when building datasets for strategy research. More on navigating these nuances in our guide to Polymarket liquidity.
Frequently Asked Questions
Can I download all Polymarket historical data?
There is no official bulk download option from Polymarket itself. However, you can effectively download all historical data by paginating through the REST API or querying Dune Analytics with a SQL export. For a complete archive, querying directly via the Polygon blockchain (using PolygonScan or a node provider like Alchemy) gives you the most comprehensive dataset. Pre-built community datasets on Kaggle and GitHub cover much of the history for researchers who want a faster start.
Is there a CSV export button on Polymarket?
No. Polymarket does not provide a one-click CSV export from the platform interface. Exports require using the API, Dune Analytics, or The Graph as described in this guide. Dune Analytics is the most accessible option for non-developers — run a query and click the export button to download results as CSV without writing any code.
How far back does Polymarket data go?
Polymarket launched in mid-2020. On-chain data is available from the contract deployment dates — roughly July 2020 for the earliest markets. However, market volume before late 2021 was very low and the data is sparse. For research purposes, 2022 onwards offers the most reliable and statistically meaningful dataset. The 2024–2025 US election cycle in particular produced an exceptionally rich dataset with high volumes and diverse market types.
What is the best tool for beginners to explore historical data?
Dune Analytics is the most accessible starting point. It requires only basic SQL knowledge (or the ability to fork and modify existing queries), requires no local infrastructure, and has a library of community dashboards to learn from. The Polymarket REST API is the next step for those who want to build custom data pipelines in Python. The Graph and direct blockchain queries are for advanced users who need the most granular or complete historical coverage.
Can historical data give me an edge in current markets?
Yes — but with important caveats. Historical patterns are most reliable when they reflect structural features of the market (calibration biases, liquidity dynamics, resolution mechanics) rather than idiosyncratic events. A strategy that worked on 2022 data may not work identically today if market participants have become more sophisticated. Use historical data to identify candidate edges, then validate them on out-of-sample data before committing capital. For systematic edge exploitation at scale, see the guide to Polymarket API and on-chain data tools.