Direct Answer:
The most reliable way to get historical bid/ask data for crypto backtesting is through institutional-grade market data providers such as CoinAPI, which offers normalized, tick-level Quotes datasets via its Flat Files API. These datasets include every change in the best bid and best ask across hundreds of exchanges, with microsecond-precision timestamps and exchange-synchronized fields, making them well suited to quantitative research, AI model training, and execution backtesting.
What is Best Bid and Ask Data, and Why Does It Matter for Backtesting?
Every trading decision happens inside the bid–ask spread. The best bid is the highest price a buyer is willing to pay. The best ask is the lowest price a seller will accept. Together, they form the top of the order book, the first level that reveals live market sentiment and liquidity depth.
For a crypto prop fund or quantitative trader, this spread determines execution cost and slippage risk. The smaller the spread, the more liquid and efficient the market. During volatile periods, widening spreads signal uncertainty or low liquidity, key moments for backtesting execution models or market-making strategies.
Example snapshot for BTC/USDT on Binance:
| Time (UTC) | Best Bid | Bid Size | Best Ask | Ask Size | Spread |
| 2025-02-14T12:00:00Z | 99,999.50 | 0.75 | 100,000.00 | 0.68 | 0.50 |
| 2025-02-14T12:00:01Z | 99,999.80 | 1.20 | 100,000.30 | 1.10 | 0.50 |
Accurate bid/ask history lets you rebuild microprice dynamics for any trading pair, critical for testing latency-sensitive or liquidity-aware strategies.
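To make the microprice idea concrete, here is a minimal sketch that computes the midpoint and a size-weighted microprice from the first row of the snapshot above. The function names are illustrative, and the weighting shown (each price weighted by the opposite side's size) is one common definition, not necessarily the one your models should use.

```python
def mid_price(bid_px: float, ask_px: float) -> float:
    """Simple midpoint between best bid and best ask."""
    return (bid_px + ask_px) / 2.0


def microprice(bid_px: float, bid_sz: float, ask_px: float, ask_sz: float) -> float:
    """Size-weighted mid: each price is weighted by the opposite side's size,
    so a heavier bid pulls the estimate toward the ask."""
    total = bid_sz + ask_sz
    if total <= 0:
        raise ValueError("bid and ask sizes must sum to a positive value")
    return (bid_px * ask_sz + ask_px * bid_sz) / total


# First row of the BTC/USDT snapshot above.
bid_px, bid_sz, ask_px, ask_sz = 99_999.50, 0.75, 100_000.00, 0.68
mid = mid_price(bid_px, ask_px)
micro = microprice(bid_px, bid_sz, ask_px, ask_sz)
print(f"mid={mid:.2f} microprice={micro:.4f}")
```

Because the bid size is larger than the ask size in that row, the microprice sits slightly above the midpoint, which is the kind of imbalance signal that candles cannot show.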
How Can You Access Historical Bid/Ask Data for Crypto?
Developers typically face three imperfect options when sourcing bid/ask data for backtesting:
- Direct exchange APIs - often inconsistent, with varying field names and timestamp formats.
- Retail data aggregators - limited exchange coverage or incomplete tick history.
- Institutional data platforms - reliable, but often closed-off or cost-prohibitive.
The biggest pain point is data fragmentation. Each exchange represents prices differently, timestamps trades differently, and handles missing ticks differently. Backtests built on this patchwork often produce misleading results - spreads appear tighter, latency looks lower, or liquidity seems deeper than it really was.
To run meaningful simulations, a startup prop fund needs synchronized, reproducible bid/ask history - clean enough to feed directly into execution or reinforcement-learning models.
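The sketch below illustrates what this fragmentation means in practice if you normalize raw exchange payloads yourself. The two venues, their field names, and the `time` key are hypothetical, invented for illustration; only the target names bid_px, ask_px, bid_sx, and ask_sx follow the dataset described in this article.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical per-venue field names; real exchange payloads will differ.
FIELD_MAPS = {
    "VENUE_A": {"time": "T", "bid_px": "b", "bid_sx": "B", "ask_px": "a", "ask_sx": "A"},
    "VENUE_B": {"time": "timestamp", "bid_px": "bid", "bid_sx": "bidSize",
                "ask_px": "ask", "ask_sx": "askSize"},
}


def to_utc_iso(ts) -> str:
    """Convert epoch milliseconds or an ISO 8601 string to UTC ISO 8601."""
    if isinstance(ts, (int, float)):
        dt = EPOCH + timedelta(milliseconds=ts)
    else:
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        if dt.tzinfo is None:
            # Assumption: naive timestamps are already in UTC.
            dt = dt.replace(tzinfo=timezone.utc)
        dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def normalize_quote(venue: str, raw: dict) -> dict:
    """Map a venue-specific payload onto one shared quote schema."""
    fmap = FIELD_MAPS[venue]
    quote = {"venue": venue, "time": to_utc_iso(raw[fmap["time"]])}
    for field in ("bid_px", "bid_sx", "ask_px", "ask_sx"):
        quote[field] = float(raw[fmap[field]])
    return quote


quote_a = normalize_quote(
    "VENUE_A",
    {"T": 1739534400000, "b": "99999.50", "B": "0.75", "a": "100000.00", "A": "0.68"},
)
quote_b = normalize_quote(
    "VENUE_B",
    {"timestamp": "2025-02-14T13:00:01+01:00", "bid": 99999.8, "bidSize": 1.2,
     "ask": 100000.3, "askSize": 1.1},
)
print(quote_a)
print(quote_b)
```

Every additional venue needs its own mapping and its own timestamp quirks handled, which is the maintenance burden a pre-normalized dataset is meant to remove.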
How CoinAPI Structures Its Historical Bid/Ask Data
CoinAPI provides this data as part of its Flat Files service, delivered through an S3-compatible API designed for large-scale backtesting.
Each dataset includes best bid and best ask prices (bid_px, ask_px) and the associated resting volumes (bid_sx, ask_sx) - recorded for every market update across hundreds of exchanges.
File naming follows a clear logic:
T=QUOTES/D=YYYYMMDD/E=EXCHANGE/IDDI=IDENTIFIER+SC=COINAPI_SYMBOL_ID+S=EXCHANGE_SYMBOL.csv.gz
All timestamps are synchronized in UTC and formatted according to the ISO 8601 standard. Each file is a compressed .csv.gz, structured for efficient loading into pandas, NumPy, or database pipelines.
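As a rough illustration, the naming pattern above can be split into its components with a few lines of Python. The example key is made up (in particular the IDDI value), and the parser assumes that field values contain no "/" or "+" characters.

```python
from datetime import datetime


def parse_quotes_key(key: str) -> dict:
    """Split a Quotes file key into its K=V components."""
    suffix = ".csv.gz"
    if not key.endswith(suffix):
        raise ValueError(f"expected a {suffix} file, got {key!r}")
    # Directory parts are separated by "/", leaf parts by "+".
    *dirs, leaf = key[: -len(suffix)].split("/")
    fields = {}
    for part in dirs + leaf.split("+"):
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {part!r}")
        fields[name] = value
    if "D" in fields:
        # D=YYYYMMDD becomes a datetime.date for easier range filtering.
        fields["D"] = datetime.strptime(fields["D"], "%Y%m%d").date()
    return fields


# Hypothetical key following the pattern; the identifier is invented.
example_key = (
    "T=QUOTES/D=20250214/E=BINANCE/"
    "IDDI=12345+SC=BINANCE_SPOT_BTC_USDT+S=BTCUSDT.csv.gz"
)
parsed = parse_quotes_key(example_key)
print(parsed)
```

Parsing keys this way makes it straightforward to filter a bucket listing by date range, exchange, or symbol before downloading anything.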
This Quotes dataset is refreshed daily and uploaded after UTC midnight for the prior day.
CoinAPI’s collection infrastructure monitors exchange order books in real time, logs each top-of-book change, and writes these updates into synchronized files with microsecond precision.
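A downloaded file can be read with the Python standard library alone. The sketch below is hedged: the bid_px, ask_px, bid_sx, and ask_sx columns come from the description above, but the timestamp column name (`time_exchange`) and the `;` delimiter are assumptions, so check them against a real file header. The sample data is built in memory from the snapshot shown earlier.

```python
import csv
import gzip
import io
from datetime import datetime


def read_quotes(source, delimiter=";", time_col="time_exchange"):
    """Yield one dict per quote from a gzip-compressed CSV.

    `source` may be a file path or a binary file object.
    `delimiter` and `time_col` are assumptions, not confirmed schema.
    """
    with gzip.open(source, mode="rt", newline="") as text:
        for row in csv.DictReader(text, delimiter=delimiter):
            yield {
                "time": datetime.fromisoformat(row[time_col].replace("Z", "+00:00")),
                "bid_px": float(row["bid_px"]),
                "bid_sx": float(row["bid_sx"]),
                "ask_px": float(row["ask_px"]),
                "ask_sx": float(row["ask_sx"]),
            }


# In-memory stand-in for a downloaded .csv.gz file.
sample_csv = (
    "time_exchange;bid_px;bid_sx;ask_px;ask_sx\n"
    "2025-02-14T12:00:00.000000Z;99999.50;0.75;100000.00;0.68\n"
    "2025-02-14T12:00:01.000000Z;99999.80;1.20;100000.30;1.10\n"
)
blob = gzip.compress(sample_csv.encode("utf-8"))

quotes = list(read_quotes(io.BytesIO(blob)))
for q in quotes:
    print(q["time"].isoformat(), q["ask_px"] - q["bid_px"])
```

Streaming rows through a generator keeps memory use flat, which matters once a single day of quotes for a liquid pair runs to millions of rows.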
Understanding How Bid/Ask Spreads Are Delivered
CoinAPI delivers raw bid and ask data directly from each exchange through its Market Data API and Flat Files Quotes dataset.
This ensures full transparency: no synthetic values or smoothing are applied.
Key facts for traders and analysts:
- Live data comes exactly as reported by each exchange. Every feed includes the best bid and best ask levels; the spread is not pre-computed or altered. You can calculate it yourself as best ask minus best bid.
- Spreads are never modified by CoinAPI. The data is transmitted as-is from the exchange, with no compression or adjustment of market depth.
- Spreads are dynamic, not fixed. They depend entirely on real-time liquidity and trading activity at each venue. During volatile periods, spreads naturally widen as liquidity thins.
- Exchange Rates API for unified mids. If you need standardized midpoint or VWAP rates across venues, you can use the Exchange Rates API in combination with live quotes.
- Use Quotes to get bid/ask per venue.
- Use Exchange Rates to get a unified mid or VWAP for valuation or portfolio metrics.
CoinAPI’s feeds are therefore fully transparent: they provide the same market picture that professional trading engines use, not averaged or filtered data.
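Since the spread is left for you to compute, a small sketch may help. It derives the absolute spread and the relative spread in basis points from raw bid/ask pairs; the function name is illustrative, and the sample values are the two rows of the BTC/USDT snapshot shown earlier.

```python
def spread_bps(bid_px: float, ask_px: float) -> float:
    """Relative spread in basis points, measured against the midpoint.

    A negative result means the quote is crossed (bid above ask).
    """
    mid = (bid_px + ask_px) / 2.0
    return (ask_px - bid_px) / mid * 10_000


# (bid_px, ask_px) pairs from the earlier snapshot.
quotes = [(99_999.50, 100_000.00), (99_999.80, 100_000.30)]

spreads = [ask - bid for bid, ask in quotes]
bps = [spread_bps(bid, ask) for bid, ask in quotes]

for (bid, ask), s, b in zip(quotes, spreads, bps):
    print(f"bid={bid:.2f} ask={ask:.2f} spread={s:.2f} spread_bps={b:.4f}")
```

Expressing the spread in basis points makes it comparable across pairs with very different price levels, which absolute spreads are not.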
Common Pitfalls When Backtesting with Bid/Ask Data
Most backtesting failures arise not from modeling, but from data misalignment. Common traps include:
- Inconsistent timestamps across exchanges or feeds.
- Missing records during exchange downtime.
- Exchange-specific schemas that break pipeline consistency.
- Over-compressed storage leading to slow read speeds on large datasets.
CoinAPI minimizes these issues through schema normalization and time synchronization across all its data sources. Each record uses a shared timestamp format and unified field naming convention, so you can merge multiple exchanges without rewriting your ingestion logic.
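Even with normalized data, it is worth checking for silent gaps before trusting a backtest. The following is a minimal sketch, assuming time-sorted quote timestamps and an arbitrary 60-second threshold; the sample timestamps are invented. A gap may reflect a quiet market instead of an outage, so treat the output as a prompt for inspection.

```python
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2025-02-14T12:00:00.000000Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_gaps(timestamps, max_gap: timedelta):
    """Return (previous, current) pairs that are more than max_gap apart."""
    gaps = []
    prev = None
    for ts in timestamps:
        if prev is not None:
            if ts < prev:
                raise ValueError("timestamps must be sorted ascending")
            if ts - prev > max_gap:
                gaps.append((prev, ts))
        prev = ts
    return gaps


# Invented sample: a five-minute hole between the second and third quote.
times = [
    parse_ts("2025-02-14T12:00:00.000000Z"),
    parse_ts("2025-02-14T12:00:01.000000Z"),
    parse_ts("2025-02-14T12:05:01.000000Z"),
    parse_ts("2025-02-14T12:05:02.000000Z"),
]
gaps = find_gaps(times, max_gap=timedelta(seconds=60))
for start, end in gaps:
    print(f"gap of {end - start} starting at {start.isoformat()}")
```

A sensible threshold depends on the pair: a liquid BTC pair updates many times per second, while a thinly traded pair can legitimately go quiet for minutes.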
When to Use Flat Files vs. WebSocket Feeds
| Use Case | Best Option | Description |
| Historical backtesting | Flat Files (Quotes) | Download daily tick-level top-of-book data, reproducible across timeframes. |
| Real-time paper trading or execution testing | WebSocket DS | Direct-source connections for 5–15 ms latency performance. |
| Lightweight monitoring or dashboards | WebSocket V1 | Single connection with multiple symbol subscriptions for broader coverage. |
Using both in combination allows you to backtest strategies on Flat Files, then deploy them live via WebSocket DS for tick-by-tick execution feedback.
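For the backtesting half of that workflow, here is a deliberately simple sketch of replaying quotes and charging a market buy for crossing the spread. It assumes fills happen only against the best ask, ignores fees, latency, and deeper book levels, and uses the snapshot rows from earlier as data; the function name is illustrative.

```python
def simulate_market_buy(quote: dict, qty: float):
    """Fill a market buy against the best ask only.

    Returns (filled_qty, cost_vs_mid), where cost_vs_mid is what was paid
    above the midpoint, i.e. the half-spread times the filled size.
    """
    filled = min(qty, quote["ask_sx"])
    mid = (quote["bid_px"] + quote["ask_px"]) / 2.0
    return filled, filled * (quote["ask_px"] - mid)


quotes = [
    {"bid_px": 99_999.50, "bid_sx": 0.75, "ask_px": 100_000.00, "ask_sx": 0.68},
    {"bid_px": 99_999.80, "bid_sx": 1.20, "ask_px": 100_000.30, "ask_sx": 1.10},
]

total_filled = 0.0
total_cost = 0.0
for quote in quotes:
    # Toy strategy: try to buy 1.0 BTC on every quote update.
    filled, cost = simulate_market_buy(quote, qty=1.0)
    total_filled += filled
    total_cost += cost

print(f"filled={total_filled:.2f} cost_vs_mid={total_cost:.4f}")
```

The first order is only partially filled because the resting ask size is smaller than the order, which is exactly the kind of effect a candle-based backtest would hide.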
Q&A: What Traders Ask Before Choosing a Bid/Ask Data Provider
Data Quality and Structure
Q: Is the bid/ask data tick-level?
A: Yes. Each update to the top of book is captured individually, not in aggregates.
Q: Are timestamps synchronized?
A: Yes. Every quote is timestamped in UTC, in ISO 8601 format with microsecond precision.
Q: Is the best-bid-and-ask spread exchange-specific or global?
A: CoinAPI records per-exchange data, allowing traders to compute both local and cross-venue spreads.
Q: Are bid and ask sizes included?
A: Yes; bid_sx and ask_sx give the resting size at the best bid and best ask.
Q: How often is data refreshed?
A: Files are generated daily; real-time updates are available via WebSocket feeds.
Historical Coverage
Q: How far back does the history go?
A: Up to five years or more, depending on the exchange and asset pair.
Q: Are old and new datasets compatible?
A: Yes; the field structure is identical across all historical periods.
Q: Can I request specific pairs like BTC/USDT from 2019–2024?
A: Yes, using symbol-based S3 queries or the metadata API.
Exchange Coverage
Q: Which exchanges are supported?
A: Over 400 venues including Binance, OKX, Bybit, Coinbase, Kraken, Bitget and Gate.io.
Q: Is both spot and derivatives data included?
A: Yes, spot, perpetuals, futures, and options where available.
Q: Can I analyze cross-exchange spreads?
A: Yes; normalized symbol mapping allows direct comparison between venues.
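As a sketch of what a cross-exchange comparison might look like once quotes share a schema and clock, the example below pairs each quote on one venue with the most recent quote on another (an as-of join) and computes the gap between one venue's bid and the other's ask. The venues and all prices are invented for illustration.

```python
from bisect import bisect_right
from datetime import datetime


def asof_join(left, right):
    """Pair each left quote with the latest right quote at or before its time.

    Both lists must be sorted by "time"; left quotes with no earlier
    right quote are skipped.
    """
    right_times = [q["time"] for q in right]
    pairs = []
    for q in left:
        i = bisect_right(right_times, q["time"]) - 1
        if i >= 0:
            pairs.append((q, right[i]))
    return pairs


def ts(value: str) -> datetime:
    return datetime.fromisoformat(value)


# Invented quotes for two hypothetical venues.
venue_a = [
    {"time": ts("2025-02-14T12:00:00+00:00"), "bid_px": 99_999.50, "ask_px": 100_000.00},
    {"time": ts("2025-02-14T12:00:02+00:00"), "bid_px": 100_000.40, "ask_px": 100_000.90},
]
venue_b = [
    {"time": ts("2025-02-14T12:00:01+00:00"), "bid_px": 99_999.70, "ask_px": 100_000.20},
    {"time": ts("2025-02-14T12:00:01.500000+00:00"), "bid_px": 99_999.60, "ask_px": 100_000.10},
]

pairs = asof_join(venue_a, venue_b)
# Positive edge: venue A's bid is above venue B's ask at that moment.
edges = [a["bid_px"] - b["ask_px"] for a, b in pairs]
print(edges)
```

Joining on the most recent prior quote avoids look-ahead bias, since each comparison uses only information that existed at that timestamp.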
Data Delivery and Integration
Q: What delivery methods are offered?
A: REST API, WebSocket stream, and S3-based Flat Files.
Q: Which formats are supported?
A: CSV (default), JSON, MsgPack; Parquet available on request.
Q: Are SDKs provided?
A: Official SDKs exist for Python, C#, Go, Java, R, and more on GitHub.
Latency and Performance
Q: How low is the latency?
A: 5–15 milliseconds for WebSocket DS feeds under typical conditions.
Q: Is it direct-source?
A: Yes; connections stream from exchange endpoints with geo-routed infrastructure.
Q: Are there SLAs for uptime?
A: Enterprise contracts include more than 99.9% data-delivery uptime.
Reliability and Completeness
Q: How is data accuracy verified?
A: Automatic reconciliation and anomaly detection run continuously; suspect records are re-queried from source.
Q: Are missing intervals flagged?
A: Yes, gaps and corrections are logged in metadata headers.
Licensing and Access
Q: Is a demo dataset available?
A: Yes; developers can request sample Flat Files before purchase.
Q: How is pricing structured?
A: By API usage or download volume, with enterprise custom plans.
Q: Are research and commercial uses both allowed?
A: Licensing covers both academic and commercial use cases.
Use Case Suitability
Q: Can the data feed backtesting frameworks like Backtrader or Zipline?
A: Yes; standardized CSVs load directly.
Q: Is it usable for machine-learning models?
A: Absolutely; datasets are large, continuous, and consistent.
Q: Can I rebuild a full order book?
A: Level 2 and Level 3 data are available separately for depth reconstruction.
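Bar-based frameworks typically consume OHLC rows, so depending on your setup you may want to aggregate tick-level quotes first. The sketch below builds one-minute OHLC bars from the mid price; it assumes time-sorted quotes with timezone-aware timestamps, and the sample quotes are invented. It is not tied to any particular framework's loader.

```python
from datetime import datetime


def mid_ohlc_bars(quotes, bar_seconds: int = 60):
    """Aggregate time-sorted quotes into OHLC bars of the mid price."""
    bars = []
    for q in quotes:
        mid = (q["bid_px"] + q["ask_px"]) / 2.0
        # Floor the epoch time to the start of its bar.
        start = int(q["time"].timestamp() // bar_seconds) * bar_seconds
        if bars and bars[-1]["start"] == start:
            bar = bars[-1]
            bar["high"] = max(bar["high"], mid)
            bar["low"] = min(bar["low"], mid)
            bar["close"] = mid
        else:
            bars.append({"start": start, "open": mid, "high": mid, "low": mid, "close": mid})
    return bars


# Invented quotes spanning two one-minute bars.
quotes = [
    {"time": datetime.fromisoformat("2025-02-14T12:00:00+00:00"),
     "bid_px": 99_999.50, "ask_px": 100_000.00},
    {"time": datetime.fromisoformat("2025-02-14T12:00:30+00:00"),
     "bid_px": 99_999.80, "ask_px": 100_000.30},
    {"time": datetime.fromisoformat("2025-02-14T12:01:05+00:00"),
     "bid_px": 99_999.60, "ask_px": 100_000.20},
]
bars = mid_ohlc_bars(quotes)
for bar in bars:
    print(bar)
```

Minutes with no quote updates produce no bar here; whether to forward-fill them is a modeling choice left to the caller.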
Support and Documentation
Q: Is documentation detailed?
A: Comprehensive docs plus schema examples are available at docs.coinapi.io.
Q: Are tutorials provided?
A: Yes, the Tutorial Academy covers onboarding, data access, and sample scripts.
Q: How fast is support?
A: Enterprise clients receive responses within one business day.
Summary
Backtesting with bid/ask data gives quantitative traders a sharper lens on liquidity, spread behavior, and execution cost. Instead of working with coarse OHLCV candles, you can replay every market micro-event as it happened, and test how your strategy would have performed in real conditions.
The CoinAPI Flat Files Quotes dataset offers synchronized, exchange-normalized best bid and ask data with daily S3 delivery. It’s suited for any prop trading or research workflow that requires reproducible tick-level accuracy without the engineering burden of maintaining dozens of exchange connections.
TL;DR
If your team needs reliable bid/ask data for backtesting or machine-learning model training, explore the Flat Files documentation or request a sample dataset.
Clean, synchronized market microstructure data is the foundation of profitable trading models, and CoinAPI makes it accessible without the chaos of fragmented exchange APIs.












