About & Methodology

What this dashboard measures and how the numbers are computed.

What this is

Aporetic is a calibration tool for prediction markets. It compares the temperature implied by prediction market prices (Kalshi) against the official National Weather Service forecast for the same day, then — once the day resolves — scores which one was closer to the observed temperature.

Over time it answers a simple question: how well-calibrated are prediction markets on weather, and in what conditions do they diverge from expert forecasts?

Data sources

  • Kalshi — daily NYC high-temperature bucket markets (series KXHIGHNY), fetched hourly via the public Kalshi Trade API.
  • NWS / NOAA — hourly forecast for Central Park from the National Weather Service API, fetched daily.
  • Ground truth — the NWS Daily Climate Report (CLI) for Central Park station KNYC, the exact source Kalshi uses to resolve its contracts.

All data is snapshotted into a database on a schedule; the dashboard reads from those snapshots (with a live Kalshi fetch on page load when available).

Implied temperature

Kalshi markets split the day's high temperature into buckets (e.g. 82–83°F). Each bucket's price is an implied probability. To turn the full set of buckets into a single point estimate:

  1. Pull prices for every bucket in the event.
  2. Normalize probabilities to sum to 1 — raw prices typically sum to slightly more than 100% due to market overround.
  3. Assign a midpoint temperature to each bucket: the center of the range for bounded buckets, and a stand-in value roughly 3°F beyond the boundary for the open-ended tail buckets, which have no true midpoint.
  4. Compute the probability-weighted average: E[Temp] = Σ (probability × midpoint).
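
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual code: the `Bucket` class, the function names, and the exact 3.0°F tail offset are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed value for "roughly 3°F beyond the boundary".
TAIL_OFFSET_F = 3.0


@dataclass
class Bucket:
    """Hypothetical bucket record; None marks the open side of a tail bucket."""
    low: Optional[float]
    high: Optional[float]
    price: float  # market price expressed as a probability in [0, 1]


def bucket_midpoint(b: Bucket) -> float:
    """Range center for bounded buckets, offset stand-in for open-ended tails."""
    if b.low is not None and b.high is not None:
        return (b.low + b.high) / 2
    if b.high is not None:  # open-ended low tail, e.g. "79 or below"
        return b.high - TAIL_OFFSET_F
    if b.low is not None:   # open-ended high tail, e.g. "86 or above"
        return b.low + TAIL_OFFSET_F
    raise ValueError("bucket needs at least one bound")


def implied_temperature(buckets: List[Bucket]) -> float:
    """Probability-weighted average of midpoints after normalizing prices."""
    total = sum(b.price for b in buckets)
    if total <= 0:
        raise ValueError("bucket prices sum to zero")
    return sum((b.price / total) * bucket_midpoint(b) for b in buckets)
```

With five made-up buckets (79 or below, 80–81, 82–83, 84–85, 86 or above) priced at 0.05, 0.20, 0.45, 0.25, and 0.10, the raw prices sum to 1.05 and the normalized estimate comes out at about 82.9°F.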

When a single tail bucket holds a large share of the probability, the point estimate is dominated by the tail-midpoint assumption and becomes unreliable — the dashboard flags those cases.
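
One way such a flag could work is sketched below. The 0.5 cutoff is purely hypothetical, since the threshold the dashboard uses is not stated here, and the `(is_open_ended, price)` tuple layout is likewise an assumption for the example.

```python
# Hypothetical cutoff: the real threshold is not documented on this page.
TAIL_SHARE_THRESHOLD = 0.5


def largest_tail_share(buckets):
    """Largest normalized probability held by any single open-ended bucket.

    buckets: list of (is_open_ended, price) tuples.
    """
    total = sum(price for _, price in buckets)
    if total <= 0:
        raise ValueError("bucket prices sum to zero")
    tail_prices = [price for is_open_ended, price in buckets if is_open_ended]
    return max(tail_prices, default=0.0) / total


def is_tail_dominated(buckets, threshold=TAIL_SHARE_THRESHOLD):
    """True when one tail bucket holds at least `threshold` of the probability."""
    return largest_tail_share(buckets) >= threshold
```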

The NWS comparison

Kalshi's market resolves on the highest temperature observed in the full calendar day (midnight to midnight ET) — which can occur overnight, not just in the afternoon. To match that, the dashboard uses the NWS hourly forecast and takes the maximum across all 24 hours of the resolution date, not the daytime-period forecast.
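
As a rough sketch of that selection, assuming the hourly forecast has already been parsed into `(timestamp, temperature)` pairs with timezone-aware timestamps: the function name and input shape are illustrative, and the caller supplies the market's time zone.

```python
from datetime import date, datetime, timedelta, timezone


def forecast_daily_max(hourly, resolution_date, tz):
    """Maximum forecast temperature over every hour of the resolution date.

    hourly: iterable of (timezone-aware datetime, temperature in °F).
    resolution_date: the calendar date the market resolves on.
    tz: tzinfo that defines the calendar day.
    """
    temps = [
        temp_f
        for ts, temp_f in hourly
        if ts.astimezone(tz).date() == resolution_date
    ]
    if not temps:
        raise ValueError("no forecast hours fall on the resolution date")
    return max(temps)
```

For New York, `tz` would typically be `zoneinfo.ZoneInfo("America/New_York")`. Converting each timestamp before comparing dates keeps late-evening and overnight hours attached to the correct calendar day.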

The gap shown on the dashboard is the market implied temperature minus the NWS 24-hour max forecast. Positive means the market is pricing a warmer day than NWS predicts.

Accuracy scoring

Markets converge to the correct answer as resolution approaches, so scoring a market at close says little about its forecasting skill. Instead, each day is scored at a fixed horizon: the market snapshot closest to 24 hours before resolution, when the price is still a genuine prediction made with real uncertainty remaining. (Snapshots taken 22–25 hours before resolution qualify; the scoreboard labels this the "24-hour horizon" for simplicity.)
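
The horizon selection might look like the following sketch. It assumes snapshots are identified by their capture time and that "resolution" is a single known timestamp; both the function name and those inputs are illustrative.

```python
from datetime import datetime, timedelta

TARGET_LEAD = timedelta(hours=24)
MIN_LEAD = timedelta(hours=22)  # window bounds taken from the text above
MAX_LEAD = timedelta(hours=25)


def pick_horizon_snapshot(snapshot_times, resolution_time):
    """Return the snapshot time closest to 24 h before resolution.

    Only snapshots taken 22 to 25 hours before resolution are eligible;
    returns None when no snapshot falls inside that window.
    """
    eligible = [
        ts for ts in snapshot_times
        if MIN_LEAD <= resolution_time - ts <= MAX_LEAD
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda ts: abs((resolution_time - ts) - TARGET_LEAD))
```

Returning `None` rather than falling back to the nearest out-of-window snapshot means a day with missing data goes unscored instead of being scored at a different horizon.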

After the NWS Climate Report publishes the observed high, both the market's implied temperature and the NWS forecast are scored by absolute error. Errors within 0.5°F of each other count as a tie.
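
A minimal sketch of that scoring rule follows. The function name and return labels are illustrative, and treating a difference of exactly 0.5°F as a tie is an assumption, since the text does not say whether the margin is inclusive.

```python
TIE_MARGIN_F = 0.5  # errors this close to each other count as a tie


def score_day(market_implied, nws_forecast, observed_high):
    """Return 'market', 'nws', or 'tie' based on absolute error in °F."""
    market_err = abs(market_implied - observed_high)
    nws_err = abs(nws_forecast - observed_high)
    # Assumption: the 0.5°F margin is inclusive.
    if abs(market_err - nws_err) <= TIE_MARGIN_F:
        return "tie"
    return "market" if market_err < nws_err else "nws"
```

For instance, with an observed high of 84°F, a market estimate of 83.2°F and an NWS forecast of 85°F have errors of 0.8°F and 1.0°F, which differ by less than the margin and so score as a tie.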

Resolution sources & caveats

  • Kalshi's NYC market resolves against Central Park (KNYC) observations via the NWS Daily Climate Report.
  • Markets for other cities resolve at the station named in each market's settlement rules, which is often an airport rather than the city center: Chicago at Midway (KMDW), and Los Angeles and San Francisco at the LAX and SFO airports respectively. Airport readings can differ noticeably from downtown temperatures (LAX, for example, runs cooler than downtown LA due to the marine layer).
  • Polymarket's NYC markets (planned addition) resolve against Weather Underground at LaGuardia (KLGA) — a different station that can read a few degrees differently. Accuracy scoring always tracks which ground truth each market uses.
  • NWS point forecasts are not certainties — 1-day forecast errors are roughly normal with σ ≈ 3°F. The distribution chart models NWS as a normal curve on that basis.
  • Late in the day, market prices increasingly reflect the temperature already observed rather than a forecast — the dashboard notes this after 5 PM ET.
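
To illustrate the normal-curve model of the NWS forecast mentioned in the caveats, the sketch below computes the probability that a normal distribution centered on the forecast assigns to a temperature range. The function names are illustrative, and the choice of range edges (for example, whether an 82–83°F bucket spans 81.5 to 83.5) is left to the caller as an assumption.

```python
import math

NWS_SIGMA_F = 3.0  # 1-day forecast error standard deviation from the text


def normal_cdf(x, mean, sigma=NWS_SIGMA_F):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def range_probability(low, high, forecast, sigma=NWS_SIGMA_F):
    """Probability mass between low and high; None means open-ended."""
    upper = 1.0 if high is None else normal_cdf(high, forecast, sigma)
    lower = 0.0 if low is None else normal_cdf(low, forecast, sigma)
    return upper - lower
```

With a forecast of 84°F, the range from 81°F to 87°F covers one standard deviation on each side and receives about 68% of the probability.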