— A LETTER

I build things
that have to actually run.

Toronto / Thunder Bay · 2026

I did an Honours BSc in Mathematics & Physics at the University of Toronto because I wanted to know what counts as proof. The rest has been me figuring out where, outside the textbook, that question still gets a real answer. Mostly it lives in the gap between technical work and the messy world — so that's where I work.

Here's what I'm like.

A score with no source is decoration. The thing that did the most for trust on FengShui Blueprint wasn't the engineering-blueprint visual treatment — it was that every rule in the engine cites a specific classical text. The 24-mountain orientation comes from 《青囊奥语》 and 《天玉经》. The six-sha taxonomy is 《阳宅十书》, 《阳宅撮要》, 《鲁班经》. The four-emblem analysis is 《葬书》 and 《雪心赋》. The narrator can only describe what the engine computed, never the reverse. The rules package itself is a dependency-free TypeScript module: pure functions, 55 Vitest tests, deterministic. I'd rather ship a smaller engine I can footnote than a larger one I can't.
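To make the "footnote or it doesn't ship" rule concrete, here is a minimal sketch. It is written in Python for illustration; the actual rules package is TypeScript, and none of the names below (`Rule`, `Finding`, `evaluate`, `narrate`, `facing_sector`, the `facing_degrees` key) come from it. The sector arithmetic is a stand-in assumption, not the engine's real orientation logic. What it shows is the shape: a rule cannot be constructed without a source, and the narrator only formats findings the engine already produced.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    rule_id: str
    sources: Tuple[str, ...]                 # classical texts this rule cites
    check: Callable[[dict], Optional[str]]   # pure function: facts -> message or None

    def __post_init__(self) -> None:
        # A rule without a citation is rejected at construction time.
        if not self.sources:
            raise ValueError(f"rule {self.rule_id!r} cites no source")


@dataclass(frozen=True)
class Finding:
    rule_id: str
    message: str
    sources: Tuple[str, ...]


def evaluate(rules, facts):
    """Run every rule against the facts; deterministic, no I/O."""
    findings = []
    for rule in rules:
        message = rule.check(facts)
        if message is not None:
            findings.append(Finding(rule.rule_id, message, rule.sources))
    return findings


def narrate(findings):
    """Format findings only; the narrator has no way to add a claim of its own."""
    return [f"{f.message} [{'; '.join(f.sources)}]" for f in findings]


def facing_sector(facts):
    # Hypothetical check: split the compass into 24 sectors of 15 degrees,
    # with sector 0 centred on 0 degrees.
    deg = facts.get("facing_degrees")
    if deg is None:
        return None
    sector = int(((deg + 7.5) % 360) // 15)
    return f"facing {deg}° falls in sector {sector} of 24"


ORIENTATION = Rule("orientation-24", ("《青囊奥语》", "《天玉经》"), facing_sector)
```

Calling `narrate(evaluate([ORIENTATION], {"facing_degrees": 180}))` yields one line that ends with the two cited titles, and a rule built with an empty `sources` tuple raises before it can ever run.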

I almost shipped a three-model ensemble of time-series foundation models (TSFMs: Chronos-2, TimesFM 2.5, FlowState) for the Champions League winner market. Then I ran an 83-tie, 5-season backtest against a tuned Club Elo + Poisson scoreline + 50K Monte Carlo baseline, with injury-weighted ratings and quarterfinal xG folded back into the second-leg prior. The TSFMs added no point-prediction skill over that baseline. So Elo went to production and the TSFMs moved behind a --with-tsfm ablation flag. The live edges I take to Polymarket are sized at half-Kelly with explicit confidence intervals, and only when the model's posterior disagrees with the market by more than the noise of the posterior itself. The version of me that needs the model to work is not the version I let press deploy.
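The sizing rule can be sketched in a few lines. This is an illustration under stated assumptions, not the production code: the function name and parameters are hypothetical, "noise of the posterior" is taken to mean one posterior standard deviation, and only the YES side of a binary contract paying 1 is handled. For a contract priced at p with model probability q, the full Kelly fraction is (q - p) / (1 - p); the stake is half of that.

```python
def half_kelly_fraction(model_prob: float, model_std: float, market_price: float) -> float:
    """Fraction of bankroll to stake on YES, or 0.0 when the edge is inside the noise."""
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if not 0.0 <= model_prob <= 1.0:
        raise ValueError("model_prob must be a probability")

    edge = model_prob - market_price

    # Gate (assumption: noise = one posterior standard deviation).
    # No bet unless the disagreement with the market exceeds it.
    if edge <= model_std:
        return 0.0

    # Full Kelly for a binary contract bought at market_price and paying 1.
    full_kelly = edge / (1.0 - market_price)
    return 0.5 * full_kelly
```

With a posterior of 0.30 ± 0.04 against a market price of 0.20, the edge clears the gate and the stake is 6.25% of bankroll. With a posterior of 0.22 ± 0.04 against the same price, the function returns zero and there is no trade.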

TaskMarket's settlement layer is one Solidity file. Four states (None / Funded / Released / Refunded). No upgrade path, no proxy, no third destination address. The arbiter can resolve a dispute only by routing funds to one of the two existing parties — that's the operator's only on-chain privilege. Fee is 5%, hardcoded cap is 20%, single-shot per taskId. The Toronto rules engine is a dependency-free TypeScript package with exhaustive tests. Both are deliberately smaller than they could be. I want the load-bearing piece of any system small enough that a stranger can hold it in their head in one read, and small enough that the test suite covers the state space.
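The state space really is small enough to model in one screen. The sketch below is a Python model of the four states, not the Solidity contract. The names (`Escrow`, `Task`, `payer`, `payee`, `settle`) are hypothetical, amounts are assumed to be integer base units, and charging the fee only on release is an assumption the letter does not confirm. The point it illustrates is that settlement takes a boolean rather than an address, so there is no way to express a third destination.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NONE = 0
    FUNDED = 1
    RELEASED = 2
    REFUNDED = 3


FEE_CAP_BPS = 2000  # 20% hard cap, in basis points


@dataclass
class Task:
    payer: str
    payee: str
    amount: int
    state: State = State.FUNDED


class Escrow:
    def __init__(self, fee_bps: int = 500):  # 5% default
        if not 0 <= fee_bps <= FEE_CAP_BPS:
            raise ValueError("fee exceeds hard cap")
        self.fee_bps = fee_bps
        self.tasks = {}

    def state_of(self, task_id: str) -> State:
        task = self.tasks.get(task_id)
        return task.state if task is not None else State.NONE

    def fund(self, task_id: str, payer: str, payee: str, amount: int) -> None:
        # Single-shot: a task id can leave State.NONE exactly once.
        if self.state_of(task_id) is not State.NONE:
            raise ValueError("task id already used")
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.tasks[task_id] = Task(payer, payee, amount)

    def settle(self, task_id: str, release: bool):
        """Return (recipient, payout, fee). The recipient is always payer or payee."""
        if self.state_of(task_id) is not State.FUNDED:
            raise ValueError("task is not in the Funded state")
        task = self.tasks[task_id]
        if release:
            fee = task.amount * self.fee_bps // 10_000
            task.state = State.RELEASED
            return task.payee, task.amount - fee, fee
        task.state = State.REFUNDED
        return task.payer, task.amount, 0
```

Every transition starts from None or Funded and ends in a terminal state, which is why a test suite can cover the whole machine instead of sampling it.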

Tensor Proxies has been serving paying customers since 2022 on real infrastructure. lakebbs runs a live Chinese-language community forum — Next.js 16 + Drizzle ORM + MySQL 5.7 behind nginx on a Toronto VPS, deployed by rsync, uptime-monitored. hypebot-rs is a Rust rewrite of a perps + spot trading bot, managing actual positions on Hyperliquid. fengshui-web deploys to Cloudflare Workers via @opennextjs/cloudflare; the API worker uses D1 for the assessment archive, KV for geocode caching with a daily Google Maps quota guard, R2 for share-card storage. None of these have a "demo mode." If they break, someone notices.

I keep ending up at the same shape: a continuous score, a position, a measured edge.

hypebot-rs, the Hyperliquid bot mentioned above, runs signal pipeline, execution, and risk in a single runtime. fin-forecast-arena is a benchmark harness for time-series foundation models on equities, which is the hostile case for TSFMs: short horizons, regime breaks, no obvious seasonality. The writeup is honest about where the foundation-model story breaks down. The UEFA and World Cup Oracles are Elo + Poisson + Monte Carlo systems shipping live edges against Polymarket's billion-dollar tournament markets, with half-Kelly sizing and injury / xG-blended priors. TaskMarket's USDC escrow is itself a market primitive: the smallest piece of settlement code that turns a chat into a marketplace. market-option-monitor watches derivatives flow.

The thread isn't "I love trading." It's that financial systems are the cleanest place to test forecasting and incentive design end-to-end, because the loss function is honest. You can lie to a benchmark; the market doesn't care what you wanted to be true.

Measure theory teaches you where you're allowed to be imprecise. Statistical mechanics teaches you how much noise a problem can absorb. Both make you skeptical of dashboards that report three decimals on data that doesn't deserve them. On any forecasting project, most of my time goes into figuring out what the honest precision is — calibration plots, isotonic regression on outputs, holdout splits that respect time order, walk-forward validation. The model is the last thing I touch. I'd rather report a wide interval honestly than a point estimate I can't back up.
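"Holdout splits that respect time order" is easy to say and easy to get wrong, so here is a minimal walk-forward sketch. It is an expanding-window variant, one of several reasonable choices; the function name and parameters are hypothetical, and observations are assumed to be sorted by time already.

```python
def walk_forward_splits(n_samples: int, n_folds: int, min_train: int):
    """Yield (train_indices, test_indices) pairs over time-ordered data.

    Each test block starts exactly where its training window ends,
    so no fold ever trains on observations from its own future.
    """
    if n_folds < 1 or min_train < 1:
        raise ValueError("n_folds and min_train must be positive")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")

    for k in range(n_folds):
        train_end = min_train + k * test_size   # training window grows each fold
        test_end = train_end + test_size
        yield range(0, train_end), range(train_end, test_end)
```

For 10 observations, 3 folds, and a minimum training window of 4, the folds test on indices 4 to 5, 6 to 7, and 8 to 9, training on everything before each block. Calibration plots and isotonic regression then get fitted per fold on the training side only.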

Find the smallest legible unit of trust, then build the system around it. That's the only thesis I'd put my name on right now.

  • Statistical Rethinking — Richard McElreath
  • Designing Data-Intensive Applications — Martin Kleppmann
  • Advances in Financial Machine Learning — Marcos López de Prado
  • The classical canon: 葬书 · 阳宅十书 · 河图
  • Working in Public — Nadia Eghbal

— S.