Edward Yu personal website

Hello, I'm Edward Yu. I'm a builder at heart, and the founder of Variational Research, a technology firm building foundational infrastructure for digital assets. I'm interested in Bayesian statistics, machine learning/AI, quantitative trading, and decentralized applications.

The cities in which I spend the most time these days are (in no particular order): New York, NY; Berkeley, CA; and Grand Cayman, Cayman Islands.

Contact me at [email protected]

Professional stuff


Bayesian Neural Networks with Soft Evidence

Presented at the 2021 International Conference on Machine Learning workshop on Uncertainty & Robustness in Deep Learning

Bayes’s rule deals with hard evidence, that is, we can calculate the probability of event A occurring given that event B has occurred. Soft evidence, on the other hand, involves a degree of uncertainty about whether event B has actually occurred or not. Jeffrey’s rule of conditioning provides a way to update beliefs in the case of soft evidence. We provide a framework to learn a probability distribution on the weights of a neural network trained using soft evidence by way of two simple algorithms for approximating Jeffrey conditionalization. We propose an experimental protocol for benchmarking these algorithms on empirical datasets and find that Jeffrey-based methods are competitive or better in terms of accuracy yet show improvements in calibration metrics upwards of 20% in some cases, even when the data contains mislabeled points.
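Jeffrey's rule itself is compact enough to show in a few lines. This is a toy numeric sketch (all probabilities are made up for illustration, not taken from the paper): the updated belief is P_new(A) = P(A|B)·P_new(B) + P(A|¬B)·P_new(¬B), which reduces to ordinary Bayesian conditioning when the evidence is hard, i.e. P_new(B) = 1.

```python
# Jeffrey's rule of conditioning: update P(A) when evidence about B is
# soft, i.e. we only learn a new probability P_new(B), not B itself.
# P_new(A) = P(A|B) * P_new(B) + P(A|not B) * P_new(not B)
# All numbers below are illustrative, not from the paper.

def jeffrey_update(p_a_given_b, p_a_given_not_b, p_new_b):
    """Posterior P_new(A) under soft evidence P_new(B) on the partition {B, not B}."""
    return p_a_given_b * p_new_b + p_a_given_not_b * (1.0 - p_new_b)

# Hard evidence (P_new(B) = 1) recovers ordinary conditioning on B:
print(jeffrey_update(0.8, 0.3, 1.0))            # -> 0.8
# Soft evidence: we are only 70% sure that B occurred:
print(round(jeffrey_update(0.8, 0.3, 0.7), 4))  # -> 0.65
```

The paper's contribution is learning neural-network weight distributions under this kind of update, but the scalar case above is the rule the algorithms approximate.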

Tree-based Pipeline Optimization Tool

In 2021, I contributed some features to this open source library for AutoML.

TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data.

A Bayesian Ensemble for Unsupervised Anomaly Detection

2016 arXiv preprint; the research was done (and deployed to production!) while I was at Facebook

Methods for unsupervised anomaly detection suffer from the fact that the data is unlabeled, making it difficult to assess the optimality of detection algorithms. Ensemble learning has shown exceptional results in classification and clustering problems, but has not seen as much research in the context of outlier detection. Existing methods focus on combining output scores of individual detectors, but this leads to outputs that are not easily interpretable. In this paper, we introduce a theoretical foundation for combining individual detectors with Bayesian classifier combination. Not only are posterior distributions easily interpreted as the probability distribution of anomalies, but bias, variance, and individual error rates of detectors are all easily obtained. Performance on real-world datasets shows high accuracy across varied types of time series data.
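The core idea of combining detectors with a Bayesian model can be sketched in a few lines. This toy version assumes known, conditionally independent detectors with given true/false positive rates; the paper's Bayesian classifier combination model is richer, inferring those error rates (and hence detector bias and variance) from the unlabeled data rather than taking them as given.

```python
# Toy Bayesian combination of binary anomaly detectors.
# Assumes known, conditionally independent detectors; the paper's model
# infers per-detector error rates instead of assuming them.

def combine(prior, tprs, fprs, votes):
    """Posterior P(anomaly | detector votes) via Bayes' rule."""
    like_anom, like_norm = prior, 1.0 - prior
    for tpr, fpr, v in zip(tprs, fprs, votes):
        like_anom *= tpr if v else (1.0 - tpr)
        like_norm *= fpr if v else (1.0 - fpr)
    return like_anom / (like_anom + like_norm)

# Two detectors both flag a point, against a 1% base rate of anomalies:
print(round(combine(0.01, [0.9, 0.9], [0.1, 0.1], [1, 1]), 4))  # -> 0.45
```

The output is a posterior probability rather than an uncalibrated ensemble score, which is what makes the combined result directly interpretable.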


My investing philosophy: many diversified alphas and a healthy dose of leverage.

I’m an angel investor in various startups. I prefer seed-stage or Series A.