Work | George Ho

Software

PyMC3 and PyMC4
- PyMC3 is a popular Python framework for Bayesian modeling and probabilistic machine learning, focusing on Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms.
- I’m a member of the core development team and contribute to the PyMC3 internals and documentation. I wrote a blog post on tips and tricks for Bayesian modeling using PyMC3.
Pyfolio and Alphalens
- Pyfolio is a Python library for risk analysis and performance attribution of financial portfolios, and Alphalens is a Python library for alpha factor research for algorithmic trading. Both libraries were fully integrated into the Quantopian platform.
- I developed the risk and performance attribution capabilities of Pyfolio (read more in my blog post), and help maintain the library. For Alphalens, I help develop new features, triage bug reports and troubleshoot issues.
Knead and Glaze
- Knead and Glaze are command line tools and Python libraries for preprocessing, manipulating, rendering and visualizing font files and algorithmically generated typefaces. Both libraries were written as internal tools for The Font Bakers, to support their research and development workflow. I developed and maintain both projects.

cryptics.georgeho.org
- cryptics.georgeho.org is a dataset of cryptic crossword clues, indicators and charades, collected from various blogs and publicly available digital archives. It is the first and only such dataset that is both openly accessible and larger than any in the extant literature, and it is a significant work of crossword archivism.

Deep Learning for Algorithmic Type Design
- For a research project in my last year at The Cooper Union, we investigated a class- and attribute-conditional generative adversarial network capable of producing vector graphics, with potential applications in algorithmic type design. The generative model was to produce closed shapes, with counter spaces, defined by a variable number of quadratic Bézier curves. At the time, no generative model with vector graphic output had appeared in the literature.
Hate Speech on Reddit
- As part of a project on Data Science for Social Good, I ran text clustering algorithms on well-known hateful and toxic subreddits, and collaborated with a cross-disciplinary team of artists, architects and engineers to present the findings at The Cooper Union 2018 End of Year Show. I also wrote a blog post on my results, and gave a talk on the data science that went into the project.