# Erik Bernhardsson

Head of Engineering at Better Mortgage

Bio: I like to work with smart people and deliver great software. After 5+ years at Spotify, I just left for an exciting new startup in NYC where I am leading the engineering team. We're hiring like crazy – if you're a serial polyglot and like to build something big from scratch – drop me an email at erik@better.com! At Spotify, I built up and led the team responsible for music recommendations and machine learning. We designed and built many large-scale machine learning algorithms that power the recommendation features: the radio feature, the "Discover" page, "Related Artists", and much more. I also authored Luigi, which is a workflow manager in Python with 3,000+ stars on GitHub – used by Foursquare, Quora, Stripe, Asana, etc.

### Conversion rates — you are (most likely) computing them wrong

How hard can it be to compute conversion rate? Take the total number of users that converted and divide it by the total number of users. Done. Except… it’s a lot more complicated when you have any sort of significant time lag. Prelude: a story. Fresh out of school I joined Spotify as the first data […]
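As a rough illustration of the excerpt above, here is a minimal sketch of the naive calculation next to one possible way of accounting for time lag. The data, the function names `naive_rate` and `matured_rate`, and the fixed observation window are all assumptions made for this example; they are not taken from the post.

```python
from typing import List, Optional, Tuple

# Hypothetical data: (signup_day, conversion_day or None if not converted yet).
User = Tuple[int, Optional[int]]
users: List[User] = [(0, 5), (0, None), (10, 18), (10, None), (28, None), (29, None)]


def naive_rate(users: List[User]) -> float:
    """Converted users divided by all users, ignoring how long each was observed."""
    converted = sum(1 for _, conv in users if conv is not None)
    return converted / len(users)


def matured_rate(users: List[User], today: int, window: int) -> float:
    """Rate among users observed for at least `window` days,
    counting only conversions that happened within that window."""
    matured = [(s, c) for s, c in users if today - s >= window]
    if not matured:
        raise ValueError("no user has been observed for the full window")
    converted = sum(1 for s, c in matured if c is not None and c - s <= window)
    return converted / len(matured)


naive = naive_rate(users)
matured = matured_rate(users, today=30, window=10)
```

In this made-up data the two most recent signups have had almost no time to convert, so they pull the naive rate down; restricting to users with a full observation window removes that effect at the cost of ignoring recent users.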

### The Mathematical Principles of Management

I’ve read about 100 management books by now but if there’s something that always bothered me it’s the lack of first principles thinking. Basically it’s a ton of heuristics. And heuristics are great, but when you present heuristics as true objectives, it kind of clouds the underlying objectives (and you end up with weird proxy […]

### The eigenvector of “Why we moved from language X to languag...

I was reading yet another blog post titled “Why our team moved from language X to language Y” (I forgot which one) and I started wondering if you can generalize it a bit. Is it possible to generate an N * N contingency table of moving from language X to language Y? Someone should make an N*N contingency […]

### Language pitch

Here’s a fun analysis that I did of the pitch (a.k.a. frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between languages. Hertz (or Hz, or s⁻¹), […]

### Vector Models in Machine learning Part 2

This is a blog post rewritten from a presentation at NYC Machine Learning on Sep 17. It covers a library called Annoy that I have built that helps you do nearest neighbor queries in high dimensional spaces. In the first part, I went through some examples of why vector models are useful. In the second […]

### Nearest Neighbor Methods and Vector Models – part 1

This is a blog post rewritten from a presentation at NYC Machine Learning. It covers a library called Annoy that I have built that helps you do (approximate) nearest neighbor queries in high dimensional spaces. I will be splitting it into several parts. This first part talks about vector models, how to measure similarity, and why […]

### The Half-life of Code & the Ship of Theseus

As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project. The idea is to go back in history […]

### Are Data Sets the New Server Rooms?

This blog post, “Data sets are the new server rooms”, makes the point that a bunch of companies raise a ton of money to go get really proprietary awesome data as a competitive moat. Because once you have the data, you can build a better product, and no one can copy it (at least not […]

### Pareto efficiency

Pareto efficiency is a useful concept I like to think about. It often comes up when you compare items on multiple dimensions. Say you want to buy a new TV. To simplify, let’s assume you only care about two factors: price and quality. We don’t know what you are willing to pay for quality […]
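The TV example can be made concrete with a small sketch. The TVs, prices and quality scores below are invented for illustration, and `dominates` and `pareto_front` are hypothetical helper names, not code from the post. A TV is Pareto efficient if no other TV is at least as good on both price and quality and strictly better on one of them.

```python
from typing import List, NamedTuple


class TV(NamedTuple):
    name: str
    price: float    # lower is better
    quality: float  # higher is better


def dominates(a: TV, b: TV) -> bool:
    """True if `a` is no worse than `b` on both factors and strictly better on one."""
    no_worse = a.price <= b.price and a.quality >= b.quality
    strictly_better = a.price < b.price or a.quality > b.quality
    return no_worse and strictly_better


def pareto_front(tvs: List[TV]) -> List[TV]:
    """Keep only the TVs that no other TV dominates."""
    return [t for t in tvs if not any(dominates(o, t) for o in tvs)]


# Made-up catalogue: C costs more than B and has lower quality.
tvs = [TV("A", 300, 6), TV("B", 500, 8), TV("C", 550, 7), TV("D", 900, 9)]
front = pareto_front(tvs)
front_names = [t.name for t in front]
```

Here C drops out because B is both cheaper and better, while A, B and D all remain: choosing among them depends on how much you are willing to pay for quality.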