Friday, January 31, 2014

where I mostly write

It's been a while since I updated this blog, mainly because I've been using Facebook as a mini-blog. (You don't need to friend me in order to read my posts.)

I've also been contributing to many of the posts on Dan Morris's music theory blog.

Saturday, November 03, 2012

Hotelling's Law

Hotelling's Law is often cited as the reason politicians running against each other end up taking similar positions.  For example, look at Obama and Romney on this chart.  The classical example is two vendors selling ice cream to patrons along a boardwalk, where each patron will buy ice cream from the nearest vendor.  In this scenario, each vendor is incentivized to move right next to the other, towards the middle of the boardwalk, in order to maximize profit.

I started thinking about some extensions to this.  Namely: what happens if you add more vendors, and what happens in different spaces?  You might guess that if you add more vendors, they all cluster at the middle, but you would be wrong.

(I should add that I ran simulations but didn't work out any of the math, so it's possible some of my results here are not correct.)

Three Vendors
What happens with three vendors?  The middle vendor has no incentive to do anything and takes a random walk, but the outer vendors want to squeeze the middle one.  Once they all meet, they move around together, and don't necessarily stay at the center.

Four Vendors
It's tempting to think that four vendors would also fight over the same spot, but this did not happen in my simulations.  I instead got two vendors sitting right next to each other at 1/4, and the two others at 3/4.  Each vendor then gets 1/4 of the patrons, and none has any incentive to move.

N Vendors for Even N
This plays out just like four vendors, with two vendors at 1/N, two vendors at 3/N, two vendors at 5/N, etc., with each vendor getting 1/N of the patrons.
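
For anyone who wants to poke at the linear boardwalk claims, here is a minimal sketch. It is not the simulation behind this post; it only computes each vendor's share of patrons for a given set of positions. It assumes patrons are spread uniformly over [0, 1] and that vendors at exactly the same spot split that spot's patrons equally. The names `market_shares` and `best_deviation` are made up for illustration.

```python
from collections import Counter


def market_shares(positions):
    """Share of patrons each vendor gets on the boardwalk [0, 1].

    Assumption: patrons are uniform along the boardwalk and walk to the
    nearest vendor; vendors at the same spot split its patrons equally.
    """
    counts = Counter(positions)
    spots = sorted(counts)
    territory = {}
    for k, spot in enumerate(spots):
        # A spot's territory runs from the midpoint with its left neighbour
        # (or the boardwalk's start) to the midpoint with its right neighbour.
        left = 0.0 if k == 0 else (spots[k - 1] + spot) / 2
        right = 1.0 if k == len(spots) - 1 else (spot + spots[k + 1]) / 2
        territory[spot] = right - left
    return [territory[p] / counts[p] for p in positions]


def best_deviation(positions, i, steps=200):
    """Best share vendor i can reach by relocating to a grid point
    while every other vendor stays put."""
    best = 0.0
    for k in range(steps + 1):
        trial = list(positions)
        trial[i] = k / steps
        best = max(best, market_shares(trial)[i])
    return best


four = [0.25, 0.25, 0.75, 0.75]
print(market_shares(four))        # every vendor's share in the paired layout
print(best_deviation(four, 0))    # the most vendor 0 could get by moving
print(market_shares([0.3, 0.7]), market_shares([0.4, 0.7]))
```

In the paired layout each vendor gets 1/4, and no grid relocation does better than that (up to floating-point noise), which is consistent with the four-vendor result.  The last line shows the two-vendor incentive: stepping from 0.3 to 0.4, toward the other vendor, raises the mover's share.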

N Vendors for Odd N
This seems pretty chaotic.  I can't come up with any simple way to describe what happens, and it's not always the same.

Two Vendors, Circular Boardwalk
Now imagine a circular boardwalk.  In this scenario the vendors' positions do not matter at all, as they will split the profits equally.

Three Vendors, Circular Boardwalk
Eventually, you get a stable situation where two vendors are right next to each other serving 1/4 of the patrons each, and the other vendor is on the other side of the circle serving the other 1/2 of the patrons.

N Vendors for Even N, Circular Boardwalk
Plays out just like it does on a linear boardwalk, with pairs of vendors at N/2 equally spaced positions on the circle.
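
The circular claims can be checked the same way.  This is a rough sketch under my own assumptions: the circle has circumference 1, positions are points in [0, 1), patrons are uniform, and "right next to each other" is approximated by putting two vendors at the same point and splitting its patrons equally.  `circular_shares` is a hypothetical helper name.

```python
from collections import Counter


def circular_shares(positions):
    """Share of patrons per vendor on a circular boardwalk of circumference 1.

    Assumption: patrons are uniform and walk to the nearest vendor along the
    circle; vendors at the same point split that point's patrons equally.
    """
    wrapped = [p % 1.0 for p in positions]
    counts = Counter(wrapped)
    spots = sorted(counts)
    n = len(spots)
    territory = {}
    for k, spot in enumerate(spots):
        if n == 1:
            territory[spot] = 1.0  # everyone is at one point
            continue
        # Half of the arc to each neighbouring spot, wrapping around the circle.
        gap_before = (spot - spots[k - 1]) % 1.0
        gap_after = (spots[(k + 1) % n] - spot) % 1.0
        territory[spot] = (gap_before + gap_after) / 2
    return [territory[p] / counts[p] for p in wrapped]


print(circular_shares([0.1, 0.8]))             # two vendors, arbitrary spots
print(circular_shares([0.0, 0.0, 0.5]))        # three vendors, stable layout
print(circular_shares([0.0, 0.0, 0.5, 0.5]))   # four vendors in two pairs
```

With two vendors the two arcs always add up to the whole circle, so each vendor gets half no matter where they stand.  The three-vendor layout gives shares of 1/4, 1/4 and 1/2.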

Two Vendors, Two Dimensions
Just like in one dimension, they move to the center.

Two Vendors on a Sphere
Same as a circle: position doesn't matter.

If I get a chance I'll try to get some of these simulations running on the web somewhere.  Also, someone please address this more rigorously than I just did.

Thursday, May 17, 2012

RescueTime productivity feedback

RescueTime is a useful tool for tracking your computer time, but it would be nice to get constant real-time feedback so you can see your scores plummet as you surf Facebook all day.

I hacked together a simple Windows application that lives in the system tray and shows your RescueTime productivity, updating every minute.  You can get it here.  You'll also need to install Python and PyWin32.  You also need to enable a RescueTime API key here and then add that key on line 16 of  If you want the application to run all the time, create a shortcut and set it as a startup task.

Yeah, I didn't make it easy for you.  If I get a chance I'll put together a simplified cross-platform version.

Tuesday, November 15, 2011

don't break the only way to do something

It is impossible to stay up-to-date with all of the blogs and news sites I read.  For infrequently-updated sites with a high percentage of good stuff, RSS solves the problem.  For frequently-updated sites with a lower percentage, it does not.  I don't want to sift through hundreds or thousands of articles manually each day.

Up until a few weeks ago, I knew of exactly one solution to this problem: "sort by magic" in Google Reader, which sorted the RSS entries in my feed by, well, magic.  Unfortunately, Google's much-maligned Reader update has killed the magic.  The "sort by magic" option is still there, but they should probably change its name to "sort by angry illiterate moose".  I used to see a nice mixture of posts from all of my followed sites, where posts from infrequently-updated sites usually showed up near the top, along with only the "best" posts from frequently-updated sites.  (I don't know what "best" means, but it seemed to do just fine.)  Now, I just see hundreds of posts from TechCrunch.  This is not magic.

So, now I have zero solutions to this problem.  I can no longer follow blogs like Marginal Revolution with several posts per day.  TechCrunch is out of the question.

Any suggestions?

Sunday, October 30, 2011

does playlist order matter?

Does playlist order matter?  Many people seem to think so (see High Fidelity), but I remain unconvinced.  I was inspired to resolve this issue in my mind after seeing a poster at ISMIR 2011 evaluating several algorithms for automatic playlist generation.  It turns out that there's a free dataset containing around 30,000 actual user-created playlists, covering over 200,000 songs.

What does it mean for playlist order to matter?  There are a few simple things we could look for:

  1. Some pairs of songs should occur frequently in one order, but not the other.
  2. Some pairs of songs should occur close together (in either order) more frequently than predicted by chance alone.

There's a problem with looking at pairs of songs, though: the above dataset contains over 47 billion possible song transitions and only 30,000 playlists, containing around 20 songs on average.  Possibly worse, only 9,627 of the 545,867 pairs of consecutive songs (1.8%) appear more than once.

To address this lack-of-common-pairs issue, we can think about modeling a playlist as a list of artists rather than a list of songs, as this should increase the amount of repetition.  Now, instead of counting the number of times Even Flow by Pearl Jam comes after Rusty Cage by Soundgarden, we count the number of times any Pearl Jam song comes after any Soundgarden song.  This ameliorates the problem somewhat, as there are 128,778 consecutive artist pairs (23.6%) that occur at least twice in the dataset.
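
Here is a small sketch of the counting, on made-up playlists rather than the real dataset.  It assumes each track is an `(artist, title)` tuple, and it takes one reading of the percentages above: the fraction of consecutive-pair occurrences whose pair shows up at least twice.  The helper names are hypothetical.

```python
from collections import Counter


def consecutive_pairs(playlists, key=lambda track: track):
    """Count ordered pairs of adjacent items, after mapping tracks with `key`."""
    pairs = Counter()
    for playlist in playlists:
        items = [key(track) for track in playlist]
        pairs.update(zip(items, items[1:]))
    return pairs


def repeated_fraction(pairs):
    """Fraction of pair occurrences that belong to a pair seen at least twice."""
    total = sum(pairs.values())
    repeated = sum(n for n in pairs.values() if n >= 2)
    return repeated / total if total else 0.0


# Toy playlists, invented for illustration only.
toy_playlists = [
    [("Soundgarden", "Rusty Cage"), ("Pearl Jam", "Even Flow"), ("Weezer", "Buddy Holly")],
    [("Soundgarden", "Spoonman"), ("Pearl Jam", "Alive"), ("Radiohead", "Creep")],
]

song_pairs = consecutive_pairs(toy_playlists)
artist_pairs = consecutive_pairs(toy_playlists, key=lambda track: track[0])
print(repeated_fraction(song_pairs), repeated_fraction(artist_pairs))  # 0.0 0.5
```

No song transition repeats in the toy data, but the Soundgarden-then-Pearl Jam artist transition does, which is the effect that switching from songs to artists is meant to produce.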

Now, let's address #1: are there pairs of artists for which one is more likely to come first?  Here's a simple way to test this:
  • For each pair of artists, compute the probability that they appear in one order (say, Pearl Jam then Soundgarden) when they appear consecutively.  Then find the pairs of artists for which this probability is highest.  (It's necessary to use something akin to Bayesian Rating here, to avoid pairs of artists that appear a few times in one direction and zero in the other.)

  • Randomly shuffle each playlist and recompute the probabilities, again looking at the highest-probability pairs.  Do this several times.  The highest probabilities you see will give you a good sense of what can happen based on chance alone, since there is obviously no meaningful order to the randomly shuffled playlists.

  • Are there any pairs of artists with ordering probability higher than anything you saw when using the randomly shuffled playlists?  If so, we can say this pair of artists is likely to appear in one order over the other with some degree of significance.
It turns out that running this experiment yields no pair of artists that are significantly likely to appear in one order over the other.  There's still a possibility of some asymmetry, however — maybe Pearl Jam is likely to appear two songs before Soundgarden, or three, etc.  To address this, we can re-run the above test, but without the requirement that the songs are adjacent.  That is, we can compute the probability that Pearl Jam occurs earlier than Soundgarden in a playlist where both appear.  When we do this, we find that there's still no pair of artists with a preferred ordering.
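
The shuffle test can be sketched roughly as follows.  This is not the code I ran: plain additive smoothing (a pseudo-count of 2 in each direction, an arbitrary choice) stands in for "something akin to Bayesian Rating", playlists are assumed to be lists of artist names, and the toy data and function names are invented.  Setting `adjacent_only=False` gives the variant that counts every earlier/later pair instead of only neighbours.

```python
import random
from collections import Counter
from itertools import combinations


def ordered_pair_counts(playlists, adjacent_only=True):
    """Count how often artist a appears before artist b (a != b)."""
    counts = Counter()
    for playlist in playlists:
        if adjacent_only:
            pairs = zip(playlist, playlist[1:])
        else:
            pairs = combinations(playlist, 2)  # every (earlier, later) pair
        counts.update((a, b) for a, b in pairs if a != b)
    return counts


def smoothed_order_probs(counts, pseudo=2.0):
    """P(a before b), shrunk toward 0.5 so rare pairs cannot look extreme."""
    return {
        (a, b): (n + pseudo) / (n + counts[(b, a)] + 2 * pseudo)
        for (a, b), n in counts.items()
    }


def shuffled_baseline(playlists, adjacent_only=True, trials=20, seed=0):
    """Highest smoothed probability seen across several shuffled copies."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        shuffled = [rng.sample(p, len(p)) for p in playlists]
        probs = smoothed_order_probs(ordered_pair_counts(shuffled, adjacent_only))
        best = max(best, max(probs.values(), default=0.0))
    return best


def significant_orderings(playlists, adjacent_only=True):
    """Pairs whose ordering probability beats anything the shuffles produced."""
    probs = smoothed_order_probs(ordered_pair_counts(playlists, adjacent_only))
    baseline = shuffled_baseline(playlists, adjacent_only)
    return {pair: p for pair, p in probs.items() if p > baseline}


# Toy data: A always directly precedes B; every other ordering is balanced.
toy = 20 * [["A", "B", "C"], ["C", "A", "B"], ["C", "B"], ["A", "C"]]
print(significant_orderings(toy))
```

On the toy data only the (A, B) ordering clears the shuffled baseline.  Comparing against the maximum over shuffles is a crude significance threshold; more shuffles make it stricter.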

Let's look at #2, then: are there pairs of artists that appear next to each other more than chance would predict?  We can adapt our above experiment for this case: just compute the probability that two artists appear consecutively (in either order) when they both show up in the same playlist, and do the same for the randomly shuffled playlists.  It turns out that now there are a few pairs of artists that tend to appear consecutively with some significance (keep in mind these playlists are from the early 2000s):
  • Ben Folds Five and Barenaked Ladies
  • Radiohead and Björk
  • Blink 182 and Weezer
  • Radiohead and Pink Floyd
So, I guess I will have to acknowledge that playlist ordering is something, and not nothing.
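
For the curious, here is a rough sketch of the adjacency version of the test, again on invented data.  Assumptions: playlists are lists of artist names, and the estimate is smoothed with arbitrary pseudo-counts (1 adjacent occurrence out of 10 co-occurrences) so that pairs seen together only once cannot win.  The function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def adjacency_probs(playlists, pseudo_adjacent=1.0, pseudo_total=10.0):
    """Smoothed P(two artists are neighbours | both are in the playlist)."""
    together, adjacent = Counter(), Counter()
    for playlist in playlists:
        # Unordered pairs are stored as sorted tuples in both counters.
        together.update(combinations(sorted(set(playlist)), 2))
        neighbours = {
            tuple(sorted(pair))
            for pair in zip(playlist, playlist[1:])
            if pair[0] != pair[1]
        }
        adjacent.update(neighbours)
    return {
        pair: (adjacent[pair] + pseudo_adjacent) / (n + pseudo_total)
        for pair, n in together.items()
    }


def significant_neighbours(playlists, trials=20, seed=0):
    """Pairs that are neighbours more often than in any shuffled copy."""
    rng = random.Random(seed)
    baseline = 0.0
    for _ in range(trials):
        shuffled = [rng.sample(p, len(p)) for p in playlists]
        probs = adjacency_probs(shuffled)
        baseline = max(baseline, max(probs.values(), default=0.0))
    return {p: v for p, v in adjacency_probs(playlists).items() if v > baseline}


# Toy data: X and Y are always neighbours; the fillers differ per playlist.
toy = [["X", "Y", f"a{i}", f"b{i}", f"c{i}"] for i in range(30)]
print(significant_neighbours(toy))
```

On the toy data only the (X, Y) pair survives, since shuffling breaks up X and Y most of the time and the filler artists never co-occur often enough to overcome the smoothing.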

Monday, January 24, 2011

reverse lottery

Of course you know about the lottery, where you can pay some small amount of money for a low probability of winning a large amount of money.  And lots of businesses have promotions along the same lines: e.g., if you buy our slightly-overpriced french fries, you might win a yacht.

However, I have never seen the reverse: we're giving out free french fries, except there's a small chance you'll have to pay us $1000.  Now, a potential customer might not have $1000, so let's lower the values here to get a simpler and fairer comparison between two potential offerings:

  1. French fries cost $1.25, but one out of every five customers gets his for free.
  2. French fries are free, but one out of every five customers has to pay a $5.00 fine.
There must be a reason why businesses never offer the second type of deal.  Is it just logistical?  Or is it that people would hate this, and a 1 in 5 chance of losing $5 is worse than just giving away $1.25?  It seems that this sort of deal would not be too difficult to arrange, so I have to imagine that businesses don't offer it because people don't like it.
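
To see why this is a fair comparison, here is a quick sketch of the arithmetic for the two deals above.  The helper name `mean_and_sd` is made up; the prices and odds are the ones in the list.

```python
from math import sqrt


def mean_and_sd(outcomes):
    """Mean and standard deviation of a cost, given (probability, cost) pairs."""
    mean = sum(p * cost for p, cost in outcomes)
    variance = sum(p * (cost - mean) ** 2 for p, cost in outcomes)
    return mean, sqrt(variance)


deal_1 = [(0.8, 1.25), (0.2, 0.00)]  # pay $1.25, one in five eats free
deal_2 = [(0.8, 0.00), (0.2, 5.00)]  # free fries, one in five pays $5.00

for name, deal in [("deal 1", deal_1), ("deal 2", deal_2)]:
    mean, sd = mean_and_sd(deal)
    print(f"{name}: expected cost ${mean:.2f}, standard deviation ${sd:.2f}")
```

Both deals cost the customer $1.00 on average, but the spread is $0.50 for the first and $2.00 for the second.  So the two offers differ only in risk, not in expected price.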

This brings me to my main point: a lot of law enforcement works like this "reverse lottery", and I think we're underestimating just how much people hate this.  Our goal should be punishments that exactly offset the value of the crime, and a 100% chance of getting caught.  This makes enforcement prohibitively costly, but serves as an ideal.  (The other extreme, where you get away with criminal behavior almost all of the time, but if you get caught you and your family are tortured for 20 years then killed, sounds awful to anyone.)  Certainty itself has value, and this needs to be traded off against the cost of enforcement.

Disclaimer: I don't know anything about crime statistics, and I would hope there are experts who do know what happens as you vary the level of punishment and probability of getting caught.  I'm betting that, while keeping level of punishment times probability of getting caught constant, good things happen as punishment goes down and probability of getting caught goes up, and the only reason not to move further in this direction is cost of patrol.
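
As a purely arithmetical illustration (it says nothing about how real offenders behave), here is a sketch of what happens to the spread of outcomes when punishment times probability of getting caught is held constant.  The function name and the $100 figure are invented for the example.

```python
from math import sqrt


def penalty_spread(expected_penalty, p_caught):
    """Fine size and standard deviation of the penalty actually paid,
    holding fine * p_caught equal to expected_penalty."""
    fine = expected_penalty / p_caught
    sd = fine * sqrt(p_caught * (1 - p_caught))
    return fine, sd


for p in (0.01, 0.2, 0.5, 1.0):
    fine, sd = penalty_spread(100.0, p)
    print(f"p = {p:<4}  fine = ${fine:>9.2f}  sd = ${sd:.2f}")
```

The expected penalty is $100 in every row, but the standard deviation shrinks as the probability of getting caught rises, reaching zero at certainty.  That shrinking spread is the "certainty" being traded off against the cost of enforcement.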

This blog post arose from a discussion with Alex Jaffe.

(Another way to look at things is that in my ideal scenario, there is no crime — you simply pay for the damage you do.)

Sunday, December 26, 2010

resolutions: 2011 edition

The last time I made resolutions, it was the end of 2008.  I've since gone 2 for 4 on those resolutions, and one of them took me an extra year.  Here's what I resolve to do in the coming year (in no particular order):

  1. Take piano lessons and do at least one public performance (Rewind shows don't count).
  2. Participate in at least one submission wrestling tournament.
  3. Re-learn and improve the 540 kick (see this video).
  4. Take a dance class.
  5. Finish (version 1 of) the awesome chord-listening iPhone app I'm working on.
I think I can go 5 for 5 this time.  These are all extremely doable, like your mom.  Notice that I did not officially resolve to post more often to this blog, or learn to snowboard, or learn to ride a bike, or learn Portuguese, or learn Chinese, or read more books, or play more video games, even though these are all things I want to do.  You have to make cuts somewhere.

I also didn't include any work-related resolutions, because they are top-secret.  And because we already have a similar process, except they actually hold you to it.

Thursday, June 17, 2010

generalized bullshit

I haven't read Harry Frankfurt's 1986 philosophical treatise On Bullshit.  But here's a summary I like from its Wikipedia page:

In the essay, Frankfurt defines a theory of bullshit, defining the concept and analyzing its applications.  In particular, Frankfurt distinguishes bullshitting from lying: while the liar deliberately makes false claims, the bullshitter is simply uninterested in the truth.  Bullshitters aim primarily to impress and persuade their audiences.  While liars need to know the truth to better conceal it, bullshitters, interested solely in advancing their own agendas, have no use for the truth.  Thus, Frankfurt claims, "...bullshit is a greater enemy of the truth than lies are."
This does seem to be the defining characteristic of bullshitting - the lack of regard for truth.  And it does seem to be more dangerous than lying, for the same reason that a binary classifier with 50% accuracy is far less useful than one with 10% accuracy: you can flip every prediction of the 10% classifier and get 90% accuracy, but flipping a coin-toss classifier gets you nowhere.
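
A tiny sketch of the classifier point, assuming a two-class problem with labels 0 and 1 (the data is made up):

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
liar = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]         # wrong on all but the last item
bullshitter = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]  # right on exactly half

for name, preds in [("liar", liar), ("bullshitter", bullshitter)]:
    flipped = [1 - p for p in preds]
    print(name, accuracy(labels, preds), accuracy(labels, flipped))
```

Flipping the liar's answers turns 10% accuracy into 90%, while flipping the bullshitter's answers leaves you at 50%.  The liar's output is tied to the truth; the bullshitter's is not.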

However, bullshit can describe many things beyond statements with a truth value.  For example, we talk about bullshit jobs.  What exactly is a bullshit job?  Is a bullshit job just one that requires you to generate bullshit statements?  I think there's more to it than that.  There's a whole book on bullshit jobs, and it provides us with some useful examples: cheese artisan (sculptor of supermarket cheese), feng shui consultant, economist, aromatherapist, life coach, and so on, though some unfortunately stray from bullshit territory into mere unpleasantness (roadkill collector) or overt detriment to society (patent troll).

What do these jobs have in common with Harry Frankfurt's definition, the lack of regard for truth?  Certainly some may require explicit truth-disregarding, but I think a more fundamental similarity is the idea of a broken link to what really matters.  For bullshit statements, the link between a statement and its truth value has been severed.  Again, this is different from false statements, in which the link is present but set to "reverse".  And bullshit jobs, then, are jobs which provide no direct benefit to society, and for which the indirect benefit requires a link which has been severed or never really existed at all.

What "really matters"?  I'd say something that makes a person happy, though you can use whatever rule you want, as long as it isn't "nothing".  Or, I suppose, if you say nothing matters, then you should conclude that everything is bullshit, which might be a defensible position.  And keep in mind that there's lots of non-bullshit that is pretty far removed from mattering, like designing uniforms for train conductors, but that remains non-bullshit as long as the intermediate links hold up.  In many cases it may be difficult to test all of these intermediate links, and in fact the increasing specialization in modern society may be the cause of the proliferation of bullshit.  I'd guess that hunter-gatherer societies are able to get away with considerably less bullshit, because the links to usefulness of any activity are easily testable by anyone.

Looking at it this way, a large fraction of what we do all day is bullshit.  From personal experience, most of what happens in academia is bullshit.  99% of research papers, including (especially?) the ones by me, are bullshit, even if every single statement in them is intentionally true.  The business world, notorious for its bullshit, actually seems a little bit better.  From what I can tell, tech startups seem to have relatively low levels of bullshit.  At least, the good ones do - some have made my life noticeably better.  And things like baking and selling bread are about as far from bullshit as you can get.