Elo is a system built to deal with competitive professionals. We can safely assume that while videogames have a smattering of intelligent, highly rational and competitively skilled players, playerbases are composed largely of amateurs either aspiring to competitive play or simply trying to exist in an amateur bracket.
How Trueskill, Microsoft's matchmaking system for Xbox 360 and Windows PC, behaves varies from game to game: it is often combined with a game-specific matchmaking system and has to deal with variables that are not present in Elo (which is not hooked up to anything, but rather runs separately as a sort of tracking database in the real world). Still, it operates in more or less the same way.
This presents some problems, some of which I've seen here at Relic. I want to make it clear that this is not a complaint or a rant against Trueskill, but an attempt to point out what it's doing to players. My understanding of the system is a surface one, based on its visible effects rather than its technical details -- please feel free to correct and add to this conversation as you see fit (my flame retardant suit is always zipped up tight, so don't be afraid to poke holes).
1) It accounts for player skill based on matchups -- win against an opponent of the same Trueskill and you will gain a little skill and probably remain in your bracket. Lose against an opponent of the same Trueskill and you will lose a little skill and probably remain in your bracket. However, if the result is an upset (you beat someone well above your bracket, or lose to someone well below it), your Trueskill will change at a much higher rate.
This assumes a competitive consistency in play that many players do not practice. It is an assumption at home in Major League Baseball or chess, but not within a pool of thousands of players, ranging from teenagers to adults, who play this game in their spare time.
2) As the only way to check for skill is to look at win/loss ratio and the only way to improve is to match against a higher skill bracket, what you get is a matchmaking system that at times can simulate the absence of a matchmaking system.
What this means is that a player of base skill with zero games played will be matched against more advanced players to check their skill. The system wants to make sure it can judge the player's skill -- but the player assumes he will be matched against another base level opponent. Instead he accrues loss after loss until Trueskill has judged: this is a poor player.
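To make point 1 concrete, here is a minimal sketch of the classic Elo update rule. It is a stand-in, not Trueskill itself (Trueskill tracks an estimate plus an uncertainty for each player, which is not modelled here), and the function names, the 400-point scale and the K-factor of 32 are illustrative assumptions rather than anything taken from a shipped game.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like score Elo expects player A to earn against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k (assumed value) controls how far a single result can move a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Even matchup: the winner gains a modest amount.
    print(update(1500, 1500, 1.0))
    # Upset win against a much stronger opponent: a much larger gain.
    print(update(1500, 1900, 1.0))
    # Upset loss against a much weaker opponent: a much larger drop.
    print(update(1500, 1100, 0.0))
```

The size of the change depends only on how surprising the result was, which is exactly why one bad evening against a weaker opponent can move an inconsistent amateur further than several ordinary games.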
There is a sweet spot where Trueskill shines. Currently in Warhammer 40,000: Dawn of War II, I reside in the 25 Trueskill bracket, which is of mid-range skill (I believe it is exactly "average" if the max Trueskill value is 50). For almost a year I have only fluctuated by a couple of points up and down, i.e. down to 24, then up to 27, then down to 25. This has led to games against players of equal skill (enjoyable games), players of considerable skill (tense games) and still sometimes players of almost no skill (games in which I feel like a jerk).
What is the solution then? First we have to ask what most players want, and I think that is usually "I want to play a fair game against someone my 'own size'". This means de-emphasizing competitive matching and putting more of the choice in the player's hands.
Perhaps a player wants to play against players of a low bracket of skill; let him select the low skill bracket. If he feels he has improved, let him select medium and dream of selecting high. This does open up the potential for griefing (high skill players joining low skill matchmaking to cream unskilled players for fun), but that's a separate issue.
Of course, this is exactly like choosing difficulties in a single player game. As I write this it occurs to me that I heard a coworker discussing such a thing this morning, and it has most definitely seeped into my thoughts -- it's a good idea, and a step towards more standardization.
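As a rough sketch of what player-selected brackets could look like, the snippet below keeps one first-come, first-served queue per bracket and pairs players as soon as two are waiting in the same one. Everything in it (the `BracketQueue` class, the bracket names, the pairing rule) is hypothetical and only illustrates the idea; it does nothing to address the griefing problem mentioned above.

```python
from collections import deque

# Hypothetical bracket names chosen by the player, not computed by the system.
BRACKETS = ("low", "medium", "high")


class BracketQueue:
    """One waiting queue per self-selected skill bracket."""

    def __init__(self):
        self.queues = {bracket: deque() for bracket in BRACKETS}

    def join(self, player: str, bracket: str):
        """Add a player to their chosen bracket.

        Returns a (player_a, player_b) pair if a match can be made,
        otherwise None while the player waits.
        """
        if bracket not in self.queues:
            raise ValueError(f"unknown bracket: {bracket}")
        queue = self.queues[bracket]
        queue.append(player)
        if len(queue) >= 2:
            return queue.popleft(), queue.popleft()
        return None


if __name__ == "__main__":
    lobby = BracketQueue()
    print(lobby.join("alice", "low"))     # waits: nobody else in "low"
    print(lobby.join("bob", "high"))      # waits: different bracket
    print(lobby.join("carol", "low"))     # paired with alice
```

The only input the system needs here is the player's own choice, which is the same contract as a difficulty menu in a single player game.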
What would such a thing mean for actual competitive play? Nothing at all. When I played Soldier of Fortune 2 and Call of Duty competitively, our clan used community ladder sites and their rules, such as OGL and Team Warfare League (more pedestrian variants of CAL).
We were ranked accordingly and able to judge skill accurately by ladder position, able to work our way up to the number one spot and jostle between first and second place, fighting off challenging teams that were not good enough and losing our spot to superior teams.
There is no math behind this; rather, the social organization of the ladder lets players judge each other's skill heuristically, simply by playing one another.
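For illustration, a ladder of this kind can be modelled as nothing more than an ordered list. The rule used below (a lower-ranked winner takes the loser's spot and everyone in between drops one place) is an assumption made for the sketch, not the actual rule set of OGL or Team Warfare League, and `report_match` is a made-up name.

```python
def report_match(ladder: list, winner: str, loser: str) -> list:
    """Update a ladder in place after a match and return it.

    ladder[0] is the top spot. Assumed rule: if the winner was ranked
    below the loser, the winner takes the loser's position and the teams
    in between each drop one place. Otherwise nothing changes.
    """
    winner_pos = ladder.index(winner)
    loser_pos = ladder.index(loser)
    if winner_pos > loser_pos:
        ladder.pop(winner_pos)
        ladder.insert(loser_pos, winner)
    return ladder


if __name__ == "__main__":
    ladder = ["Alpha", "Bravo", "Charlie", "Delta"]
    report_match(ladder, winner="Delta", loser="Bravo")
    print(ladder)  # Delta now sits in second place
    report_match(ladder, winner="Alpha", loser="Delta")
    print(ladder)  # unchanged: the top team defended its spot
```

There are no ratings anywhere in this model: position is decided entirely by who has beaten whom.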