Rating System

Frontier tracks competitive skill using a per-ladder rating system. Two algorithms are available: Glicko-2 (the default) and Showdown K-Scaling. Each ladder can use a different one.

Glicko-2 was developed by Mark Glickman (2013). The algorithm tracks three values per player per ladder:

| Value | What it represents | Default |
| --- | --- | --- |
| Rating (R) | Skill estimate on a 0 to 9999 scale. | 1500 |
| Rating Deviation (RD) | Uncertainty in the rating. Lower = more confident. | 350 |
| Volatility (sigma) | Expected fluctuation in performance. | 0.06 |

After each match, the algorithm compares the expected outcome (based on both players’ ratings and RDs) against the actual result (1.0 for a win, 0.0 for a loss). Then it updates all three values for both players.
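The expected outcome can be sketched with the formulas from Glickman's published Glicko-2 description. This is an illustration, not Frontier's actual code: the function names are hypothetical, and the 1500 center and 173.7178 scale factor are the constants from the published algorithm. In those formulas the expected score depends on the opponent's RD; the player's own RD enters later, in the update step.

```python
import math

# Conversion factor between the display scale and Glicko-2's internal scale
# (constant from the published algorithm; assumed to apply to Frontier too).
GLICKO2_SCALE = 173.7178


def g(phi: float) -> float:
    """Dampening factor: pulls the prediction toward 0.5 when the opponent's RD is large."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(rating: float, opp_rating: float, opp_rd: float) -> float:
    """Expected result for the player, between 0.0 (certain loss) and 1.0 (certain win)."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


if __name__ == "__main__":
    # Equal ratings give an even match regardless of RD.
    print(expected_score(1500.0, 1500.0, 350.0))
    # A 200-point favorite against a well-established opponent.
    print(expected_score(1600.0, 1400.0, 50.0))
```

The gap between this expected score and the actual result (1.0 or 0.0) is what drives the size of the update.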

Bigger rating swings happen when:

  • The player’s RD is high (uncertain rating — the system is still learning).
  • The opponent’s RD is low (confident opponent — beating a known quantity means more).
  • The result is unexpected (upset victories move the needle fast).

Smaller rating swings happen when:

  • The player’s RD is low (well-established rating).
  • The opponent’s RD is high (uncertain opponent — less informative match).
  • The result is expected.
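The effect of each factor above can be checked numerically with a simplified single-match update. The sketch below follows the published Glicko-2 formulas but holds volatility fixed instead of running the iterative volatility step, so it is an approximation for illustration only; `rating_change` is a hypothetical helper, not part of Frontier.

```python
import math

SCALE = 173.7178  # Glicko-2 scale constant from the published algorithm


def g(phi: float) -> float:
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def rating_change(rating, rd, opp_rating, opp_rd, score, volatility=0.06):
    """Rating change for one match, with volatility held fixed (a simplification)."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
    expected = 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))
    # Estimated variance of the rating based on this single game.
    v = 1.0 / (g(phi_j) ** 2 * expected * (1.0 - expected))
    # Pre-match uncertainty, widened by volatility, then narrowed by the new evidence.
    phi_star_sq = phi ** 2 + volatility ** 2
    new_phi_sq = 1.0 / (1.0 / phi_star_sq + 1.0 / v)
    return SCALE * new_phi_sq * g(phi_j) * (score - expected)


if __name__ == "__main__":
    print("new player wins:        ", rating_change(1500, 350, 1500, 50, 1.0))
    print("established player wins:", rating_change(1500, 50, 1500, 50, 1.0))
    print("upset win:              ", rating_change(1400, 100, 1600, 100, 1.0))
    print("expected win:           ", rating_change(1600, 100, 1400, 100, 1.0))
```

Comparing the printed values shows the same ordering as the lists: a high player RD, a low opponent RD, and an unexpected result each produce a larger change.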

RD drifts upward when a player stops playing. The system gets less confident about a dormant player’s skill over time. When they return, their rating moves more aggressively until RD settles back down.

With regular play, RD drops toward rdFloor (default 50), signaling a well-established rating.
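As a rough illustration of both behaviors, the published Glicko-2 algorithm widens RD by the player's volatility for every rating period without a match. How Frontier defines a rating period, and whether it caps RD at the starting value, are not stated here, so `idle_periods` and `rd_cap` in this sketch are assumptions, as are the function names.

```python
import math

SCALE = 173.7178  # Glicko-2 scale constant from the published algorithm


def inflate_rd(rd, volatility=0.06, idle_periods=1, rd_cap=350.0):
    """Grow RD for each idle rating period; the cap at 350 is an assumed safeguard."""
    phi = rd / SCALE
    for _ in range(idle_periods):
        phi = math.sqrt(phi ** 2 + volatility ** 2)
    return min(phi * SCALE, rd_cap)


def apply_rd_floor(rd, rd_floor=50.0):
    """Keep RD from dropping below rdFloor after a match update."""
    return max(rd, rd_floor)


if __name__ == "__main__":
    for periods in (0, 1, 10, 50):
        print(periods, "idle periods ->", inflate_rd(50.0, idle_periods=periods))
    print("floored:", apply_rd_floor(42.0))
```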

rating {
    algorithm = GLICKO2
    startRating = 1500.0
    startRD = 350.0
    startVolatility = 0.06
    tau = 0.5
    rdFloor = 50.0
    minRating = 0.0
    maxRating = 9999.0
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startRating | Double | 1500.0 | Initial rating for new players. |
| startRD | Double | 350.0 | Initial rating deviation. |
| startVolatility | Double | 0.06 | Initial volatility. |
| tau | Double | 0.5 | Controls how quickly volatility changes. Higher = more reactive to upsets. Recommended range: 0.3 to 1.2. |
| rdFloor | Double | 50.0 | Minimum RD. Prevents ratings from becoming “too certain.” |
| minRating | Double | 0.0 | Floor for the rating value itself. |
| maxRating | Double | 9999.0 | Ceiling for the rating value. |

Showdown K-Scaling is a simpler alternative inspired by Pokemon Showdown’s ladder: an Elo-style rating whose K-factor shrinks as a player accumulates games.

  1. New players start with a high K-factor (kMax = 50), so their rating moves quickly.
  2. As they play more games, K decreases linearly toward kMin = 20 over kScalingGames = 20 matches.
  3. Established players have smaller rating swings, creating more stable leaderboards.

The K-factor formula:

K = kMax - (kMax - kMin) * min(gamesPlayed, kScalingGames) / kScalingGames

For a player with 10 games played (defaults): K = 50 - (50 - 20) * 10/20 = 35.
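The formula translates directly into a small function. The function and parameter names below are illustrative; only the defaults come from the configuration described here.

```python
def k_factor(games_played, k_max=50.0, k_min=20.0, k_scaling_games=20):
    """Linearly interpolate K from k_max down to k_min over k_scaling_games matches."""
    progress = min(games_played, k_scaling_games) / k_scaling_games
    return k_max - (k_max - k_min) * progress


if __name__ == "__main__":
    for games in (0, 5, 10, 20, 100):
        print(games, "games ->", k_factor(games))
```

Once `games_played` reaches `kScalingGames`, the `min(...)` term stops growing and K stays at `kMin`.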

rating {
    algorithm = SHOWDOWN_K_SCALING
    startRating = 1500.0
    startRD = 350.0
    kMax = 50.0
    kMin = 20.0
    kScalingGames = 20
    rdFloor = 50.0
    minRating = 0.0
    maxRating = 9999.0
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| kMax | Double | 50.0 | K-factor for brand-new players. Large swings early on. |
| kMin | Double | 20.0 | K-factor floor for experienced players. |
| kScalingGames | Int | 20 | Number of games over which K scales from max to min. |
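To see how these fields feed into a rating update, here is a minimal sketch that pairs the K-factor with the classic Elo expected-score formula. The 400-point logistic scale is the conventional Elo choice and is an assumption here, since the exact expected-score formula Frontier uses is not documented in this section; all function names are hypothetical.

```python
def k_factor(games_played, k_max=50.0, k_min=20.0, k_scaling_games=20):
    progress = min(games_played, k_scaling_games) / k_scaling_games
    return k_max - (k_max - k_min) * progress


def elo_expected(rating, opp_rating):
    """Conventional Elo expected score (400-point scale, assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / 400.0))


def updated_rating(rating, opp_rating, score, games_played,
                   min_rating=0.0, max_rating=9999.0):
    """Apply one result (1.0 win, 0.0 loss) and clamp to minRating/maxRating."""
    change = k_factor(games_played) * (score - elo_expected(rating, opp_rating))
    return max(min_rating, min(max_rating, rating + change))


if __name__ == "__main__":
    # Same result, different experience: the newcomer moves further.
    print("new player:        ", updated_rating(1500.0, 1500.0, 1.0, games_played=0))
    print("experienced player:", updated_rating(1500.0, 1500.0, 1.0, games_played=20))
```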

The antiBoost settings prevent rating farming through repeated matches against the same opponent. Rating changes are scaled down for repeat matchups within a time window.

antiBoost {
    windowHours = 24
    match1Multiplier = 1.0
    match2Multiplier = 0.5
    match3PlusMultiplier = 0.1
    blockSameIp = false
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| windowHours | Int | 24 | Time window for tracking repeated matchups. |
| match1Multiplier | Double | 1.0 | Rating change multiplier for the first match against an opponent. Full value. |
| match2Multiplier | Double | 0.5 | Second match in the window. Half value. |
| match3PlusMultiplier | Double | 0.1 | Third and subsequent matches. 10% value. |
| blockSameIp | Boolean | false | Block matchmaking entirely between players on the same IP. |

The multiplier applies to the rating change, not the rating itself:

actualChange = calculatedChange * antiBoostMultiplier
newRating = oldRating + actualChange
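A minimal sketch of how the multiplier could be chosen and applied, using the default values from the table. The function names and the way prior matches are stored are hypothetical, and treating a match exactly `windowHours` old as outside the window is an assumption.

```python
from datetime import datetime, timedelta


def anti_boost_multiplier(prior_matches, now, window_hours=24,
                          match1=1.0, match2=0.5, match3_plus=0.1):
    """Pick a multiplier from the number of earlier matches against this opponent in the window."""
    window_start = now - timedelta(hours=window_hours)
    recent = sum(1 for played_at in prior_matches if played_at > window_start)
    if recent == 0:
        return match1       # first match in the window: full value
    if recent == 1:
        return match2       # second match: half value
    return match3_plus      # third and later: 10% value


def apply_anti_boost(old_rating, calculated_change, multiplier):
    """Scale the rating change, not the rating itself."""
    return old_rating + calculated_change * multiplier


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0)
    history = [now - timedelta(hours=1)]  # one earlier match against this opponent
    multiplier = anti_boost_multiplier(history, now)
    print(apply_anti_boost(1500.0, 20.0, multiplier))
```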

For each player on each ladder, Frontier tracks:

| Stat | Description |
| --- | --- |
| Rating | Current rating on the ladder. |
| Rating Deviation | Current RD (uncertainty). |
| Volatility | Current volatility (Glicko-2 only). |
| Highest Rating | Peak rating ever achieved. |
| Wins / Losses | Total win and loss counts. |
| Current Win Streak | Consecutive wins (resets on loss). |
| Current Loss Streak | Consecutive losses (resets on win). |
| Best Win Streak | Longest win streak ever achieved. |
| Last Match Timestamp | When the player last played on this ladder. |

Frontier displays a confidence label in GUIs based on the player’s current RD:

| Label | RD Range | Meaning |
| --- | --- | --- |
| Stable | RD ≤ 50 | Rating is well-established. This player has played enough recent matches for a reliable estimate. |
| Uncertain | 50 < RD ≤ 100 | Rating is moderately certain. Could shift noticeably with a few matches. |
| Volatile | RD > 100 | Rating is highly uncertain or provisional. Expect large swings. |

These labels are configurable in messages.conf under rdDescriptions.
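The mapping from RD to label can be sketched as a simple threshold check. The thresholds and label text below are the defaults from the table, hard-coded for illustration; the function name is hypothetical, and assigning RD = 50 to Stable rather than Uncertain is an assumption about the boundary.

```python
def confidence_label(rd: float) -> str:
    """Map a rating deviation to the default confidence label."""
    if rd <= 50.0:
        return "Stable"
    if rd <= 100.0:
        return "Uncertain"
    return "Volatile"


if __name__ == "__main__":
    for rd in (50.0, 75.0, 350.0):
        print(rd, "->", confidence_label(rd))
```

With the default rdFloor of 50, a player shows as Stable only once RD has reached the floor.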