How to rise in college rankings?

Tomas Dvorak
8 min read · Sep 13, 2022


It is very difficult. That does not (and probably should not) stop institutions from trying: from Northeastern’s valiant effort, to Rowan’s hot sauce mailings, to Columbia’s likely misreporting, institutions go to great lengths to look good under the ranking methodology. U.S. News has a side business of helping institutions understand their ranking. Institutions advertise their achievements in the Chronicle to boost their reputation among peers. It is clear that, despite the criticism, colleges care about where they stand in the rankings.

In this post I examine the implications of the U.S. News ranking methodology for resource allocation within an institution. The ranking combines many criteria, and figuring out which criteria have the best chance of affecting the ranking is not straightforward. In particular, given the normalization used by U.S. News, the weights of each factor depend not just on the stated weight but also on the standard deviation of the factor among the ranked schools.

How persistent are college rankings?

Very. The figure below shows the percentage of institutions that stayed in the same decile as in last year’s ranking (i.e., within about 20 spots for national liberal arts colleges and about 40 spots for national universities). On average, 69 percent of national universities stay in the same decile, as do about 74 percent of national liberal arts colleges. When broken down by decile of last year’s ranking, it is clear that there is more mobility in the middle of the rankings than at the top and bottom.
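As an aside, the persistence calculation itself is simple. Below is a minimal sketch in Python, assuming a pandas DataFrame with hypothetical columns rank_last_year and rank_this_year (one row per ranked institution); it is an illustration of the idea, not my exact code.

```python
import pandas as pd

# Decile 0 is the top of the ranking. The overall share staying put is
# stayed.mean(); here we break it down by the decile of last year's ranking.
def decile_persistence(df: pd.DataFrame) -> pd.Series:
    n = len(df)
    last = (df["rank_last_year"] - 1) * 10 // n    # decile of last year's rank
    this = (df["rank_this_year"] - 1) * 10 // n    # decile of this year's rank
    stayed = last == this
    return stayed.groupby(last).mean()
```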

The reason for more mobility in the middle is that the scores that determine the ranking have a bell-shaped distribution — as shown in the figure below, there are more schools with similar scores around the middle than at the edges.

How is the score determined?

It’s complicated. However, U.S. News is pretty transparent about how scores are calculated. Since the 2021 ranking, the score has been determined by aggregating 17 different indicators ranging from peer assessment to graduation rate to faculty compensation. Since each indicator is measured in different units (e.g. graduation rate as a percent, faculty compensation in dollars), the indicators are normalized before adding them together. The normalization subtracts the mean of the indicator and divides by the standard deviation. The weighted sum of normalized indicators determines the final score. To make the score on a 0 to 100 scale, U.S. News subtracts the minimum value and divides the result by the maximum value, multiplying the result by 100, and rounding to the nearest integer.

Can U.S. News overall score be replicated?

Yes. It requires the values of each of the 17 indicators for all of the schools included in the ranking. While obtaining the values of many of the indicators is straightforward (e.g. graduation rate or peer assessment score), others require some manipulation that is in some cases intentionally obscured. For example, in calculating the class size index, U.S. News does not specify the weights assigned to shares of classes of different size. In other cases, complex calculation is required such as when test scores need to be converted to percentiles and weighted by shares of SAT and ACT takers. A big question mark is how missing values are treated. U.S. News says that many values are estimated and used in the calculation of the final score, but the estimated indicators are not reported. It is not clear if the estimated values are included in the calculation of the mean and standard deviation of each indicator.
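As one example of the manipulation involved, here is a hedged sketch of blending SAT and ACT results into a single test-score indicator. Weighting each test’s percentile by the share of submitters is my reading of the description; the percentile conversion itself would come from whatever concordance table one assumes, not from a published U.S. News table.

```python
# Blend the two tests' percentiles, weighted by the shares of submitters.
def blended_test_percentile(sat_percentile: float, act_percentile: float,
                            share_sat: float, share_act: float) -> float:
    total = share_sat + share_act
    return (sat_percentile * share_sat + act_percentile * share_act) / total

# e.g. blended_test_percentile(85, 80, share_sat=0.6, share_act=0.4) -> 83.0
```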

The chart below compares the reported scores with my replicated scores. Faithfully following U.S. News’ methodology leads to a pretty good fit.

The advantage of replicating the calculation is that we can see exactly how each indicator contributed to each school’s score. The chart below shows the weighted z-scores for each school and each indicator. Naturally, schools with high scores tend to have positive z-scores, and vice versa. Different categories of indicators are marked in different colors.
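A sketch of the decomposition behind that chart, continuing with the placeholder DataFrame and weights from above: up to the final 0-100 rescaling, each school’s score is just the sum of its weighted z-scores, so a long-format table of weighted z-scores shows each indicator’s contribution.

```python
import pandas as pd

def weighted_z_scores(indicators: pd.DataFrame, weights: dict) -> pd.DataFrame:
    """One row per (school, indicator) pair with that indicator's contribution."""
    z = (indicators - indicators.mean()) / indicators.std()
    contrib = z[list(weights)] * pd.Series(weights)        # weight each column
    long = contrib.stack().rename("weighted_z").reset_index()
    long.columns = ["school", "indicator", "weighted_z"]
    return long
```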

Do the published weights reflect the importance of the indicator?

No. The normalization of indicators imparts additional weight beyond the published weights: the lower the standard deviation of an indicator, the higher its implicit weight. What matters for the overall score is how many standard deviations an indicator moves, so a one-unit increase in an indicator with a low standard deviation has a bigger effect than a one-unit increase in an indicator with a high standard deviation.
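A toy illustration of this point, with two made-up indicators that share the same published weight but have different spreads:

```python
# The change in the weighted z-score from a one-unit improvement is
# weight / standard deviation.
weight = 0.10
for std in (2.0, 20.0):
    effect = weight * 1.0 / std
    print(f"std = {std:>4}: a one-unit improvement moves the score by {effect:.3f}")
# The low-variance indicator moves the score ten times as much (0.050 vs 0.005).
```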

It is worth noting that U.S. News’ method of aggregating indicators is not the only way of combining indicators into one score. Chris Tofallis convincingly argues that a product, rather than a sum, has many desirable properties. With a product, a one percent increase in an indicator raises the score by roughly its weight, in percentage terms; that is not the case with the sum. Worse, normalizing by standard deviation means that adding a school to the ranking can change the standard deviations and therefore the weights. This means that adding a school can unexpectedly alter the ranking of schools that should not be affected.
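A quick numerical illustration of the difference, with made-up values and weights:

```python
import numpy as np

x = np.array([80.0, 500.0])   # two indicators in different units
w = np.array([0.7, 0.3])      # weights summing to one

def additive(v):
    return (w * v).sum()       # weighted sum of raw values

def multiplicative(v):
    return np.prod(v ** w)     # weighted product (geometric) score

bumped = x.copy()
bumped[0] *= 1.01              # raise the first indicator by one percent

print((multiplicative(bumped) / multiplicative(x) - 1) * 100)  # ~0.70, i.e. the weight
print((additive(bumped) / additive(x) - 1) * 100)              # ~0.27, depends on levels
```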

The chart below shows the published weights and the effect of a ten percent increase in the value of each indicator. The ten percent change is calculated as a ten percent change from the mean of that indicator. For example, for national universities, the mean of the six-year graduation rate is 64%, thus the ten percent improvement is 6.4 percentage points. Combined with the 17.6% weight on this indicator, the effect of this improvement on the overall score is two points.
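A hedged sketch of that calculation: the only numbers below taken from the text are the 64% mean and the 17.6% weight; the standard deviation and the factor converting weighted z-score units into points on the 0-100 scale are made-up placeholders chosen so the example lands near the two points quoted above.

```python
def effect_of_ten_percent(weight, mean, std, points_per_weighted_z):
    """Change in the 0-100 score from improving an indicator by 10% of its mean."""
    change = 0.10 * mean                        # ten percent of the mean value
    delta_weighted_z = weight * change / std    # movement in weighted z-score units
    return delta_weighted_z * points_per_weighted_z

# Six-year graduation rate for national universities: 17.6% weight, 64% mean.
# std=14 and points_per_weighted_z=25 are illustrative placeholders.
print(effect_of_ten_percent(weight=0.176, mean=64, std=14, points_per_weighted_z=25))  # ~2.0
```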

The chart shows that indicators with relatively low published weights can have a substantial effect on the overall score. For example, the graduation rate performance has less than half the weight of the six-year graduation rate, but a ten percent improvement in either has the same effect on the overall score. This is because the coefficient of variation of graduation rate performance is about half of the coefficient of variation for the six-year graduation rate. The first-year student retention rate also punches above its published weight as its coefficient of variation is relatively low. As a result, improving the first-year retention rate by eight percentage points (a ten percent improvement) increases the overall score by more than one point. In contrast, spending per student has a relatively low effect on the overall score despite its ten percent published weight. This is because U.S. News takes the logarithm of spending per student, and the variation in the log of spending per student is relatively modest. Alumni giving rate is another example of an indicator that punches below its published weight — there is a lot of variation in alumni giving rate. Therefore, it takes a substantial improvement to affect the overall score.

The chart also makes clear that a ten percent improvement has different effects for a liberal arts college than for a national university. For example, there is very little variation in class size across liberal arts colleges — most of them have a very similar structure. In contrast, there is a lot more diversity in class size among national universities. This means that improving the class size index has a much bigger effect at a liberal arts college than at a national university (even though the published weight on class size is the same in both rankings).

What does this mean for resource allocation?

Focus on graduation, retention and class size. These indicators not only have relatively high published weights but also relatively low standard deviations. They punch above their published weight and naturally reinforce each other (e.g. retention is part of graduation, and class size likely affects both).

In general, the efficiency of focusing on a particular indicator depends on the balance of how many resources it takes to change the indicator and on how much total weight the criterion has in the rankings. Of course it is difficult to assess how many resources it will take to improve, say, the graduation rate, and the answers probably vary across institutions.

It is also worth noting that the additive nature of U.S. News’ methodology means that the optimal resource allocation does not necessarily focus on indicators where a school is under-performing. If an institution can improve an already high indicator by another standard deviation, it will have the same effect on the overall score as improving a low indicator by one standard deviation.
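A toy check of this property:

```python
# Under the additive formula, a one-standard-deviation improvement adds the same
# number of weighted z-score points whether the school starts low or high.
weight, std = 0.05, 10.0
for start in (30.0, 90.0):            # a low value and an already-high value
    gain = weight * ((start + std) - start) / std
    print(f"starting at {start}: gain = {gain:.2f}")
# Both print 0.05 -- the payoff does not depend on where the school starts.
```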

As a concrete example, consider money spent on faculty compensation versus financial aid. For liberal arts colleges, the standard deviation of average faculty compensation is about 20 thousand dollars per faculty member. The standard deviation of graduate indebtedness is about five thousand dollars per student. With a student-faculty ratio of 12 to 1 and students spread across four class years, roughly three students graduate per faculty member each year. Thus, it takes only 15 thousand dollars per faculty member to reduce indebtedness by one standard deviation, whereas it takes 20 thousand dollars per faculty member to increase faculty compensation by one standard deviation. However, indebtedness has only a three percent weight, whereas faculty compensation has seven percent. Thus, assuming no secondary effects on other indicators (e.g. on the percent of students with debt, or even on retention and graduation, which may well improve with more financial aid), spending on faculty compensation seems more beneficial.
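Here is the back-of-the-envelope comparison as a short calculation, using only the figures from the paragraph above (per faculty member, per year, and ignoring the secondary effects just mentioned):

```python
std_compensation = 20_000   # std. dev. of average faculty compensation, $ per faculty member
std_indebtedness = 5_000    # std. dev. of graduate indebtedness, $ per student
grads_per_faculty = 12 / 4  # 12 students per faculty member spread over four class years

cost_per_sd_compensation = std_compensation                       # $20,000 per faculty member
cost_per_sd_indebtedness = std_indebtedness * grads_per_faculty   # $15,000 per faculty member

weight_compensation = 0.07
weight_indebtedness = 0.03

# Weighted z-score points gained per dollar spent (higher is better).
print(weight_compensation / cost_per_sd_compensation)   # 3.5e-06 -> compensation wins
print(weight_indebtedness / cost_per_sd_indebtedness)   # 2.0e-06
```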

Jeongeun Kim argued that rankings set off an arms race of spending — and it is true that more resources will almost always lead to a rise in rankings. While rising in the rankings is far from straightforward, understanding how they are constructed can help institutions decide where to allocate their existing resources.


Written by Tomas Dvorak

I am a Professor of Economics at Union College in Schenectady, NY. I spent my last sabbatical on the data science team at a local health insurer.
