Note: This post was written in response to a prompt from Brenda Romero and Ian Schreiber asking designers to share some of their favorite techniques for working in Excel. Check out the links to the original thread(s) for other useful pointers.
[Cross-posted from jameystevenson.com]
I have done a fair amount of economy design using Excel, and by far the most common activity I need to perform when configuring and tuning a game economy inside of a spreadsheet is to set up a particular data column so that each individual cell value is automatically calculated based on a formula that factors in one or more external variables (for instance, calculating the cost of each weapon based on a combination of how much damage it deals and how much space it takes up in the player's inventory).
Typically, the variables that are used for the various factors in these formulas are other properties of the same object, with each individual object occupying its own row within a table and the values for other properties of an object being located in the other columns of the corresponding row.
At a high level, these are the techniques I use in Excel to accomplish this common task in a way that keeps my economy organized and ensures that it is easy to maintain and update.
There are three general categories of columns in my economy spreadsheets. I keep the top row of each column in the economy sheet color-coded to indicate whether the values in that column are intended to be edited directly, or whether they are based on a formula. The three types of columns are:
- Green - Values in these columns are intended to be set manually, and are likely to require changes more often than other columns.
- Yellow - These columns contain equations that get applied to a normalized range of values (typically contained in the neighboring column just to the left) to alter the distribution of those values within a range. These formulas are more likely to require adjustments than other formula types, but still tend to change less frequently than the values in the green columns.
- White - Values in these columns are calculated based on formulas. These formulas are the least likely to require direct changes, because the values they produce can typically be controlled via the yellow distribution columns, the associated control variables (i.e. min-max ranges), and the factor weights found in the global variables table. Therefore, the formulas in these columns should only be altered as a last resort.
Normalized Value Ranges
For columns containing data that are either affected by or likely to affect the values in another column, I add new entries into my global variables table to store the min and max values in that column. Then, I add a new column to store the normalized values for each corresponding cell in the original column.
For example, say that Column D is a green column in the Weapons table storing the amount of damage each weapon inflicts. After filling out these values manually, I would add a new, named variable to my global variables table called 'WeaponDamageMax' and set it to "= MAX(Weapons[D:D])", along with a variable called 'WeaponDamageMin' set to "= MIN(Weapons[D:D])".
I would then create a new column in the Weapons table, Column E, to store the normalized damage values using the formula
"= (D1 - WeaponDamageMin) / (WeaponDamageMax - WeaponDamageMin)".
As mentioned above, I use yellow headings to denote a special type of column that I use to apply equations that take normalized data as input, and perform some mathematical operation in order to alter the distribution of values across the normalized range. This is done as a way to assert more control over the value spread and provide a method for breaking up clusters of values which can commonly occur.
For example, if a designer adds a new, ultimate weapon that deals 5,000,000 damage and rest of the weapon damage values in Column D are in the 1-200 range, the single out-of-proportion outlier value would ordinarily skew the other values into the lower portion of our 0-1 range when they are divided by WeaponDamageMax in order to produce the normalized damage values in Column E. This may be desirable in some cases, but there are other times when you want to break up those data clusters and spread them out more evenly across the entire range.
This is where distribution equations come in handy. Normalized values can easily be transformed to achieve the desired value spread by applying a distribution curve to the existing data. In this case, we would create a new yellow column, Column F, to store the adjusted values from Column D.
Although this technique is powerful and flexible, visualizing and generating the curve equations is much simpler with some external assistance. These tasks can be performed within Excel, but I personally prefer to use the following tools in order to visualize and alter my distribution equations:
Visualization - Graphing calculator
- Zoom in to the 0-1 range
- Copy formula from curve adjustment column into "y1 =" box
- Replace the normalized values column ID with "x"
- Press the "graph" button to see the adjustment curve
Alteration - Easing function generator
- Move the control points until the curve looks like you want it to
- In the function box, copy the portion after "b+c*" on the last line of the function (the part in parentheses)
- Paste the copied equation into the curve adjustment column
- Replace all instances of "t" with normalized values column ID (nvcID), then all instances of "ts" with nvcID^2, and all "tc" with nvcID^3
In the case of the normalized weapon damage values in Column E, we might try to stretch the value cluster in the lower portion of our range out using an initial formula for Column F like
"= (-6.885*(E1^3)*(E1^2) + 19.885*(E1^2)*(E1^2) + -18.395*(E1^3) + 3.825*(E1^2) + 2.57*E1)".
This equation was originally obtained using the graphical easing function generator listed above, and can be tweaked and adjusted as needed until the distribution is suitably even. Again, this formula would be applied to the rest of the rows in Column F using the typical method of dragging the selected formula cell down along the rest of the cells in the column to increment E1 accordingly for each row.
Whatever distribution equation you apply, be sure that the resultant values remain within the 0-1 range.
Range Control Variables
For values that are going to be automatically calculated based on a formula (i.e. the cost of each weapon in this example), I find it helpful to be able to directly control the minimum and maximum values of that property in order to keep things bounded within a desired range. To this end, I add adjustable min/max variables that correspond to each of the white, formula-based property columns that are populated with automatically generated values.
In the case of weapon cost, I would add two new named variables within the global variables table: WeaponCostMin and WeaponCostMax. These can be manually changed at any time - for instance, if the WeaponCostMax is 90,000 but the player can only collect a maximum of 50,000 coins, you could easily bump down the maximum end of the range by setting WeaponCostMax to 50,000 to ensure that all weapons can be purchased at some point during the game.
Formula-based property values can be calculated using one or more input values, which I refer to as the "factors" that contribute toward determining the final, generated value. In the example of calculating the cost of a weapon, the factors are:
- The amount of damage the weapon deals (higher damage = higher cost)
- The amount of inventory space the weapon requires (more space = lower cost)
At this stage, we assume that each individual factor has already been normalized and adjusted using a distribution equation (i.e. Column D held the original weapon damage values, Column E held the normalized weapon damage values, and Column F holds the final, distribution-adjusted weapon damage values that will be used in our cost determination formula).
Let's assume that our final weapon damage values are in the yellow Column F, and our final inventory space requirement values are stored in yellow Column U. We now want to fill in a formula for white Column C, which will store the automatically calculated purchase cost for each weapon. One of the simplest and most straightforward formulas that would take both of our desired factors into account would be to perform a basic average of the two factors by setting Column C to
"= WeaponCostMin + ((WeaponCostMax - WeaponCostMin) * ((F1 + (1.0 - U1)) / 2.0))".
Note that, once again, we'll need to drag the formula down after entering it to apply it to the rest of the cells in Column C.
This will produce a cost value for each weapon that falls within our specified min-max range, and is based on the corresponding amount of damage and inventory space requirements of that weapon. However, it also means that both of these factors are equally influential in determining the final cost, which is often undesirable when basing a calculated value on multiple input factors.
A quick, easy and flexible solution to this issue is to convert your equation into a weighted mean, and add a manually configurable weight value for each factor to the global variables table. This allows you to adjust the relative influence of each individual factor at will.
In this case, we would add two new named variables to the global variables table: WeaponCost_DamageFactorWeight and WeaponCost_SpaceFactorWeight. These values are just relative integer weights, so if I want weapon damage to be more influential than the weapon's inventory space requirement with regard to determining the weapon's cost, I could try setting WeaponCost_DamageFactorWeight to "= 5" and WeaponCost_SpaceFactorWeight to "= 1".
Finally, our updated formula for calculating the weapon cost in Column C using a weighted mean would look like this:
"= WeaponCostMin + ((WeaponCostMax - WeaponCostMin) * (((F1 * WeaponCost_DamageFactorWeight) + ((1.0 - U1) * WeaponCost_SpaceFactorWeight)) / (WeaponCost_DamageFactorWeight + WeaponCost_SpaceFactorWeight)))"
Note that the weighted mean equation only works if the input values for damage and inventory space are constrained within the 0-1 range, so this is yet another reason to normalize our input data before performing any additional operations on it.
The method outlined above is by no means perfect, but I am sharing it anyway in the hope that at least some aspect of it will prove useful to others. Although these techniques are fairly rudimentary, there is always room to simplify and I would welcome suggestions from other designers regarding how this approach could be further streamlined.
For instance, I have found that the process of adjusting the distribution curves can be a stumbling block for designers attempting to put these techniques into practice without much prior experience with game balancing. If you find that working with curve equations is too cumbersome, you may wish to explore alternative methods such as linear segment graphs instead. For an example of utilizing a linear segment graph for balancing, check out Wolfgang Graebner's detailed, reverse-engineered analysis of the time monetization formulas used in Clash of Clans.