Anthem | BioWare | Electronic Arts

Psychobabble: Using psychology to design the perfect loot system

Or: how to reward players for their investment.

By William Johanson @whatwillplay Mar 11, 2019 16:00 PM
One of the biggest trends in modern gaming right now is the "loot shooter." The genre is an evolution of the role-playing game that seeks to bring some of the more typical RPG elements into a shooter: character customization, experience and leveling systems, and treasure. Lots and lots of treasure.
Some of the most popular game communities are those playing loot shooters: Destiny, The Division, Warframe, and Anthem. (Honorable mention here is Borderlands, but the world is looking out for the third installment at this point.)

If you were instantly triggered by the mere mention of one of those titles, you're not alone. The gaming world seems to have been turned upside down by the current loot debacle surrounding Anthem, BioWare's most recent role-playing game and its first foray into the semi-massive multiplayer online realm.
(If you are unfamiliar with Anthem, think of it as a Mass Effect: Andromeda 2.0 - community outrage included. If you are unfamiliar with Mass Effect: Andromeda, and the rest of the Mass Effect franchise, think of it as The Division in Iron Man suits. If you're unfamiliar with The Division ... it's Diablo with guns. Hopefully you know what Diablo is.)

Twice since Anthem's release (February 22, because the release schedule was another source of contention), BioWare seems to have identified and "fixed" bugs that were inadvertently increasing item drops. For BioWare, the fix probably seemed like a no-brainer - there was something in the code having an unintended outcome on the state of their game. But from the players' point of view, this was a harsh and personal punishment. And it happened twice.
There have been many posts and articles, both ridiculing BioWare and offering guidance or advice on how to "solve the loot problem." Travis Day, who worked on Diablo 3 and is widely credited with saving that game by helping remodel its loot system, wrote a post on Anthem's subreddit addressing many of the concerns he had regarding Anthem's reward structure. But the obstacles BioWare has before it are ones that have plagued multiplayer role-playing games since ... well, forever. And if there is a game out there that has done it right, it's not being talked about.

Reinforcement - psychology's "reward for doing a thing"



Many of us are familiar with the concept of classical conditioning thanks to Ivan Pavlov's research with salivating dogs (which, taken out of context, sounds very strange if you are unfamiliar with it). However, few are familiar with Edward Thorndike and B. F. Skinner and their theories on operant conditioning.

In developmental and behavioral psychology, operant conditioning is a method of teaching/learning that is typically utilized to help stamp out problem behaviors - effectively, school-age children that exhibit behaviors that interfere with their own or others' learning. While it can seem cold and scientific in a way that makes the child seem like a test rat or guinea pig, when operant conditioning is used correctly, it can be effective - and fun.

There are two basic concepts in operant conditioning. The first, and the one most of us are probably familiar with, is the idea of reinforcement vs. punishment. In its simplest form, reinforcement is something good, something that we want to gain, and punishment is the opposite, something painful or harmful, something we want to avoid.
In the loot shooter, loot is reinforcement.

The second concept is the idea of positive vs. negative. We use these terms colloquially in a way that can sometimes make it difficult to understand how it's used in the sciences. Positive is something that is added or introduced. For example, when someone gets an MRI, if it comes back positive, that means the MRI picked up on something that probably should not be there. A positive MRI is probably a bad MRI.

Likewise, negative means something that is removed or taken away. Using the MRI example again, should you get results back that are negative, that is very likely a good sign - "normal" MRIs are negative, meaning nothing out of the ordinary has been spotted.

Apply the same terms to reinforcement and punishment, and you end up with the four pillars of operant conditioning: positive reinforcement, negative reinforcement, positive punishment, and negative punishment.


Positive, negative, positive - wait, what?



Let's explore these terms in a little more detail. That should help us identify and define some of the problems and possible solutions for our loot shooter dilemma.

Positive and Negative Punishment

Punishment aims to reduce a behavior and is often viewed as "bad." Not only does the one getting punished not appreciate it (usually, but that's a topic for another day), it can have some serious negative side-effects. In the child-raising world, punishment is very common (it was common for me, and it was probably common for you). One of the reasons for this is that it's such an easy method of conditioning to implement. Punishment can be instant and extremely effective.

As a child, if I was misbehaving, I might have had my toys taken away from me. This would be referred to as negative punishment - something I like and enjoy is being removed. If I was really acting up, or just putting on a show in front of some of my parents' guests, I might have been spanked. This would be referred to as positive punishment - something I did not like and definitely did not enjoy being introduced.

In video game terms, punishment is commonplace and typically involves what could be considered "natural consequence" (for a virtual game, anyway). For example, if you don't move your character away from a glowing, fiery area on the ground, you might explode. You have been punished for your inaction!
Games like Dark Souls require players to collect items that they run the risk of losing if they aren't careful. This is a form of negative punishment - sometimes being greedy or sloppy can result in having to start over.

Very rarely do games introduce punishment mechanics for loot. Though, it is worth mentioning that old-school Diablo players might remember having their items fall to the ground whenever they were knocked out and sent back to town. This would be a form of negative punishment.

Positive and Negative Reinforcement

Reinforcement is the polar opposite of punishment. It aims to increase a behavior and is often regarded as "good." Reinforcement in the "real world" can be much more difficult to do right, especially during the years of childhood. For us adults, a "natural" form of reinforcement is our paycheck, but even that example has its limitations.
Negative reinforcement is the removal of something we don't want or like. In video games, the most common version of this is what's known as debuffs. Without going off on a whole tangent about what debuffs are, how they work, etc., suffice it to say that a debuff is like a sickness or an ailment that players usually use items or perform actions to remove.

Positive reinforcement is the addition of something we want or enjoy. Using the paycheck example from before, timesheets and hourly compensation are the most accurate version of this, as the amount earned can be directly tied to the work (or behavior) done. In video games, positive reinforcement is experience points, items, progression - most of what we identify as a video game is a network of positive reinforcement systems living off one another. The truth is, it's probably why video games are so popular (again, a topic for another day).

Schedules of reinforcement



According to the theory of operant conditioning, there are several ways in which positive reinforcement can effectively be delivered. This is the part where the psychology portion can get a bit out of hand. For the purposes of this article, we're going to try to keep it simple, because there is something very specific about these schedules that applies directly to video games - and the loot shooters in particular.

In general, reinforcement (and in our case, we are focusing primarily on positive reinforcement) can be delivered at different frequencies, which can be further broken down into "how much time has passed" and "how many times has the target behavior been performed."

In loot shooters, the most common reinforcement schedule is a type of "variable-ratio schedule." The variable-ratio schedule is a method of reinforcement where the player is rewarded after performing a particular behavior (e.g., killing a boss) an unpredictable number of times. The first reward might drop after the first kill of a monster, for example, but then not again until the fifth kill, and then not again until the seventh kill.
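To make the idea concrete, here is a minimal Python sketch of a variable-ratio schedule. It assumes every kill is an independent roll against a fixed drop chance (strictly speaking, a "random-ratio" flavor of the schedule); the function names and the 25% chance are made up for illustration and are not taken from any actual game.

```python
import random


def kills_until_drop(drop_chance: float, rng: random.Random) -> int:
    """Count kills up to and including the first one that drops loot."""
    kills = 1
    # rng.random() returns a float in [0, 1); a roll below drop_chance is a drop.
    while rng.random() >= drop_chance:
        kills += 1
    return kills


def simulate_gaps(drop_chance: float, num_drops: int, seed: int = 0) -> list[int]:
    """Return how many kills each successive drop took (hypothetical helper)."""
    rng = random.Random(seed)
    return [kills_until_drop(drop_chance, rng) for _ in range(num_drops)]


# Eight drops at an assumed 25% chance per kill.
gaps = simulate_gaps(drop_chance=0.25, num_drops=8)
print(gaps)
```

The printed list is the number of kills each reward took. The average settles near four over a long run, but any individual gap can be much shorter or much longer, which is exactly the unpredictability the schedule is named for.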

The problem with this schedule is two-fold. First, it can be frustrating, and can actually have the opposite effect on the behavior. If your goal as a game developer is to keep players coming back for more, then a system that is unpredictable and has the potential to go long periods of time without delivering a reward can start to feel like a system of punishment.

Second, the way variable-ratio scheduling is implemented in video games does not take the individual players into consideration. Instead, it looks at the playerbase as a single entity and assumes that what is good for the community of players is good for the individual player. For example, if a monster has a 1% chance to reward a player with a legendary sword, that 1% is calculated through simulating extensive trials. For any one player to "predictably" get that piece of loot (assuming life and math are fair, which they aren't), they would need to slay that monster one hundred times on average. But if one hundred players did it, one person could walk away from their first battle with "sweet loot" and the experiment could be considered a success.

In theory, this could be a simple and effective reinforcement schedule for a multiplayer game with thousands of people playing and even ways to share items with one another, but for individual players that are usually looking out for themselves first, this could be a reason to leave.
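The arithmetic behind the 1% example is worth spelling out. The sketch below assumes each kill is an independent 1% roll, which is a simplification and not a claim about how any particular game rolls its loot; the function names are hypothetical.

```python
import math


def chance_of_at_least_one(drop_chance: float, kills: int) -> float:
    """Probability of seeing the item at least once in `kills` independent tries."""
    return 1.0 - (1.0 - drop_chance) ** kills


def kills_for_confidence(drop_chance: float, target: float) -> int:
    """Smallest number of kills that gives at least a `target` chance of one drop."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - drop_chance))


p = 0.01
print(f"{chance_of_at_least_one(p, 100):.1%}")  # about 63.4%
print(kills_for_confidence(p, 0.50))            # 69
print(kills_for_confidence(p, 0.95))            # 299
```

In other words, under these assumptions more than a third of the players who put in the "expected" hundred kills still walk away empty-handed, and being 95% sure of the drop takes nearly three hundred.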

Token economy - take my money!



Some multiplayer games have begun implementing "token economies." This is another type of reinforcement schedule that typically takes the form of "fixed-ratio scheduling." In a fixed-ratio schedule, players would be reinforced for a certain number of times they perform a particular action. To use the previous example of fighting monsters, in a fixed-ratio schedule that uses a ratio of 1:1 (which is what a token economy typically does), every time the player defeats the monster, they might get a medal. Then, let's say after collecting 25 medals, the player could cash them in for the "sweet loot" they've been after.

Token economies are flexible and help eliminate the sense that randomness could be "unfair" to individual players. Certain difficult monsters, for example, could reward more tokens for a job well done. The tokens could also be specific to particular areas, shops, or dungeons (or seasons, downloadable content, etc.), allowing developers to guide the playerbase into certain activities or events.
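A token economy is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the 25-medal price comes from the example above, while the class name, the method names, and the five-medal payout for a tougher monster are invented here.

```python
from dataclasses import dataclass

MEDALS_PER_KILL = 1   # the 1:1 ratio from the example
LOOT_PRICE = 25       # medals needed for the "sweet loot"


@dataclass
class Wallet:
    """Tracks one player's medals (hypothetical structure)."""
    medals: int = 0

    def record_kill(self, medals_awarded: int = MEDALS_PER_KILL) -> None:
        # Harder monsters can simply pass a larger medals_awarded value.
        self.medals += medals_awarded

    def redeem(self, price: int = LOOT_PRICE) -> bool:
        """Spend medals on the item; return False if the player cannot afford it."""
        if self.medals < price:
            return False
        self.medals -= price
        return True


wallet = Wallet()
for _ in range(20):
    wallet.record_kill()               # twenty ordinary monsters
wallet.record_kill(medals_awarded=5)   # one tough monster, assumed to pay five
print(wallet.redeem())                 # True: 20 + 5 medals covers the price
print(wallet.medals)                   # 0
```

Unlike the random drop, the player always knows exactly how far away the reward is, which is both the strength of the system and, as discussed next, its weakness.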

There are two major complaints against token economies, though. The first isn't necessarily a problem with the system, but rather its implementation. Token economies, just like the real world, can suffer from inflation and confusion. Over time, particular tokens or token sub-systems might lose their value, whether as a result of new content being introduced (which is a critical, non-negotiable component of online gaming), or an evolving playerbase (they get smarter and stronger, identifying weaknesses in systems and in battle encounters that allow them to gather tokens more and more efficiently).

The other complaint is psychological in nature, and perhaps the number one reason why a token economy alone cannot solve the loot problem. Token economies take away a sense of chance or luck that many could argue is necessary for keeping the mysticism of the game alive. Tokens turn an activity that could be fun and exciting ("Will it drop for me this time?") into a job, shattering the fantasy element and creating a grind many players are trying to escape from in their daily "outside" lives.

The perfect loot system CAN exist!



If you have played World of WarCraft at any point, especially during the Wrath of the Lich King expansion, you may have recognized that they used both variable-ratio and fixed-ratio schedules to reinforce players. Most quests in World of WarCraft utilized a variable-ratio schedule when looking for "Panther Pelts" or "Murloc Fins" or whatever other monster trash. At the raiding level, Blizzard implemented a token economy ("Badge of Valor") to help offset some players' "bad luck."

The problem is that the game had an identity problem as a result of trying to use two reinforcement schedules to solve a singular problem: how to make the hunt for loot feel more rewarding.

The main reason BioWare's Anthem is catching so much flak right now is that the item drop rates are so low - which, based on the variable-ratio schedule, means that some players could be seeing a lot of items and others could be seeing hardly any, resulting in an average that might simulate well, but leaves players fuming.
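A quick simulation shows how an average can "simulate well" while individual players fume. Everything here is assumed for illustration (1,000 players, 200 kills each, a 1% drop chance, independent rolls); none of these numbers are Anthem's actual values.

```python
import random
from statistics import mean


def drops_per_player(players: int, kills_each: int, drop_chance: float,
                     seed: int = 0) -> list[int]:
    """Simulate how many drops each player sees (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < drop_chance for _ in range(kills_each))
        for _ in range(players)
    ]


drops = drops_per_player(players=1000, kills_each=200, drop_chance=0.01)
print("average drops per player:", mean(drops))
print("luckiest player:", max(drops))
print("players with nothing at all:", drops.count(0))
```

The average lands close to the two drops a designer would expect from 200 kills at 1%, yet a sizable group of players ends up with zero while a lucky few get several times the average.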

Diablo's "Loot 2.0" remodel solved its problem by increasing drop rates and reducing the amount of "wasted" statistics (points on the items that might be useless to specific characters, or were just too low to pay any mind). Those most vocal on Anthem's subreddit are demanding drop rate increases, going so far as to protest the game by turning it off en masse.

This would help alleviate some of the stress of players feeling as though they are wasting their time, but in terms of psychology, there is a better way.

Ideally, developers might want to consider a modified form of variable-ratio scheduling. Developers want players to keep coming back to play their games, but too often they cater to compulsion rather than desire. Many players will feel as though they have to keep playing and searching for a particular item because of the psychology of sunk cost - they have already invested so much time and energy looking for it that they might as well keep up the search, because it has to show up at some point, right?

Wrong. Variable-ratio scheduling does not take player investment into consideration. And that is where Anthem and many other role-playing and loot shooter games are failing. Players must be rewarded for their investment, and their behaviors/actions must be reinforced, or the playerbase will move on from fatigue.

If you code it, they will play.



If Anthem truly wants to create the perfect loot system, it must find a way to account for player investment.

Variable-ratio scheduling is the most powerful method of reinforcement for modifying behavior. Case in point, it's the basic principle behind casino slot machines (and video game loot boxes). But we aren't trying to breed gamblers, we are trying to create a positive gaming experience - and a positive gaming culture, especially if "positive" means "rewarding." And we can do this by compounding it with a fixed-ratio schedule.

Anthem's loot system can be modified in one of two ways, or both if they are feeling adventurous:

(1) To account for and reward player investment, player behaviors can be used to adjust loot drop rates. For example, in Anthem, completing World Events could add an additional fraction of a percentage chance of getting a good (Masterwork or Legendary) item drop. Also, each enemy killed while in Freeplay could increase the chance. Once a good (Masterwork or Legendary) item drops, the counter can reset.
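Option (1) could be sketched roughly as follows. This is not Anthem's code: the class name and every number (base chance, bonus per kill, bonus per World Event) are placeholders chosen only to show the mechanic of a chance that grows with investment and resets on a drop.

```python
import random


class InvestmentLoot:
    """Drop chance that grows with player investment and resets after a drop."""

    def __init__(self, base_chance=0.01, per_kill=0.0005, per_event=0.005, rng=None):
        self.base_chance = base_chance
        self.per_kill = per_kill      # assumed bonus for each Freeplay kill
        self.per_event = per_event    # assumed bonus for each World Event
        self.bonus = 0.0
        self.rng = rng or random.Random()

    @property
    def chance(self) -> float:
        # Never exceed a 100% chance, however much bonus has built up.
        return min(1.0, self.base_chance + self.bonus)

    def record_kill(self) -> None:
        self.bonus += self.per_kill

    def record_world_event(self) -> None:
        self.bonus += self.per_event

    def roll(self) -> bool:
        """Roll for a good item; a successful drop resets the counter."""
        dropped = self.rng.random() < self.chance
        if dropped:
            self.bonus = 0.0
        return dropped


loot = InvestmentLoot(rng=random.Random(7))
for _ in range(40):
    loot.record_kill()
loot.record_world_event()
print(f"{loot.chance:.3f}")  # 0.035: base 0.01 + 40 kills * 0.0005 + one event 0.005
```

The roll is still random, so the "Will it drop for me this time?" excitement survives, but a long dry streak now steadily improves the odds instead of leaving them unchanged.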

(2) To reward players for completing certain challenges, it makes sense to utilize a form of fixed-ratio scheduling. Anthem already has a version of guaranteed loot drops (from bosses, from Legendary Contracts), but the current iteration forces players into activities they might not like just to find a particular item. Players want to enjoy the game their own way, at their own pace, and not feel forced to do something they don't enjoy. Instead, guaranteed item drops should be more diverse, and they need to better match the challenge. For example, allowing the Tyrant Mine boss or Legendary Contracts to reward any type of Masterwork or Legendary item would mean players can decide on their own terms what activity they would like to do.

As it stands now, if a player doesn't need an ability upgrade, there's almost no reason for them to do a Stronghold. Likewise, if they don't need a component, they might altogether avoid Legendary Contracts. This is the opposite of what BioWare is probably trying to accomplish. The best way to "control" these drops would be to put limits on how many times the activity can be run or how many guaranteed loot drops there are in a day/week. It could be that players can get an automatic Masterwork/Legendary drop up to three times a day from any Stronghold boss, and then be subjected to the variable-ratio schedule discussed earlier (as an example). It could also be that each Stronghold will drop one guaranteed item a day, giving players incentive to venture into each Stronghold, as well as log on daily to play with others (as another example).
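The "three guaranteed drops a day, then back to chance" idea might look something like the sketch below. The cap of three comes from the example above; the 5% fallback chance, the class name, and the way days are passed in as plain integers are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


class DailyGuarantee:
    """Guaranteed drops up to a daily cap, then a random fallback roll."""

    def __init__(self, guaranteed_per_day=3, fallback_chance=0.05, rng=None):
        self.guaranteed_per_day = guaranteed_per_day
        self.fallback_chance = fallback_chance   # assumed variable-ratio chance
        self.claimed = defaultdict(int)          # (player, day) -> guaranteed drops used
        self.rng = rng or random.Random()

    def boss_kill(self, player: str, day: int) -> bool:
        """Return True if this Stronghold boss kill awards a good item."""
        key = (player, day)
        if self.claimed[key] < self.guaranteed_per_day:
            self.claimed[key] += 1
            return True
        # Cap reached for today: fall back to an ordinary random roll.
        return self.rng.random() < self.fallback_chance


# With the fallback switched off, the daily cap is easy to see.
rewards = DailyGuarantee(fallback_chance=0.0)
print([rewards.boss_kill("freelancer", day=1) for _ in range(4)])
# [True, True, True, False]
print(rewards.boss_kill("freelancer", day=2))  # True: the cap resets on a new day
```

Keying the counter on the Stronghold as well as the player and day would give the other variant mentioned above, one guaranteed item per Stronghold per day.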

There are many ways this form of reinforcement scheduling can be implemented, but it must have some sort of guarantee at the end of an event or challenge that is supposed to be challenging - otherwise, why would anyone bother?

Fin.



Please keep in mind that reinforcement scheduling can get really complex - fun, wacky, but also out of hand. For BioWare's sake, keeping it simple for themselves first and foremost should be the key. That way they can continue to monitor and measure their system easily, and make adjustments as needed.

It should also be simple so that the players can understand it, at least on the surface. If frustration is the quickest way to drive away your players, then the quickest way to frustrate is to throw things at them that are too complicated to understand or figure out.

At the end of the day, players want to have fun, not work, and the easiest way to implement fun in a video game is by creating a sense of player agency that is immediate and fulfilling. In short, players want to get something back for what they are putting in, and there's a better way of doing it than simply flipping the "loot shower" switch.