
Damaging move or status move: that is the question

 3 years ago
source link: http://www.zhengwenjie.net/pokemon/

Introduction

In every main-series Pokemon game (i.e., from Red/Green/Blue/Yellow to Sun/Moon), at the very beginning of the game, once the player gets his very first Pokemon (e.g., Pikachu), he is invited to battle his rival (an AI-controlled NPC), who has also just acquired his first Pokemon (e.g., Eevee). The result of this battle is not critical, but the player gets a bonus if he wins.

In this battle, the player’s Pokemon has a damaging move and a status move. The former deals damage, while the latter changes the statistics (aka stats) of a Pokemon (e.g., raising its own attack/defense/evasion, or reducing the opponent’s attack/defense/accuracy). The rival’s Pokemon has similar moves. A natural question that comes to every player’s mind is whether it is better to spam the damaging move or to look for a smarter tactic. This article uses optimal stopping theory to answer this classic question.

Solution

If the status moves of both Pokemon cancel each other, the status move is useless: if the player’s optimal tactic is to use the status move, then the rival’s optimal tactic will surely be to use his own status move to cancel the benefit. Luckily, the rival’s AI is not very intelligent, so it is reasonable to suppose that the rival spams the damaging move and never uses the status move. If we can beat the rival’s damaging-move spam, then we are likely to beat many of the rival’s other playing styles. Consequently, the original question becomes another one: facing such a rival, can the player win with the help of the status move when he cannot win by simply spamming the damaging move? The answer is yes: the status move does help in winning the battle.

Let $P$ be the probability that the player wins the battle, which is a function of both Pokemon’s statistics $(S_1, S_2)$ and the player’s tactic $A$. The player’s task is to solve the optimization problem

$$\max_{A \in \mathcal{A}} P(S_1, S_2, A),$$

where $\mathcal{A} = \{\text{all tactics}\}$.

Reason tells us that there is no point in using the status move once one has started using the damaging move. Otherwise, one could simply swap the order of these two moves, which would increase the objective function value. Therefore, the above optimization problem is equivalent to

$$\max_{A \in \mathcal{B}} P(S_1, S_2, A),$$

where $\mathcal{B} = \{\text{all tactics that stop using status moves before using damaging moves}\}$. This formulation is exactly an optimal stopping problem: when should the player stop using status moves and start using damaging moves?

In this article, I will talk about three cases with different types of status moves:

  1. raising self’s attack or reducing the rival’s defense,
  2. raising self’s defense or reducing the rival’s attack,
  3. raising self’s evasion or reducing the rival’s accuracy.

Without loss of generality, we can suppose that

  1. both Pokemon have equal initial statistics, except for the Hit Points (HP), which are denoted as $S_1$ for the player and $S_2$ for the rival;
  2. each successful damaging move will reduce the HP by exactly 1;
  3. the player plays first.

Raising self’s attack / reducing the rival’s defense

Every status move can be used 6 times (6 stages). The buff or debuff does not grow proportionally with the stage. For the buff, the stage multipliers from +1 to +6 are 3/2, 4/2, 5/2, 6/2, 7/2, 8/2. For the debuff, the stage multipliers from -1 to -6 are 2/3, 2/4, 2/5, 2/6, 2/7, 2/8. For example, if a Pokemon is at +1 stage on attack, it deals one and a half times the original damage; at +2 stages, it deals twice the original damage; at -1 stage, it deals only 2/3 of the original damage. The defense statistic works the opposite way: at +1 stage on defense, a Pokemon receives only 2/3 of the original damage.
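The multipliers listed above follow a simple pattern, which can be sketched as a small helper. The function name `stat_multiplier` is made up for illustration, and exact fractions are used only to keep the arithmetic readable.

```python
from fractions import Fraction


def stat_multiplier(stage: int) -> Fraction:
    """Attack/defense multiplier at a given stage, following the list in the text.

    Hypothetical helper: stages run from -6 to +6, with stage 0 meaning "unchanged".
    """
    if not -6 <= stage <= 6:
        raise ValueError("stage must be between -6 and 6")
    if stage >= 0:
        # Buffs: 2/2, 3/2, 4/2, ..., 8/2
        return Fraction(2 + stage, 2)
    # Debuffs: 2/3, 2/4, ..., 2/8
    return Fraction(2, 2 - stage)


print([str(stat_multiplier(s)) for s in range(-2, 3)])
# ['1/2', '2/3', '1', '3/2', '2']
```

A buff on the attacker's side and a debuff of the same size on the defender's side are interchangeable here, which is why the text treats "raising attack" and "reducing defense" as one case.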

In the first round, if the player uses the damaging move, the HPs of the player and the rival become $(S_1 - 1, S_2 - 1)$ after the rival’s move. If the player uses the status move instead, he deals 3/2 damage per round afterwards; equivalently, the rival’s Pokemon is left with 2/3 of its original HP and the player keeps dealing 1 damage per round. The consequence of the status move is thus $(S_1 - 1, \tfrac{2}{3} S_2)$. If the player uses the status move again, the state becomes $(S_1 - 2, \tfrac{2}{4} S_2)$, and so on.

Obviously, if $\tfrac{2}{3} S_2 < S_2 - 1$, it will be beneficial to use the status move. If $\tfrac{2}{3} S_2 \le \lceil S_1 \rceil - 1$, then the player can stop the status move right now and win the battle by spamming the damaging move afterwards. Otherwise, he can further check whether $\tfrac{2}{4} S_2 < \tfrac{2}{3} S_2 - 1$, and so on. The player keeps using the status move until the ceiling of his remaining HP is no less than his rival’s effective HP. Then, the spam of the damaging move will win him the battle. If he cannot enter this scenario within six status moves, he will lose.

Mathematically,

$$P_t(S_1, S_2, A^\star) = \max\left\{ \lceil S_1 \rceil \ge S_2,\; P_{t+1}\left(S_1 - 1, \tfrac{2+t}{3+t} S_2, A^\star\right) \right\}, \quad \forall t \le 6,$$

$$P_7(S_1, S_2, A^\star) = 0,$$

where the subscript $t$ denotes the turn number. A condition inside the max equals 1 if true and 0 otherwise. $P_0$ is what we are looking for.

Moreover, with a simple calculation, the winning domain, where the player wins over the rival, is

$$\left\{ (S_1, S_2) : \max_{t = 0, 1, \dots, 6} \frac{2+t}{2} \left( \lceil S_1 \rceil - t \right) \ge S_2 \right\}.$$
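As a sanity check, the recursion and the closed-form winning domain can be compared numerically. The sketch below assumes the simplified model of this article (1 damage per hit, player moves first, rival only attacks); the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil

MAX_STAGE = 6  # a status move can be used at most six times


def wins_recursive(s1, s2, t=0):
    """Recursion P_t from the text: 1 if the player can force a win, else 0."""
    if t > MAX_STAGE:
        return 0  # boundary condition P_7 = 0
    stop_now = int(ceil(s1) >= s2)  # switch to the damaging move now
    # One more status move: the player takes a hit, the rival's effective HP shrinks.
    keep_buffing = wins_recursive(s1 - 1, Fraction(2 + t, 3 + t) * s2, t + 1)
    return max(stop_now, keep_buffing)


def wins_closed_form(s1, s2):
    """Closed-form winning domain: max over t of (2+t)/2 * (ceil(S1) - t) >= S2."""
    best = max(Fraction(2 + t, 2) * (ceil(s1) - t) for t in range(MAX_STAGE + 1))
    return int(best >= s2)


# The two formulations agree on a grid of integer HP values.
assert all(
    wins_recursive(s1, s2) == wins_closed_form(s1, s2)
    for s1 in range(1, 16)
    for s2 in range(1, 41)
)
print(wins_closed_form(10, 18), wins_closed_form(10, 19))  # 1 0
```

With 10 HP, for instance, the best plan in this model is four status moves followed by damaging moves, which beats a rival with up to 18 HP.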

Raising self’s defense / reducing the rival’s attack

The situation is similar to the previous case, except that the status move takes effect immediately. In the first round, if the player uses the status move, the HPs become $(\tfrac{3}{2} S_1 - 1, S_2)$. If the player continues in the second round, the HPs become $(\tfrac{4}{3}(\tfrac{3}{2} S_1 - 1) - 1, S_2)$.

Mathematically,

$$P_t(S_1, S_2, A^\star) = \max\left\{ S_1 > \lceil S_2 \rceil - 1,\; P_{t+1}\left(\tfrac{3+t}{2+t} S_1 - 1, S_2, A^\star\right) \right\}, \quad \forall t \le 6,$$

$$P_7(S_1, S_2, A^\star) = 0.$$

Again, the fraction in front of $S_1$ is due to the non-proportional stage multipliers.

With a simple calculation, the winning domain is

$$\left\{ (S_1, S_2) : \max_{t = 0, 1, \dots, 6} \left( \frac{2+t}{2} S_1 - \sum_{i=1}^{t} \frac{2+t}{2+i} \right) > \lceil S_2 \rceil - 1 \right\}.$$
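The closed form can be checked against the step-by-step recursion. The following sketch uses hypothetical function names and the same simplified model as the text (each status move costs one hit from the rival).

```python
from fractions import Fraction
from math import ceil


def effective_hp_recursive(s1, t):
    """Player's effective HP after t defensive status moves, applied one by one."""
    hp = Fraction(s1)
    for k in range(t):
        # Stage k -> k+1 rescales the effective HP by (3+k)/(2+k); then the rival hits once.
        hp = Fraction(3 + k, 2 + k) * hp - 1
    return hp


def effective_hp_closed_form(s1, t):
    """Same quantity from the closed form: (2+t)/2 * S1 - sum_{i=1..t} (2+t)/(2+i)."""
    return Fraction(2 + t, 2) * s1 - sum(Fraction(2 + t, 2 + i) for i in range(1, t + 1))


def wins_with_defense_buff(s1, s2):
    """True if some number of status moves (0 to 6) puts the player in the winning domain."""
    return any(effective_hp_closed_form(s1, t) > ceil(s2) - 1 for t in range(7))


assert all(
    effective_hp_recursive(s1, t) == effective_hp_closed_form(s1, t)
    for s1 in range(1, 21)
    for t in range(7)
)
print(wins_with_defense_buff(10, 31), wins_with_defense_buff(10, 32))  # True False
```

In this model a player with 10 HP can beat a rival with up to 31 HP, compared with 18 HP in the first case, which matches the later observation that the second type of status move is the stronger one.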

Raising self’s evasion / reducing the rival’s accuracy

The stage multipliers for evasion and accuracy are less aggressive. From -6 to +6, they are 3/9, 3/8, 3/7, 3/6, 3/5, 3/4, 3/3, 4/3, 5/3, 6/3, 7/3, 8/3, 9/3. For example, if a Pokemon gets +1 evasion or its opponent gets -1 accuracy, this Pokemon has only a 3/4 chance of being hit by the opponent’s damaging move; status moves are not affected.
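The list above can be written as a small helper. This is a sketch under two assumptions that are not stated in the text: the move has a 100% base accuracy (so the hit chance is capped at 1), and `net_stage` stands for the attacker's accuracy stage as seen by the defender, so +1 evasion on the defender counts as -1. Both function names are hypothetical.

```python
from fractions import Fraction


def accuracy_multiplier(net_stage: int) -> Fraction:
    """Accuracy multiplier for a net stage in [-6, 6], following the list in the text."""
    if not -6 <= net_stage <= 6:
        raise ValueError("net_stage must be between -6 and 6")
    if net_stage >= 0:
        return Fraction(3 + net_stage, 3)  # 3/3, 4/3, ..., 9/3
    return Fraction(3, 3 - net_stage)      # 3/4, 3/5, ..., 3/9


def hit_chance(net_stage: int) -> Fraction:
    """Chance that a damaging move lands, assuming a base accuracy of 100%."""
    return min(Fraction(1), accuracy_multiplier(net_stage))


print([str(hit_chance(-t)) for t in range(7)])
# ['1', '3/4', '3/5', '1/2', '3/7', '3/8', '1/3']
```

After t evasion buffs the rival therefore hits with probability 3/(3+t), which is the weight that appears in the recursion for this case.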

As in the previous case, the player gets the benefit of these stage multipliers at once, but the consequence is a bit more subtle. The player can use the status move against stronger rivals, whom he cannot beat otherwise. However, the winning rate is not 100%. Therefore, let us maximize his probability of winning.

Mathematically,

$$P_t(S_1, S_2, A^\star) = \max_{a \in \{0, 1\}} \left( \frac{3}{4+t-a} P_{t+1-a}(S_1 - 1, S_2 - a, A^\star) + \frac{1+t-a}{4+t-a} P_{t+1-a}(S_1, S_2 - a, A^\star) \right), \quad \forall t \le 6,$$

$$P_7(S_1, S_2, A^\star) = 0,$$

where $a = 1$ means using the damaging move and $a = 0$ means using the status move. In order to solve it, we need dynamic programming. The code is shown in the next section.
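For querying a single pair of HP values, the recursion can also be written top-down with memoization. This is only a sketch: the function name is hypothetical, and the terminal conditions (a sure win once the player's HP is at least the rival's, a loss at 0 HP) are assumptions chosen to mirror the array-based code in the Demonstration section.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def win_probability(s1: int, s2: int, t: int = 0) -> Fraction:
    """Maximal winning probability at evasion stage t with integer HPs (s1, s2)."""
    if t > 6:
        return Fraction(0)  # boundary condition P_7 = 0
    if s1 >= s2:
        return Fraction(1)  # spamming the damaging move wins for sure
    if s1 <= 0:
        return Fraction(0)  # the player's Pokemon has fainted
    # a = 1: damaging move; the rival then hits with probability 3 / (3 + t).
    attack = (3 * win_probability(s1 - 1, s2 - 1, t)
              + t * win_probability(s1, s2 - 1, t)) / Fraction(3 + t)
    # a = 0: status move; the stage becomes t + 1 and the rival hits with probability 3 / (4 + t).
    status = (3 * win_probability(s1 - 1, s2, t + 1)
              + (t + 1) * win_probability(s1, s2, t + 1)) / Fraction(4 + t)
    return max(attack, status)


print(win_probability(1, 2))  # 1/16
```

With 1 HP against 2 HP, spamming the damaging move loses for certain, while one status move followed by two damaging moves wins whenever the rival misses twice at a 3/4 hit chance, i.e. with probability 1/4 × 1/4 = 1/16.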

Demonstration

By using the following code,

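import numpy as np
import matplotlib.pyplot as plt

n = 30  # the player's initial HP ranges over 0, ..., n-1
m = 7   # number of stages considered: 0, ..., 6

### 1st case: attack buff / defense debuff
# x[t, s1] = largest rival HP that can be beaten after t status moves
x = np.zeros((m, n))
for t in range(m):
    x[t, :] = (2 + t) / 2 * (np.arange(n) - t)
x = np.max(x, axis=0)  # best choice of t for each initial HP

### 2nd case: defense buff / attack debuff
# y[t, s1] = player's effective HP after t status moves
y = np.zeros((m, n))
y[0, :] = np.arange(n)
for t in range(m - 1):
    y[t + 1, :] = (3 + t) / (2 + t) * y[t, :] - 1
y = np.max(y, axis=0)

### 3rd case: evasion buff / accuracy debuff
# z[t, i, j] = winning probability at stage t with player's HP i and rival's HP j;
# z[m, :, :] stays 0 and encodes the boundary condition P_7 = 0
z = np.zeros((m+1, n, 3*n))
for j in range(3*n):
    for i in range(n):
        for t in np.arange(m-1, -1, -1):
            if i >= j:
                z[t,i,j] = 1  # spamming the damaging move wins for sure
            elif i == 0:
                z[t,i,j] = 0  # the player has no HP left
            else:
                # first argument: damaging move; second argument: status move
                z[t,i,j] = max((3*z[t,i-1,j-1]+t*z[t,i,j-1])/(3+t), (3*z[t+1,i-1,j]+(t+1)*z[t+1,i,j])/(4+t))
z = z[0,:,:]  # the battle starts at stage 0

# Left subplot: full range
plt.subplot(1,2,1)
plt.plot(x)
plt.plot(y)
plt.plot(np.arange(n), 'y')  # boundary for spamming the damaging move
plt.imshow(z.T, cmap=plt.cm.Reds)
plt.colorbar()

# Right subplot: zoom-in around the origin
plt.subplot(1,2,2)
plt.plot(x[0:10])
plt.plot(y[0:10])
plt.plot(np.arange(n // 3), 'y')
plt.imshow(z[0:10, 0:30].T, cmap=plt.cm.Reds)
plt.colorbar()
plt.show()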

we get this figure

figure.png

The right subplot is a zoom-in of the left one. The x-axis represents the player’s initial HP, while the y-axis the rival’s HP. The curves in the above figure show the winning domain boundary. The area above the yellow line is the winning domain of spamming the damaging move. The area above the blue curve is the winning domain in the 1st case. The green curve is for the 2nd case. The red heatmap represents the player’s winning probability in the 3rd case.

From this figure, we observe that status moves do help the player increase the winning chance. Moreover, the 2nd type status moves are stronger than the 1st type. This is not surprising: In the 2nd case, the player gets the benefit immediately. Asymptotically, the 3rd type is weaker, for its stage multipliers are less aggressive, but, around the origin, the 3rd type can be very useful.

Conclusion

Throughout the article, I supposed that the rival spams the damaging move, which might not be the case in practice, but it is one of many possibilities. In this scenario, I showed that the status move can help the player win a battle that he cannot win otherwise. Therefore, we can safely conclude that status moves are not mere decoration in Pokemon’s gameplay, and that the player is invited to use them wisely.
