1) Effects of Varying Learning Rate: In this experiment, the
mutation rate is set to zero and the learning rate is varied from
0.1 to 1.0 in steps of 0.1. The mean scores, standard
deviations, and winning percentages over 5000 games are
omitted for conciseness. Instead, the results based on the two
criteria described in Section III-A are presented graphically in
Fig. 7(a). The difference in winning percentage (red
line) should be minimal. The number of draws (black line)
should also be minimal because a high number of drawn games
is deemed more frustrating than fun. The difference between
the mean scores (blue line) should be minimal. The
higher of the two scores (green line) should be
maximal because a high average score indicates a competitive
and fast-paced game that is deemed to provide more satisfaction
to the player.
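The four plotted quantities can be computed directly from per-game scores. The following is a minimal sketch, not the authors' code: the function name `balance_metrics`, the input layout (one `(adaptive_score, static_score)` pair per game), and the dictionary keys are all assumptions made for illustration.

```python
from statistics import mean

def balance_metrics(games):
    """Summarize a list of (adaptive_score, static_score) pairs, one per game.

    Hypothetical helper: names and input layout are illustrative assumptions.
    """
    n = len(games)
    wins = sum(1 for a, s in games if a > s)      # games won by the adaptive controller
    losses = sum(1 for a, s in games if a < s)    # games won by the static controller
    draws = n - wins - losses
    mean_adaptive = mean(a for a, _ in games)
    mean_static = mean(s for _, s in games)
    return {
        "win_pct_diff": abs(wins - losses) * 100.0 / n,        # red line: minimize
        "draws": draws,                                        # black line: minimize
        "mean_score_diff": abs(mean_adaptive - mean_static),   # blue line: minimize
        "higher_mean_score": max(mean_adaptive, mean_static),  # green line: maximize
    }

# Made-up scores for four games, used only to exercise the function.
example = balance_metrics([(3, 1), (2, 2), (0, 1), (4, 2)])
```

In a learning-rate sweep, such a summary would be computed once per learning rate and per static controller, and the resulting values compared across the ten settings.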
It is observed from Fig. 7(a) that increasing the learning rate
generally produces a gentle increase in the mean score
difference (blue) and a large increase in the winning percentage
difference (red). This is because a large learning rate
will quickly saturate the chromosome values to either 0 or 1.
The resulting fluctuations in the chromosome values produce
erratic behaviors that are unable to adapt and track the progress
of the opponent during the game. At low learning rates, the
score differences (blue) and winning percentage differences
(red) are smaller and the AUC is able to match its opponent
in both criteria. From the numerical data, a learning rate of 0.1
obtained the best result for seven out of ten evaluation criteria
(two evaluation criteria for each of five static controllers). It
is also the dominant learning rate for three out of five static
controllers, namely, HC, NNC, and PSC. Although the learning
rate of 0.1 did not obtain the best result for either evaluation
criterion against the PFC, the mean score difference of 0.01 and
winning percentage difference of 0.84 are considered within
an acceptable range. Therefore, a learning rate of 0.1 is chosen as a
good general rule of thumb that can be used in situations where
opponents are varied and unknown. This learning rate
will also be used as the default value in the experiment on varying
the mutation rate in Section V-C2.
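The saturation effect described above can be illustrated with a toy example. The AUC update rule is not reproduced in this section, so the sketch below assumes a generic additive update clipped to [0, 1]; the function `update_trajectory`, the starting value of 0.5, and the feedback sequence are all hypothetical and are not taken from the paper.

```python
def update_trajectory(learning_rate, feedback, start=0.5):
    """Apply an ASSUMED clipped additive update to one chromosome value.

    feedback is a sequence of +1 / -1 signals; this is not the paper's rule.
    """
    value = start
    trajectory = []
    for sign in feedback:
        # Step by learning_rate in the direction of the feedback, then clip to [0, 1].
        value = min(1.0, max(0.0, value + learning_rate * sign))
        trajectory.append(value)
    return trajectory

# Made-up feedback sequence, chosen only to show both directions of change.
feedback = [+1, +1, -1, +1, -1, -1]
slow = update_trajectory(0.1, feedback)  # moves in small steps around 0.5
fast = update_trajectory(1.0, feedback)  # jumps between the bounds 0 and 1
```

Under this assumed rule, a learning rate of 1.0 pins the value to a bound after a single update and flips it whenever the feedback changes sign, whereas a learning rate of 0.1 keeps the value in the interior of the range.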
It is also worth noting from the numerical data that, for
, the mean score of the adaptive controller is higher than that
of the static controllers, but the winning percentage of the AUC
is lower than that of the static controllers. This is likely caused
by the AUC losing frequently by small margins but winning by
large margins. This shows that higher mean scores do not
directly imply higher winning percentages.
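A small numerical example makes this concrete. The scores below are invented for illustration and are not data from the experiments: the adaptive controller loses six games narrowly and wins four games by a wide margin.

```python
# Hypothetical (auc_score, static_score) pairs; NOT results from the paper.
games = [(2, 3)] * 6 + [(7, 1)] * 4

n = len(games)
auc_mean = sum(a for a, _ in games) / n        # (6*2 + 4*7) / 10
static_mean = sum(s for _, s in games) / n     # (6*3 + 4*1) / 10
auc_win_pct = 100.0 * sum(a > s for a, s in games) / n
static_win_pct = 100.0 * sum(s > a for a, s in games) / n

print(f"mean score: AUC {auc_mean:.1f} vs static {static_mean:.1f}")
print(f"win %:      AUC {auc_win_pct:.0f} vs static {static_win_pct:.0f}")
```

Here the adaptive controller has the higher mean score (4.0 against 2.2) yet wins only 40% of the games, matching the pattern described above.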
2) Effects of Varying Mutation Rate: In this experiment, the
learning rate is set to 0.1 based on the observations in the previous
section and the mutation rate is varied from 0.1 to 1.0 in steps
of 0.1.