[Bilingual] Using Machine Learning to Work Out Why Real Madrid's La Liga Record Over the Past Decade Was So Poor

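Translation: Gammy; Proofreading: Clara
Translated by RM字幕组 (RM subtitle group); reproduction without authorization is strictly prohibited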
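This is a slightly shorter version of a longer paper; the full paper is available here (https://docs.google.com/document/d/1n_wVq_SgYqU_9QUemZGjnySbKROs29gx8iH9gaP2ydU/edit). The code and the datasets used are available here (https://github.com/bdominique/Real-Madrid). To see the final results of this paper, skip to the Conclusion.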
这是一篇较长的论文的一个略短的版本,全文链接在这里(https://docs.google.com/document/d/1n_wVq_SgYqU_9QUemZGjnySbKROs29gx8iH9gaP2ydU/edit)。
论文涉及的代码和数据集的链接在这里(https://github.com/bdominique/Real-Madrid)。要查看最终结论,请跳到文末的结论部分。
Introduction
简介
While there have been legendary Galácticos at Real Madrid in the past, perhaps none have been more impactful to the club than Cristiano Ronaldo. Signed in the summer of 2009 as part of the answer to arch-rival Barcelona’s dominance of Spain and Europe (they were the first team to win six trophies in one year), Ronaldo would stay at Madrid for almost a decade. The hope was that Ronaldo would bring Madrid into a new era of dominance like the one Barcelona was enjoying, and while it took some time, success eventually came. Madrid won the Copa del Rey in Ronaldo’s second season (the club’s first trophy in three years), La Liga the next year (the first title since the 2007-08 season) and the Champions League two years later (the first title in 12 years). The defining moments of the Ronaldo Era at Madrid would almost all come in the Champions League, as Madrid reached the semifinals of the competition 8 years in a row and won the entire thing 4 times in the span of 5 years; calling this legendary is no exaggeration.
虽然皇马曾经拥有传奇的银河战舰一期,但也许没有人比C罗对俱乐部的影响更大。在2009年夏天签约C罗是皇马对抗死敌巴萨(年度六冠王)的战略的一部分,C罗从此在皇马待了近十年时间。人们希望C罗能够将皇马带入一个像当时巴萨那样的霸主时代,虽然这需要一些时间,但最终皇马开始不断地收割成绩。皇马在C罗加盟的第二个赛季获得国王杯冠军(俱乐部三年来的第一个冠军),再一年后获得西甲冠军(2007/08赛季以来的第一个联赛冠军),再两年后获得欧冠冠军(12年来的第一个欧冠冠军)。C罗在皇马的高光时刻几乎都在欧冠赛场上,皇马连续8年杀入欧冠半决赛,并在5年内4次夺得欧冠冠军:如果这都不是传奇那什么才是传奇?
However, despite Madrid’s success in the Champions League, their struggles in domestic competition cannot be overlooked. In Ronaldo’s time at Madrid (nine seasons), Real Madrid was only able to win La Liga two times. In the same time period, Barcelona won the league title 6 times and city rivals Atletico de Madrid won the title once. While the Ronaldo Era won’t be looked back on as unsuccessful, the question remains: How can a team that was so dominant in one competition be so underwhelming in another?
不过,尽管皇马在欧冠赛场上取得了成功,但他们在国内比赛中的挣扎状态也不容忽视。在C罗效力皇马的9个赛季中,皇马仅夺得2次西甲冠军。同一时期,巴萨夺得6次西甲冠军,同城死敌马竞也夺得1次西甲冠军。虽然C罗时代的皇马不会被认为是不成功的,但仍让人充满疑问的是:一支在某个比赛中占据绝对优势的球队,怎么会在另一个比赛中如此不堪一击?
Today I’m going to attempt to answer this question by using Machine Learning. I’ve created a dataset for each of Madrid and Barcelona that lists every La Liga game they lost or drew from the 2009-10 season through the 2017-18 season, which covers every season that Ronaldo was at Madrid. On each dataset, I’m going to use a Machine Learning technique called K-Means Clustering (K-Means or KMC for short) to find the average teams that Madrid and Barcelona are losing points to and see if there are any clear differences. K-Means Clustering will be run in a few different scenarios to get a better understanding of where exactly Madrid has been struggling and Barcelona has been succeeding (for example, dividing the dataset up by manager and running KMC for each manager).
这次,我就尝试用机器学习来回答这个问题。我为皇马和巴萨建立了一个数据集,包含了从2009/10到2017/18赛季,两队在西甲联赛中的每场失利或平局,该区间覆盖了C罗在皇马的每个赛季。在每个数据集上,我将使用一种名为K均值聚类(K-Means Clustering,简称为K-Means或者KMC)的机器学习技术来获取击败皇马和巴萨的球队的平均数据,并看看他们是否有明显的差异。为几个不同的场景生成KMC集群,以便更好地了解皇马到底在哪方面比较鸡肋,巴萨到底在哪方面相对成功(例如,将数据集按经理划分,为每个经理做KMC)。
The Dataset
数据集
The dataset for each team contains a record of all of their draws and losses from the 2009/10 season until the 2017/18 season. The data is made up of the following stats (in the Machine Learning world these are called features):
每个球队的数据集包含了从2009/10到2017/18赛季的所有平局和输球记录。数据包含以下统计值(在机器学习领域,这些统计值被称为特征):
·Time of the season, which is represented by the matchday feature. With this feature we can get a better idea of how Madrid and Barca losses may be related to certain times of the season (i.e. the beginning when they’re still trying to get into form, the end of the season when they’ve both played around 60 games total).
·赛季的时间点,用比赛日特征(matchday feature)来表示。通过这个特征,我们可以更好地了解到皇马和巴萨的失利可能与赛季中某些时间段有关系(即赛季初还在努力进入状态,赛季末两队都打了60场左右的比赛)。
·The amount of rest that each team has had since their last game, represented by the days_since_last_game features. One thing to note is that often times the amount of rest that these two get compared to their opponents is skewed because of the international breaks; other teams in the league get to rest their players for usually 13-15 days during these periods, but in this same time frame Madrid and Barca players are playing multiple games for their country and travelling across the globe to get to these games.
·每支球队在上一场比赛后的休息天数,用days_since_last_game特征表示。有一点需要注意的是,由于国际比赛日的影响,很多时候这两支球队的休息时间与对手相比是有偏差的;联赛中的其他球队在这些时间段内通常有13-15天的休息时间,但在同一时间段内皇马和巴萨的众多球员要奔波于全球各地为自己的国家队打多场比赛。
·Whether the game was home or away, represented by the home_0_away_1 feature.
·主场还是客场比赛,用home_0_away_1特征来表示。
·The strength of the opposition in the form of where they placed in the league that year (final_league_position), as well as the form of the opposition going into the match (elo_opp). While most sites list form as an important factor in determining which team is going to win, there hasn’t been much research into quantitatively determining the form of a team outside of looking at the last 5 or so results. The website clubelo.com is the only source I was able to find that does this. It allows users to look at a team’s form (called elo, which is a statistical calculation of points your team has as a result of their results across decades), which gives us the information we need about the form of our opposition.
·对手的强度,由对手在该赛季的最终排名(final_league_position特征)和对手在本场比赛中的状态(elo_opp特征)这两部分组成。大多数网站都把球队状态作为决定哪支球队获胜的重要因素,但除了用最近5场比赛状态来评估之外,没有更多更好的办法来评估球队状态。网站clubelo.com是我能找到的唯一一个能做到这一点的网站。它允许用户查看一支球队的状态(称为ELO,代表某球队近几十年来的比赛结果的统计数据),从而为我们提供了对手球队的状态数据。
·The form of Madrid and/or Barca, depending on which team we’re looking at (elo_madrid or elo_barca)
·皇马/巴萨的状态,用elo_madrid/elo_barca表示。
·The difference in elo between the two teams (diff_elo)
·皇马/巴萨和比赛对手的状态差(用diff_elo表示)
·And finally the betting odds of a Madrid/Barca win (odds_of_madrid_win or odds_of_barca_win), which tells us how the team performed compared to what professional bookmakers expected.
·最后是皇马/巴萨赢球的投注赔率(用odds_of_madrid_win/odds_of_barca_win表示),它告诉我们球队的实际表现与专业博彩公司预期之间的差异。
·Data collected after the 2014/15 season contains an extra feature called xg_diff, which calculates the difference in xG by simply doing (the xG of Madrid/Barca in the specific match) minus (the xG of their opposition).
·2014/15赛季后的数据包含一个额外的特征叫做xg_diff,它等于皇马/巴萨的xG减去对手的xG。[译者注:xG是expected Goals(预期进球),简单来说是用来评价球队创造进球机会能力]
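To make the feature list above concrete, here is a minimal sketch of what one row of the Madrid dataset could look like. The feature names come from the list above; the class name `DroppedPointsGame`, the raw inputs `xg_madrid` and `xg_opp`, the sign convention for diff_elo, and all values in the usage note are assumptions made for illustration, not the author's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DroppedPointsGame:
    """One draw or loss for Madrid (hypothetical container, not the author's code)."""
    matchday: int
    days_since_last_game: int      # the source keeps one such value per team
    home_0_away_1: int             # 0 = home, 1 = away
    final_league_position: int     # opponent's final position that season
    elo_opp: float
    elo_madrid: float
    odds_of_madrid_win: float
    xg_madrid: Optional[float] = None   # assumed raw input, 2014/15 onwards only
    xg_opp: Optional[float] = None      # assumed raw input, 2014/15 onwards only

    @property
    def diff_elo(self) -> float:
        # Assumed sign convention: Madrid's Elo minus the opponent's Elo.
        return self.elo_madrid - self.elo_opp

    @property
    def xg_diff(self) -> Optional[float]:
        # Madrid's xG minus the opponent's xG; missing before 2014/15.
        if self.xg_madrid is None or self.xg_opp is None:
            return None
        return self.xg_madrid - self.xg_opp
```

With made-up values such as elo_madrid=2053 and elo_opp=1705, `diff_elo` comes out at 348; a row with no xG inputs simply reports `xg_diff` as `None`.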
This dataset also only considers games where the league title hasn't been won yet. The reasoning behind this is that it's hard to gauge the team's motivation since there's nothing left to play for; Madrid and Barca tend to let their starters rest and be ready for The Champions League and Copa Del Rey when they can’t win the league, so this opens up the door for lineups with bench players, new formations, and a general lack of motivation to win.
这个数据集也只包含了确定联赛冠军前的比赛。这背后的原因是在确定联赛冠军之后很难衡量球队对剩余比赛的积极性;当皇马和巴萨在无法赢得联赛冠军时,往往会让他们的主力球员休息为欧冠和国王杯做好准备,这为板凳球员上场和调试新阵容提供了机会,同时这样的战略也让球员普遍缺乏在联赛中赢球的动力。
The only additional feature that I listed as important but was not able to collect was the amount of starters that Madrid/Barca was missing during a game that they lost or drew. I was not able to collect data on this factor due to a lack of information about decisions on the starting lineup made by the manager. There are multiple reasons for why a player may or may not be included in the starting lineup (injuries, suspension, rest, preparing for a bigger game, the player is not playing well, etc.) and without this information for each player it’s hard to know whether a manager is fielding a full-strength team or not.
唯一一个我认为很重要但又无法收集到的特征是皇马/巴萨在输球或平局的比赛中缺席的主力球员的数量。由于缺乏主教练对首发的决策相关的信息,我无法收集到这方面的数据。一个球员入选或不入选首发阵容有多种原因(受伤、停赛、休息、备战更大的比赛、球员发挥不佳等),如果没有每个球员的这些信息,就很难知道主教练是否派了全主力出场。
As an example, Madrid in the 2013-14 season was known to use both a 4-4-2 formation and a 4-3-3 formation depending on what type of offensive/defensive scheme then-manager Carlo Ancelotti was looking to implement. Each formation used different players according to the needs of the team and the form of the players, so it’s hard to know whether a player wasn’t included in the starting lineup due to one of these factors, or something else like injury (which for the earlier seasons is especially hard to find information on). Because I couldn’t capture this data accurately, I decided it was best to not include this feature.
举个例子,2013/14赛季的皇马使用4-4-2和4-3-3阵型,这取决于当时的主教练安切洛蒂希望实施什么样的攻防方案。每个阵容根据球队的需要和球员的状态来选择首发阵容,所以很难知道某位球员没有被纳入首发阵容是由于这些因素之一,还是因为伤病等其它因素(特别对于前几个赛季来说,这方面信息特别难找)。因为我无法准确地获取到这些数据,所以我决定最好的方法是在数据集中排除这个特征。
K-Means Clustering
K-Means Clustering (KMC) is an unsupervised Machine Learning method that groups numerical data into clusters and computes the average (center) of each one. The k in KMC is the number of averages, or clusters, that are being computed; k is a number that’s greater than 1 but less than the number of rows in your dataset. The basic steps of KMC are as follows:
K-Means Clustering(KMC)是一种无监督的机器学习方法,可以计算数字型数据中多个位置的平均值。KMC中的k是正在计算的平均数、或者群集的数量;k是一个大于1但小于数据集中的行数的数字。KMC的基本步骤如下:
1.Pick k random points in your data to start the algorithm (for us, pick k rows of data). These points will serve as our initial clusters C1, C2, C3, ..., Ck.
在你的数据中选取k个随机点开始计算(在我们此次研究中,选取k行数据)。这些点将作为我们的初始集群C1、C2、C3、……、CK。
2.For each point Di in the data D, calculate the Euclidean distance between Di and each cluster C. The C with the lowest Euclidean distance is the cluster that Di gets grouped into. For two-dimensional data, for example, we would get the Euclidean distance by calculating sqrt((x2-x1)^2 + (y2-y1)^2) for each cluster we have, where (x1,y1) are the coordinates of one cluster and (x2,y2) are the coordinates of Di.
对于数据D中的每一个点Di,计算出Di与每个集群C之间的欧几里得距离, 欧几里得距离最小的C就是Di归属的集群。以二维数据为例,我们将通过计算sqrt((x2-x1)^2+(y2-y1)^2)来得到欧几里得距离,其中(x1,y1)是一个群集的坐标,(x2,y2)是Di的坐标。
3.After each Di has been grouped to a cluster, calculate the average of each cluster. If a cluster C1 has 2 points of two dimensional data grouped to it, for example, then the coordinates for the new C1 would be ((x1 + x2)/2, (y1 + y2)/2)
所有Di都找到归属到群集后,计算出每个群集的平均数。例如,如果一个群集C1有2个归属于其的二维点,那么C1新的坐标为[(x1+x2)/2,(y1+y2)/2]。
4.Repeat steps 2 and 3 until the Di in each C remains the same.
重复步骤2和3,直到每个C中的Di保持不变。
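The four steps above can be sketched in plain Python as follows. This is an illustrative implementation under simple assumptions (random initial rows, a fixed seed, an iteration cap), not the code used for the analysis; the function name `k_means` and its parameters are made up for this example.

```python
import math
import random


def k_means(points, k, seed=0, max_iter=100):
    """Plain k-means on a list of equal-length tuples, following steps 1-4."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)        # step 1: k random rows as initial centers
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign every point to the center with the smallest
        # Euclidean distance.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:   # step 4: stop once nothing changes
            break
        assignment = new_assignment
        # Step 3: move each center to the average of the points grouped to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                    # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment
```

For two well-separated groups of 2-D points, for instance three points near (1, 1) and three near (10, 10), calling `k_means(points, 2)` returns one center per group along with a cluster index for every point.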
The benefits of using KMC as opposed to a normal average calculation are clear. For example, pretend that we’re using KMC on the matchday feature in our data for one season, and for this season Madrid lost/drew against teams on matchdays 1, 2, 3, 4, 35, 36, 37 and 38. If we were to calculate the average normally we’d get 19.5, which would lead us to believe that this season Madrid dropped most of its points near the halfway point, but we can obviously see this is wrong. Using KMC with number of clusters k = 2 would give us a much better representation of the data and tell us that on average, Madrid dropped points at the beginning of the season (one cluster would contain the points 1, 2, 3, 4) and at the end of the season (the second cluster would contain the points 35, 36, 37, 38).
使用KMC算法相对于一般的平均数计算算法的好处是显而易见的。比如说,假设在一个赛季中皇马在第1、2、3、4、35、36、37和38比赛日失利/平局(matchday特征值为1、2、3、4、35、36、37、38),如果我们使用一般的方法计算平均数的话,我们会得到19.5,这将导致我们认为本赛季皇马在半程附近输掉了大部分的比赛,但我们可以看出这是一个明显的错误;如果使用KMC的集群数量k=2,这个算法会更好地为我们展示数据并且告诉我们:平均来说皇马在赛季初输球(第1个集群包含了matchday 1、2、3、4)和赛季末输球(第2个集群包含了matchday 35、36、37、38)。
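The matchday example can be checked with a few lines of Python. This sketch runs the assign-and-average loop on the eight matchdays from the paragraph above; the starting centers 1 and 38 are picked by hand for the illustration rather than at random.

```python
from statistics import mean

matchdays = [1, 2, 3, 4, 35, 36, 37, 38]

overall = mean(matchdays)
print(overall)                       # 19.5, the misleading "middle of the season"

centers = [1.0, 38.0]                # assumed starting centers, chosen by hand
while True:
    clusters = [[], []]
    for m in matchdays:
        # Put each matchday in the cluster whose center is closest.
        nearest = min((0, 1), key=lambda c: abs(m - centers[c]))
        clusters[nearest].append(m)
    new_centers = [mean(c) for c in clusters]
    if new_centers == centers:       # stop once the centers no longer move
        break
    centers = new_centers

print(centers)                       # [2.5, 36.5]
```

The two centers, 2.5 and 36.5, point to the start and the end of the season, which is the picture the single average of 19.5 hides.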
For more information on KMC, take a look at this video from StatQuest with Josh Starmer (https://www.youtube.com/watch?v=4b5d3muPQmA).
关于KMC的更多信息,请观看StatQuest与Josh Starmer的视频(https://www.youtube.com/watch?v=4b5d3muPQmA)。
One of the most important parts of KMC is deciding on how big or small k should be. KMC is clearly a very powerful algorithm, but when it is used with the wrong value for k it becomes difficult to extract meaningful information from your data. Fortunately, there are many methods that have been developed to help determine the best k. For this analysis I’ll be using Silhouette Analysis, which involves measuring 1) the average distance between each point and the other points in its own cluster (you want this value to be small, since it measures how similar the data points in this cluster are to one another) and 2) the average distance between each point and the points in the nearest neighboring cluster (you want this value to be large, since it measures how dissimilar this cluster is from the others).
KMC最重要的部分之一就是决定k的大小。KMC显然是一个非常强大的算法,但是当使用错误的k值时,就很难从数据中提取有意义的信息。幸运的是,有很多方法可以帮助确定最合适的k值。在这篇文章中,我将使用轮廓分析法,该方法需测量1)给定的集群中的每个点之间的距离(这个值越小越好,因为它可以衡量这个群集中各个数据点之间的相似度)和2)每个集群之间的距离(这个值越大越好,因为它可以衡量这个集群所包含的数据点与其它集群数据点间的差异程度)。
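More information on how these two values are calculated can be found here (https://en.wikipedia.org/wiki/Silhouette_(clustering)), and I adapted the code from this tutorial (https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html) to find the optimal k for my data.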
两个值的计算方法的更多信息可以在这里(https://en.wikipedia.org/wiki/Silhouette_(clustering) )找到,
我修改了这个教程(https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html)中的代码并使用它为我的数据找到了最佳的k值。
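As a rough illustration of the two distances described above, the sketch below computes a silhouette value for every point from scratch, using the standard formula s = (b - a) / max(a, b). It is not the scikit-learn code the analysis was adapted from; the function name `silhouette_scores` is made up, and the sketch assumes at least two clusters and points that are not all identical.

```python
import math
from statistics import mean


def silhouette_scores(points, labels):
    """Per-point silhouette values for points (tuples) and their cluster labels."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        same = [
            math.dist(p, q)
            for j, (q, lab) in enumerate(zip(points, labels))
            if lab == own and j != i
        ]
        if not same:                  # a point alone in its cluster scores 0
            scores.append(0.0)
            continue
        a = mean(same)                # 1) mean distance within its own cluster
        b = min(                      # 2) mean distance to the nearest other cluster
            mean([math.dist(p, q) for q, lab in zip(points, labels) if lab == other])
            for other in set(labels)
            if other != own
        )
        scores.append((b - a) / max(a, b))
    return scores
```

Averaging these values over all points gives the average silhouette score that is compared across candidate values of k; a negative value for a point means it sits closer to a neighboring cluster than to its own.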
Methods, Results, and Analysis
方法、结果和分析
The plan for the data is as follows:
下面是数据的处理步骤:
1.Input each data file that we’re examining, and run the silhouette analysis code on it for k = 2-10 to find the ideal k to use in KMC.
输入我们要处理的每个数据文件,并在上面按不同的k值(从2到10)分别运行轮廓分析代码从而找出理想的k值。
2.Run KMC to obtain the center of each cluster, and analyze these centers to learn more about what types of team Madrid and Barcelona are losing to on average.
运行KMC算法来获得每个集群的中心值,并分析这些中心值从而了解从平均结果来看哪些特征导致皇马/巴萨的失利/平局。
This is the data we’re looking at for each team:
下面是我们为每个队伍收集的数据:
1.Losses and Draws from the 2009/10 season to the 2013/14 season (non-xG data)
2009/10到2013/14赛季的失利/平局的比赛(没有xG数据)
2.Losses and Draws from the 2014/15 season to the 2017/18 season (so that we can look at xG data as well)
2014/15到2017/18赛季的失利/平局的比赛(有xG数据)
3.Pt. 1 and Pt. 2 data combined (the xg_diff column from Pt. 2 will be removed)
上面1和2的数据合并(2中的xg_diff数据将被删除)
4.Data organized by manager. For this category, the manager needs to have at least 2 seasons of coaching either Real or Barca in this 9-season time period in order to make sure there are enough data points to get meaningful information. For Real such coaches are Mourinho, Ancelotti and Zidane; for Barca they are Guardiola and Enrique.
按主教练整理的数据。这个入组数据要求,主教练需要在最近9个赛季内至少执教皇马或巴萨2个赛季,这样才能保证有足够的数据来获得有意义的信息。符合条件的皇马主教练有穆里尼奥、安切洛蒂和齐达内;符合条件的巴萨主教练有瓜迪奥拉和恩里克。
5.Title Winning Seasons for each team
每个队伍夺冠的赛季
6.Title Losing Seasons for each team
每个队伍没有夺冠的赛季
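As an illustration of category 4, the sketch below splits Madrid rows by manager. The season ranges are the ones given in the manager section of this article; the `season` and `matchday` keys, the function names, and the matchday at which Zidane took over in 2015/16 are assumptions for this example.

```python
# Seasons per manager, as stated in the article's manager section.
MADRID_MANAGERS = {
    "Mourinho": ["2010/11", "2011/12", "2012/13"],
    "Ancelotti": ["2013/14", "2014/15"],
    "Zidane": ["2015/16", "2016/17", "2017/18"],
}
# Assumption: Zidane's first league game in 2015/16 was matchday 19.
ZIDANE_FIRST_MATCHDAY_2015_16 = 19


def manager_for(season, matchday):
    """Return the qualifying manager for a game, or None if there is none."""
    for name, seasons in MADRID_MANAGERS.items():
        if season in seasons:
            if (name == "Zidane" and season == "2015/16"
                    and matchday < ZIDANE_FIRST_MATCHDAY_2015_16):
                return None           # first half of 2015/16 was under Benitez
            return name
    return None                       # season coached by a non-qualifying manager


def split_by_manager(games):
    """Group rows (dicts with 'season' and 'matchday') by qualifying manager."""
    groups = {name: [] for name in MADRID_MANAGERS}
    for game in games:
        name = manager_for(game["season"], game["matchday"])
        if name is not None:
            groups[name].append(game)
    return groups
```

Each resulting group can then be clustered on its own, in the same way as the season-range datasets.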
2009/10-2013/14
Silhouette Analysis on the Pt. 1 data for both teams suggests using 2 clusters. The average silhouette score for k = 2 is much higher than that of any other value of k for each dataset.
对两队的数据1(译者注:上面第1项中的数据,下同)进行轮廓分析,最终得出k=2是最佳选择,在每个数据集中k=2的平均轮廓得分远远高于其它k值的得分。
Running KMC on this dataset produces the following clusters:
在这个数据集上运行KMC会产生以下集群。
Considering that there are only 10 games within Madrid’s 1st cluster and the Elo of the opposition is also higher than Madrid’s Elo, it’s safe to assume that the data for each team divides itself into two categories: one cluster made up of games played against each other, and another that covers the other teams in La Liga. This emphasizes how important it is for Real Madrid to win games against Barcelona if they want to have any chance of winning the league title. Of the 5 seasons that make up this category, 4 were decided by 5 points or fewer. Had Madrid been able to record more positive results against Barcelona, I’d expect them to have won more than 1 league title in this time period. This dataset also includes the first season in which Atletico won the league, so draws and losses against them are probably included in this cluster as well (just not as many as against Barcelona).
考虑到皇马在第1集群内只有10场比赛,而且对手的Elo也比皇马的Elo高,所以我们可以把皇马/巴萨的数据分为两类:一类是由2队之间的比赛组成,另一类是两队分别和西甲其它球队的比赛。这个类别包含的5个赛季中,有4个赛季的冠亚军的分差在5分以内,这突显了如果皇马想要获得联赛冠军,那么拿下国家德比是多么重要。如果皇马能够在国家德比中得更好战绩,我预期他们在这个时间段内能多赢得至少一个联赛冠军。这个数据集还包含了马竞获得联赛冠军的赛季的数据,所以皇马对阵马竞的平局/输球可能也包括在这组数据里(只是没有对巴萨的那么多)。
While final_league_position shows an average team strength of 11th place, elo_opp’s value of 1705 is synonymous usually with a team in the 8th or 9th position in the league table. This suggests that Madrid typically loses to teams that are in a good run of form and may be outperforming their usual mid to low-table standards. What’s amusing about this is that Real Madrid and Barcelona fans will often complain that teams tend to "try harder" and give more effort when they play against them, and these two features seem to support that.
final_league_position特征显示对手的平均联赛最终排名为11,而elo_opp的平均值为1705,这通常是联赛积分榜第8/9位的球队的elo值。这表明皇马一般都会输给状态正佳的球队,处于这种状态的球队的表现通常会比普通中下游球队的表现更好。让人觉得有趣的是,皇马和巴萨球迷经常会抱怨自己主队在与这种对手比赛时往往本需要更缜密准备和付出更多努力,而这两个特征似乎也支持这一观点。
points_won for both Barca clusters is close to 2/3; this essentially means that on average Barcelona tended to draw rather than lose when they dropped points. Madrid’s second cluster is within this range, but the first falls to less than 1/3. Madrid’s record in games against Barcelona during this 5-season window consists mostly of draws and losses, which would explain the low value for points_won. Another interesting difference between the 2 clusters of each team is the matchdays on which the points were dropped; for Madrid both clusters are centered near the end of the first half of the season, while for Barcelona they are skewed a bit more towards the second half of the season. This suggests that Madrid is stronger in the second half of the season and weaker in the first, while for Barcelona the opposite is true.
巴萨在2个群集中的points_won值约等于2/3,这基本意味着以平均来看巴萨在丢分比赛中更倾向平局而不是失利;而皇马在第2集群中的points_won值也约等于2/3,但第1集群中的points_won值小于1/3。在这5个赛季里,皇马在对阵巴萨的比赛中基本是失利/平局,这也就解释了为什么皇马的points_won值比较低。另外一个有趣之处在于皇马/巴萨在matchdays特征上的差异,皇马在2个集群中的matchdays值都接近于上半赛季的末期,而巴萨的matchdays值更倾向于下半赛季的前期。这个数据告诉我们皇马强于下半赛季而弱于上半赛季,而巴萨恰恰相反。
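To see why a points_won center near 2/3 reads as "mostly draws", here is a small worked example. It assumes that points_won is 1 for a draw and 0 for a loss (the feature is not defined in the feature list above, so this is a guess), in which case a cluster's average points_won is simply the share of its games that were draws. The function name is made up.

```python
def average_points_won(draws, losses):
    """Average points per dropped-points game, assuming draw = 1 and loss = 0."""
    return (draws * 1 + losses * 0) / (draws + losses)


mostly_draws = average_points_won(draws=2, losses=1)    # 2/3: two draws per loss
mostly_losses = average_points_won(draws=1, losses=2)   # 1/3: two losses per draw
print(round(mostly_draws, 3), round(mostly_losses, 3))
```

Under that assumption, a center below 1/3 means the cluster contains more than two losses for every draw.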
2014/15-2017/18
k = 2 produces the highest average silhouette score for both datasets, but not by much; k = 3 is still within range, and it actually produces more evenly distributed clusters. However, for one cluster in each dataset (C1 in Barca data, C0 in Madrid data) there are points that produce a negative silhouette score, which doesn’t happen with any of the data with k = 2. The difference in the number of points between C0, C1 and C2 for k = 3 is about 10, which is the same as for the clusters with k = 2. For these reasons we choose k = 2:
两个数据集中k=2的平均轮廓得分是最高的,但领先的优势不大;k=3的得分仍在可接受范围内并且它会得到更均衡的群集。然而对于k=3的每个数据集中的某个群集(巴萨数据中的C1,皇马数据中的C0)里,都有一个点产生了负的轮廓得分,而在k=2的数据中没有出现这种情况。在k=3的情况下,C1/C2/C0两两之间的点差值约为10,而k=2的情况下,两两之间的点差值是相同的。基于这些原因,我们选择k=2:
And these are the clusters we get:
如下是我们获得的群集(k=2):
Barcelona’s clusters both seem to center on the fact that 1) they dropped points mostly at away games in the second half of the decade and 2) They tend to perform better against harder opposition. Oddly enough, Barca was able to pick up more points on average against teams closer to the top of the table than they were against teams that were midtable or lower.
巴萨的集群似乎都集中在以下两个方面:1)他们在过去十年中的丢分比赛几乎是下半赛季的客场比赛;2)他们往往在对阵更强的对手时表现更好。奇怪的是,巴萨在对阵积分榜上排名靠前的球队时平均能拿到更多的积分,而不是对阵排名中下游的球队。
Speaking of top-of-the-table teams, in the data from 2009/10-2013/14 we saw that the final_league_position feature was much closer to 1 than it is now for both teams. The drop to 4.4 and 3.35 for Madrid and Barca, respectively, means that results against teams other than each other did more than before to determine who the league champion was going to be. While El Clasico was still important, it wasn’t the difference between winning or losing the league like it was in the earlier half of the decade.
说到排名靠前的球队,在2009/10-2013/14赛季数据中的final_league_position特征比近几个赛季的final_league_position更接近于1。近几个赛季皇马和巴萨的final_league_position分别降到了4.4和3.35,这意味着除了国家德比之外,对阵其它球队的成绩对决定联赛冠军归属是更重要的因素。虽然国家德比仍然很重要,但它并不像前半赛程的比赛那样是决定是否能拿下联赛冠军的关键,也就是说,它在近几年的重要性不如近十年内的前几年那样了。
Barca also saw their performance drop in the first half of the season, with C1 moving to the first half of the season and C2 moving to the middle of the season. The addition of xG for this data gives us a bit more insight into how Barcelona is performing in these games, and here xg_diff tells us that they typically outperformed their opponents in creating goal scoring chances - admittedly not by too much, but still a decent margin. The number of games in each category - 12 to 22 - says something about Barca’s performances in big games compared to smaller ones. While they typically performed well against other good teams in the league, they often let their guard down against weaker opposition and in turn performed worse.
可以看到巴萨在赛季前半段的表现也出现了下滑,其中C1中集中于赛季前半段,C2中集中到了赛季中段。在这个数据集中加入xG数据会让我们对巴萨在这些比赛中的表现有更多的了解,在这里xG_diff特征告诉我们巴萨在创造进球机会方面的表现通常要优于对手——虽然领先的不多,但仍拉开了一定的差距。每个类别中的比赛数量为12到22场,说明了巴萨在重要比赛和普通比赛中的表现有差异:虽然巴萨在对阵联赛的强队时表现出色,但在对阵实力较弱的对手时,他们往往会放松警惕表现反而更差。
Real Madrid still dropped a majority of their points in the first half of the season, according to the data. There was no clear advantage for Madrid in playing Home or Away during these years, as they seemed susceptible to dropping points either way. Madrid’s cluster C1 is comparable to Barcelona’s cluster C2, since both seem to illustrate the struggles each had against teams that were midtable (notice that the number of games in the category for each team is almost equal as well). The difference in result, however, is illustrated by the xg_diff and points_won features: Madrid tended to perform better against midtable teams in this time period (they had a higher xg_diff than Barca), and they came away with a favorable result on a couple more occasions (points_won is 0.1 higher for Madrid than Barca). The matchday feature also tells us that Madrid was dropping these types of games earlier in the season, which most likely prevented them from getting off to strong starts and gave Barcelona a chance to build a lead. Madrid also performed poorly against other teams at the top of the table, represented by the negative xg_diff in cluster C1. While Barcelona was able to earn draws from these types of games, Madrid was 1) losing these types of games, and 2) losing more often than Barcelona did.
数据显示皇马主要在前半赛季丢分。皇马的主客场的丢分情况没什么大差别,无论主客场他们就是容易丢分。皇马的集群C1与巴萨的集群C2相当,这说明两者在面对中游球队时表现都不佳(注意,两队在这个类别中的比赛场次是几乎相等的)。因此,可以从xG_diff和points_won特征来说明皇马和巴萨的不同之处。皇马在这个周期内对阵积分榜中游球队的表现往往更好(皇马的xG高于巴萨),能取得理想的结果(皇马points_won的比巴萨高0.1)。matchday特征也告诉我们,皇马在赛季初的时候经常在这类比赛中丢分,这往往让他们无法获得赛季开局优势,也给了巴萨建立领先优势的机会。皇马在对阵其它排名靠前的球队时表现也很差,集群C1中负数的xG体现了这点。巴萨往往能够从这个类型的比赛中拿到平局,而皇马在这个类型比赛中却是:1)经常输掉比赛,2)输的次数比巴萨多。
2009/10-2017/18
Silhouette analysis for both datasets gives us an average score that’s considerably higher for k = 2 than any other value, so there’s no debate about which one to use for this category. The results are shown below:
对这两个数据集进行轮廓分析后,我们得出k=2的平均分比其它k值的平均分都要高很多,所以使用k=2是没有任何争议的。结果如下图所示:
This category tells us the same things that we’ve been seeing already in the data, but it helps us to see that these are recurring trends with each team. Each feature tells its own decade-long story: matchday shows us how Madrid typically gets off to a poor start, while Barca start strong and can afford to take their foot off the gas pedal in the second half of the season; home_0_away_1 shows how Barcelona were able to make the Nou Camp a difficult place for top teams to come and win, while Madrid struggled both Home and Away to the same teams; odds_of_team_win was near-even for both teams, but it was Barcelona who was able to find a way to get some type of result from the bigger games while Madrid wound up losing these same games.
总体来说,这个数据类别展示给我们的结果在每个赛季各自分析的数据结果是基本一致的,但它能帮助我们看到这些既定问题在每个队伍里反复出现的趋势。每个特征都表现了它们这十年来的历史记录:matchday特征向我们展示了皇马通常是赛季开局不佳,而巴萨开局强势并在下半赛季表现有所下滑时候仍能保持积分优势;home_0_away_1特征显示巴萨能够让诺坎普成为顶级强队难求一胜的地狱,而皇马的主场和客场却没什么大区别;两队的odds_of_team_win特征很接近,但巴萨往往能够赢得重要的比赛,而皇马却相反。
Mourinho, Ancelotti and Zidane
穆里尼奥,安切洛蒂和齐达内
The dataset for each manager produces the best silhouette score for k = 2; this isn’t surprising, considering that this is the same data as before but just broken up differently among seasons. Mourinho was in charge of the team for 3 seasons (2010/11-2012/13), Ancelotti for 2 (2013/14-2014/15) and Zidane for 2 ½ (second half of 2015/16-2017/18):
每位主教练的数据集都是在k=2下的轮廓分析结果最好,这个结果并不出人意料,因为这里的数据集是由前面的数据集按赛季拆分后得来的。穆里尼奥执教最前面的3个赛季(2010/11-2012/13), 安切洛蒂执教了2个赛季(2013/14-2014/15),齐达内执教了2.5个赛季(从2015/16赛季中接手到2017/18赛季)。
Mourinho boasts the best Home record of the 3 managers in this category; it’s well known that teams managed by Mourinho in the earlier part of his career tend to be very strong at home, which is evidenced by the home_0_away_1 feature being very close to 1 for C1. Mourinho also has the best record against other top teams; in a time when it was imperative to get results against Barcelona in order to win the league, Mourinho had a league record of 2W-2D-2L. What stopped Mourinho from winning more than one league title with Madrid was most likely 1) the lack of quality bench players that could get results while letting the first team rest (this was the case at the end of the 2010/11 season, when Madrid needed bench players to step up while the first team was preparing for 4 games in one month against Barcelona in the League, Copa Del Rey and Champions League) and 2) internal issues between him and senior members of the squad ruining the 2012/13 season, where Madrid got off to one of their worst La Liga starts of the decade (this season is where the majority of Mourinho’s draws and losses come from).
在这个分类里,穆里尼奥的主场战绩在三位主教练中是最好的;众所周知,穆里尼奥在职业生涯前期所执教的球队往往在主场表现非常强势,这一点从C1的home_0_away_1特征非常接近1就可以看出。穆里尼奥对阵强队的战绩也是最好的,在国家德比决定联赛冠军的年代,穆里尼奥对阵巴萨的联赛战绩为2胜-2平-2负。但穆里尼奥带领皇马只赢得一个联赛冠军的原因很可能是:1)缺乏能够在主力球员轮休时候保持成绩的优秀替补球员(2010/11赛季末期就是这样,当时皇马在一个月里在联赛,国王杯和欧冠赛场上和巴萨有4场比赛,皇马急需替补球员能站出来帮助球队)和2)他和球队高层之间的内部问题毁掉了2012/13赛季,在这个赛季里,皇马取得了十年来最差的西甲开局之一(穆里尼奥的大部分平局和失利集中在此赛季)。
While Mourinho’s Madrid side was a very good team (elo_madrid is 2053 in his first cluster, and an elo of above 2000 usually translates to being one of the 3 best teams in the world), they boast one of the worst final_league_position features of 12, with an elo_opp of 1694 to match it. Both of those features being as low as they are suggests that Madrid was dropping points to some of the worst teams in La Liga. This is, of course, true: during the 2010/11 season for example, Madrid dropped points 5 times to teams that finished 13th or lower.
虽然穆里尼奥治下的皇马是一支非常优秀的球队(在他的第一个集群中elo_madrid特征值是2053,而Elo值超过2000的球队通常是世界上最好的3支球队之一),但他却也带来了最差final_league_position特征的赛季,这赛季elo_opp特征值为1694,这两个特征值都这么低,说明皇马在西甲弱队上丢了好多分导致了很差的联赛排名。这当然是事实:例如在2010/11赛季,皇马有5次在排名第13名或更低的球队上丢分。
Ancelotti represents the beginning of an era where Madrid found themselves consistently dropping points to not only Barcelona, but Atletico Madrid as well. Ancelotti’s 2 League seasons at Madrid were not disasters by any means (and perhaps should be seen as just unlucky), losing by 3 points in his first season and 2 in his second. What truly stands out as a blemish on his record during this time though is his poor performances against Barca and Atletico in the league; Real Madrid almost always lost these games and gave away valuable points to their direct competition. We know now that it’s essentially a requirement to beat your closest competition, so these games can be seen as the definitive moments of Ancelotti’s time in Madrid (as far as the league is concerned). Perhaps without injuries to key players during the second half of each season - Jese in 2013/14 and Modric in 2014/15 come to mind - Ancelotti’s time in Madrid would’ve ended differently. These key injuries also influence the matchday feature and skew it more towards the second half of the season.
安切洛蒂代表了一个时代的开端,在这个时代皇马发现自己不仅不断地丢分给巴萨,也丢分给马竞。但安切洛蒂在皇马的2个联赛赛季,无论如何都不能说是灾难(也许应该被认为只是运气不好),他的第一个赛季只差冠军3分,第二个赛季只差冠军2分。不过在这2个赛季真正让他的记录上留下污点的是他在联赛中对阵巴萨和马竞时的糟糕战绩;皇马几乎总是在这些比赛中输掉比赛把宝贵的分数让给了直接竞争对手。现在我们知道,本质上来说击败最接近自己的竞争对手是必须的,所以这几场比赛失利可以看成是安切洛蒂在皇马时期的决定性时刻(就联赛而言)。或许如果没有每个赛季下半程关键球员的受伤(2013/14赛季的赫塞和2014/15赛季的莫德里奇),安切洛蒂在皇马会有不同的结局。这些关键的伤病也影响了matchday 特征,让matchday 特征更偏向于下半赛季。
Losing to other good La Liga teams was also a trend, as the overall quality of the league was getting better during this time; keep in mind that Spanish teams were routinely going deep into both the Champions League and the Europa league (A competition for teams that were just a step below the quality needed for the Champions League). Teams like Valencia, Sevilla, Villarreal, Athletic Bilbao and Celta Vigo all had at least one successful season in the Europa League during the middle part of the decade, and more often than not these good performances in Europe translated to a better and more competitive Top 8 in La Liga. Ancelotti’s Madrid tended to drop points against these types of teams when they had to play them at their stadiums, which are known for being some of the toughest places to play in Spain. This is evidenced by elo_opp (1775) and final_league_position (between 7 and 8) being quite high, and home_0_away_1 being close to 1 for Ancelotti’s second cluster.
输给其它西甲强队也是一种趋势,因为在这段时间里联赛的整体水平越来越好;要知道西班牙强队经常在欧冠和欧联之间徘徊。像瓦伦西亚、塞维利亚、比利亚雷亚尔、毕尔巴鄂竞技和塞尔塔维戈这样的球队,在这十年的中段时间里都至少在欧联中交出了一个良好的赛季答卷,而同样的这些在欧洲赛场上表现出色的球队都是西甲联赛中更好、更有竞争力的八强球队。安切洛蒂的皇马在对阵这类球队时往往丢分,而这类球队的主场是西甲出了名的最难打的魔鬼主场。相当高的elo_opp特征(1775)和final_league_position(7和8之间),安切洛蒂的第二个集群中home_0_away_1特征接近1都证明了这点。
Zidane represents a deviation from the past two managers, as there wasn’t necessarily a clear-cut formula to winning the league in this era; when Madrid won the league in 2016/17 under the Frenchman for example, they had lost and drawn to Barcelona that season. While Madrid’s record against other top teams wasn’t amazing under Zidane during this time, I wouldn’t label it as the only reason that this team wasn’t able to win the league more than once in this 3 season time frame. Other issues, such as a poor start under Benitez in 2015/16 (Who was fired at the midpoint of the season and Zidane was brought in to replace him), a bigger focus on the Champions League, not having the squad depth and squad quality to compete for more than one competition, not being consistent enough to perform well over a 38-game period, all play a role in the lack of success. Like with Mourinho, it’s important to note that most of these losses came in one season: 2017/18. By no means is Zidane’s league record with the team poor, it just could be a bit better.
齐达内明显和前面两任主教练有着不同,因为他的治下皇马没有明确的联赛夺冠规划,比如说虽然2016/17赛季皇马在法国人的带领下夺得联赛冠军,而在那个赛季他们对阵巴萨的成绩仅仅是一负一平。显然皇马在齐达内的带领下,对阵其它强队的战绩并不惊人,但我不会单单因为这个就认定这完全导致了在这3个赛季的时间里球队只赢得一次联赛冠军。这样的结果还其它原因,比如2015/16赛季在贝尼特斯手下开局不佳(贝尼特斯在赛季中段被炒掉,齐达内中途接手球队),更看重欧冠,没有足够的阵容深度和阵容质量来多线作战,在38场联赛中表现不够稳定,这些都是导致球队成绩不理想的原因。需要注意的是和穆里尼奥一样,齐达内丢分比赛大多都是集中在一个赛季:2017/18赛季。这绝不是说明齐达内的联赛战绩差,只是他可以做得更好一点。
While the final_league_position is higher for Zidane’s first cluster, elo_opp is nearly the same as Mourinho’s first cluster. The reason for this is not entirely clear, but more research towards the correlation between league positions in La Liga and elo rank over time would most likely give an answer.
虽然齐达内的第一集群中的final_league_position特征比穆里尼奥的高,但elo_opp特征几乎与穆里尼奥的相同。这其中的原因并不完全清楚,但对西甲联赛的联赛排名与elo排名之间的相关性进行更多的研究,很可能会给出答案。
Guardiola and Enrique
瓜迪奥拉和恩里克
Silhouette analysis gives us k = 3 as a better option because there aren’t any points that produce a negative silhouette score and each cluster is more evenly spaced. For the Guardiola data, k = 2 actually produces a cluster with a silhouette score lower than the average:
轮廓分析告诉我们k = 3是最好的选择,因为k=3时候没有任何一个点产生负的轮廓得分,而且每个集群的间距更均匀。对于瓜迪奥拉的数据,k=2的轮廓得分低于平均值:
With the following clusters:
有如下集群:
Guardiola’s first cluster only includes 2 games (presumably against Real Madrid), so there’s not much analysis to be done for that cluster. More interesting insights are found in the other two, specifically from the features home_0_away_1 and points_won. Guardiola’s third cluster consists only of away games, mainly against teams from the upper half of the table. Guardiola only lost one of these games during his last 3 years as manager of Barca, which is wildly impressive. What’s even more astounding is that despite this dominance over the best teams in La Liga, Barca only won the league by 3 and 4 points in the 2009/10 and 2010/11 seasons, respectively, and ended up losing the league by 9 points in his final season.
瓜迪奥拉的第一集群只包含了2场比赛(估计是对阵皇马的比赛),所以这个集群没什么好分析的。另外两个集群有更多有趣的地方,特别是home_0_away_1和points_won特征。瓜迪奥拉的第三集群全部是客场比赛,并且此集群中的对手主要是积分榜上半区的队伍。瓜迪奥拉在担任巴萨主帅的最后3年中,只输掉这些比赛中的一场,这非常了不起。更令人惊讶的是,尽管巴萨对西甲强队占据了如此大的优势,但在2009/10赛季和2010/11赛季,巴萨分别只以3分和4分的分差赢得联赛冠军,而在他的最后一个赛季,巴萨最终以9分之差丢掉了联赛冠军。
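Reading a cluster through its features, as done above, amounts to averaging each feature within the cluster. The sketch below shows one way to do that with pandas. The rows are invented purely for illustration (they are not the paper's data), and it assumes each row is a league game in which points were dropped, with cluster labels already assigned and numbered from 0.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the features in the text.
games = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 1, 2, 2, 2],
    "home_0_away_1": [0, 1, 0, 0, 1, 1, 1, 1],       # 0 = home game, 1 = away game
    "points_won": [0, 1, 1, 1, 0, 1, 1, 0],          # 0 = loss, 1 = draw
    "final_league_position": [2, 2, 14, 11, 16, 5, 7, 4],
})

# One row per cluster: size, share of away games, average points, losses,
# and the average final league position of the opponents.
profile = games.groupby("cluster").agg(
    games=("points_won", "size"),
    away_share=("home_0_away_1", "mean"),
    avg_points_won=("points_won", "mean"),
    losses=("points_won", lambda s: int((s == 0).sum())),
    avg_opp_position=("final_league_position", "mean"),
)
print(profile)
```

In this toy table an away_share of 1.0 means the cluster contains only away games, and a low avg_opp_position means the opponents finished near the top of the table.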
Similar to Mourinho’s Madrid, something that’s becoming a bit of a theme with Barcelona is dropping points to teams in the bottom half of the table. The data for 2009/10-2017/18 shows that while Barcelona dropped points to low/mid-table teams in fewer games than Madrid did (53 games compared to 60), the average final_league_position in Madrid’s games was a bit higher up the table than in Barca’s (10.8 compared to 11.6). Here, it seems that Guardiola follows the same trend as Mourinho of performing worse against teams in the low/mid level of La Liga. While on average he rarely dropped points to teams close to Barcelona’s quality in the league (and rarely lost to them; the dropped points were usually draws), he dropped far more points to these low-quality teams. Perhaps the typical defensive style of play of these teams (sit back, look to block off passing lanes and hit Barca on the counter) is too much to break down week in, week out, in contrast to how top teams typically tend to play (more open, attacking football).
与穆里尼奥的皇马类似,巴萨的某些趋势正在成为日常,那就是在积分榜下半区的球队身上丢分。2009/10-2017/18赛季的数据显示,虽然巴萨在中下游球队的丢分幅度比皇马的要小(53场对60场),但皇马的 final_league_position特征却比巴萨略高(10.8到11.6)。在这里,瓜迪奥拉似乎和穆里尼奥一样,对阵西甲中下游球队的表现更差。虽然平均来说,他在联赛中很少在实力接近巴萨的强队上丢分(从来没有真正输给过他们,通常都是平局),但他在下游球队上丢分更多。或许这些球队典型的防守风格(防守后腰,拦截传球,打巴萨的反击)与顶级球队通常的打法(更多的开放性、攻势足球)形成鲜明对比,周而复始的防守风格让巴萨更难应付。
Luis Enrique posts even smaller margins than Guardiola in his league titles, winning by 2 and 1 points in the 2014/15 and 2015/16 seasons, respectively. While not as stellar as Guardiola’s results, Enrique still posts a generally good record against other top competition in La Liga, securing mostly draws while on the road against these teams. The xg_diff for his first cluster stands at 0.716, suggesting that his teams not only got results but played well - at least on the attacking end of the pitch. Enrique also has the same problem that Guardiola did in games against the lesser teams of La Liga, recording most of his dropped points (and losses in general) against lower and mid-table teams.
恩里克在2014/15赛季和2015/16赛季分别以2分和1分的优势夺得联赛冠军,分差优势比瓜迪奥拉的还小。虽然恩里克的成绩不如瓜迪奥拉那么出色,但他在对阵西甲其它强队的比赛中仍然取得了不错的战绩,在客场对阵这些球队时恩里克基本都能取得平局。他的第一集群中的Xg_diff为0.716,这表明他的球队不仅取得了好成绩,而且场上表现良好--至少在进攻端表现是出色的。恩里克也有和瓜迪奥拉同样的在较弱球队丢分的问题,他的大部分丢分集中在中下游球队。
Title Winning Seasons
夺冠赛季
Silhouette Analysis gives k = 2 as the optimal clustering for each dataset in this category. One thing to note is that the Real Madrid dataset for this section was only made up of 15 games since there were only 2 league titles in this time period for them, so any analysis is being done with a small sample size:
从轮廓分析可得出k=2是这个类别中每个数据集的最佳选择。有一点需要注意的是这个分析中皇马数据集只由15场比赛组成,因为在这十年皇马只赢得2个联赛冠军,所以这个类别中对皇马的分析都是以小样本量进行的。
Barcelona’s away performances really shine in this category. The data tells us that Barca typically dropped points away from home in title winning seasons, which is expected; what’s surprising is that they typically avoided defeat in these scenarios and were able to secure draws. The data is still slightly skewed towards the second half of the season, which again suggests that Barca starts their league seasons very strongly.
巴萨在这个类别中的客场战绩真是“出众”。这些数据告诉我们,巴萨在夺冠的几个赛季里,通常是在客场丢分,这并不出人意料;但令人惊讶的是,他们通常在这种情况下能避免输球并拿到平局的1分。(丢分)数据还是略微集中于下半赛季,这再次说明巴萨的联赛开局非常强势。
Looking at Madrid’s second cluster shows that even in their title-winning seasons, they still dropped points to other contenders. While this reads as a sentence that criticizes Madrid, I actually believe that this should be looked at as a source of optimism; earlier in this paper I talked about how crucial it is to win games against Atletico and Barca if Madrid want to have a chance of winning the league, but those games can lose their meaning if Madrid play well against the other teams in the league. Cluster 1 suggests exactly this: Madrid almost never dropped points, and when they did, the results were losses rather than draws. While it seems a bit obvious to suggest that to win the league Madrid should just play better against the rest of the league, it actually seems like a viable and worthwhile idea to explore. The only problem is that this conclusion is based on the small sample of two title-winning seasons.
纵观皇马的第二集群,即使是在夺冠的赛季,他们仍在争冠对手那里丢分。虽然这句话读起来像是在批评皇马,但实际上,我认为这应该被看作是一种乐观的来源;在本文的早些时候,我曾谈到如果皇马想要获得联赛冠军,赢下对阵马竞和巴萨的比赛是多么的关键,但如果皇马在对阵其它球队的比赛中发挥出色,这些比赛就会失去意义。第一集群很精确的展示:皇马几乎没有丢过分,而当他们丢分的时候往往失利多过平局。很明显的可以看出皇马要想赢得联赛冠军就应该在对阵(争冠)球队时踢得更好,实际上这确实是一个可行的、值得探讨的想法。该结论唯一的问题是该分析仅仅基于两个夺冠赛季的小样本数据形成的。
Still though, imagine if Madrid won every single game they played against the bottom half of the table, or if they were able to get positive results from every away game against at least three quarters of the league, while Barca continued to drop points against these same teams as the data from previous sections has shown us. While it wouldn’t guarantee a title win for Madrid, it certainly would put them in a good position to end with around 90 points (assuming they perform how they normally do in the other games of the season). Securing points against the lesser teams of the league could save Madrid a lot of headaches by the time the games against the bigger teams come around.
就如前几节的数据告诉我们的那样,如果皇马对阵积分榜下半区的球队都能取得胜利,或者说他们在至少3/4的联赛客场都能取得更好的成绩,而巴萨在对阵这些同样的球队时却不断地丢分——虽然这并不能保证皇马100%赢得冠军,但这肯定会让他们处于一个很好的位置,以90分左右的成绩结束赛季(假设他们在本赛季的其他比赛中的表现一般)。在对阵联赛弱球队时多获得积分,可以弥补对阵联赛强队时候的丢分,这可以让皇马省去很多麻烦。
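The 90-point scenario can be checked with some quick arithmetic. The sketch below assumes a 20-team league (so 10 bottom-half opponents, each played home and away) and 3 points per win; the function name required_ppg_elsewhere is made up for this illustration.

```python
GAMES_PER_SEASON = 38   # from the article: a 38-game league season
TEAMS = 20              # assumption: 20 teams in La Liga
POINTS_PER_WIN = 3      # assumption: standard 3 points for a win


def required_ppg_elsewhere(target_points, guaranteed_wins,
                           total_games=GAMES_PER_SEASON):
    """Points per game needed in the remaining games to reach target_points."""
    banked = guaranteed_wins * POINTS_PER_WIN
    remaining_games = total_games - guaranteed_wins
    return (target_points - banked) / remaining_games


# Home and away against the 10 bottom-half teams gives 20 games.
bottom_half_games = (TEAMS // 2) * 2
ppg = required_ppg_elsewhere(90, bottom_half_games)

print(f"{bottom_half_games} wins bank {bottom_half_games * POINTS_PER_WIN} points; "
      f"the other {GAMES_PER_SEASON - bottom_half_games} games "
      f"need {ppg:.2f} points per game")
# prints: 20 wins bank 60 points; the other 18 games need 1.67 points per game
```

Under these assumptions, a perfect record against the bottom half leaves Madrid needing 30 points from the other 18 games, a little under two points per game.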
Title Losing Seasons
未夺冠赛季
Silhouette Analysis gives k = 2 as the optimal clustering for each dataset in this category. Like the Real Madrid dataset for the previous section, the Barcelona dataset for this section was only made up of a few games since there were only 3 league title losses in this time period for them:
从轮廓分析可得出k=2是这个类别中每个数据集的最佳选择。和上一节中皇马夺冠数据集一样,巴萨数据集也只是由少量比赛组成,因为在这十年巴萨只丢了3个联赛冠军。
The arguments I made in the previous sections lose some of their shine when you examine this dataset for Madrid. There’s a much larger pool of data to base our observations on, and here the overwhelming trend is losing games to the best and/or second-best teams in La Liga. Again, this gives me some ideas as to what Madrid should do to stop losing the league title. The home_0_away_1 feature averages out to 0.5 for the first cluster, meaning these dropped points are split evenly between home and away games; perhaps the key is making their home ground, the Santiago Bernabeu, a place where they can always secure a win or draw so that they don’t drop as many valuable points to their direct rivals?
当研究皇马的这个数据集时,我在前几节中分析出的结论会显得有些无力,那毕竟在未夺冠数据分析里,皇马有着更大的样本量来支持我们的研究:明显的趋势是皇马输给了西甲最佳或第二名的顶尖球队。这再次让我对皇马应该怎么做才能不再丢失联赛冠军有了一些想法。在第一集群中Home_0_away_1特征的平均值达到了0.5。也许其中的关键就是让皇马的主场伯纳乌成为“真正的”主场:保证赢球或至少平局,这样他们就不会把那么多宝贵的分数送给直接争冠对手?
Madrid also seem to be dropping points mostly to mid-table teams, which isn’t unusual for either them or Barca. A common trend is that Madrid tends to lose these games fairly early in the season, as shown by the matchday feature of the second cluster. This brings back some validity to my claim that Madrid should focus on winning games against teams in the bottom half of the table so that they have room to make errors against bigger teams. Still though, given how closely nearly every league title has been contested this decade, it seems that either strategy - beating lower-ranked teams or trying to beat their direct rivals - can pay off massively for Madrid.
皇马似乎也在大多数中游球队身上丢分,这都成了皇马和巴萨的日常。一个普遍的趋势就是皇马在赛季初往往会在这些(中游球队)比赛中输掉一些,这一点在第二集群的matchday 特征中可以看出。这让我的观点有了一定的支持,那就是皇马应该专注于对阵积分榜下半区球队的比赛,这样他们在对阵强队时才有犯错的空间。不过,由于这十年来几乎每一个联赛冠军的争夺都很激烈,所以不管是击败排名较低的球队还是击败争冠对手,都能给皇马带来巨大的回报。
Conclusion
结论
The biggest patterns included:
1) dropping points to their direct competitors both home and away
主客场都在争冠对手身上丢分
2) being unable to pull out results in away games against mid table teams and lower
在客场对阵中下游球队无法取得满意的成绩
3) creating enough offensive opportunities for themselves but not capitalizing on them (evidenced by their xg_diff being 1 in the first cluster of the 2014/15-2017/18 dataset)
创造了足够的进攻机会,但得分能力太差(2014/15-2017/18数据集的第一集群中,皇马的xg_diff为1就是证明了这一点)
4) not being able to start their league campaigns with good form.
不能以良好的状态开始赛季的征程。
Solving any one of these problems will be beneficial for Madrid in the long run, as the smallest of margins tend to make a difference in the title race. Being able to secure even 3 or 4 more points a season can quite literally be the difference between 1st and 2nd, so in the future Real would be doing themselves a favor by fixing one of these 4 areas.
解决这些问题中的任何一个,从长远来看都会对皇马有利,因为很小的/边缘的因素往往会影响冠军归属。一个赛季多拿3或4分往往能决定球队是排第一名还是第二名,所以将来皇马应该在这四个方面中的一个进行改进。
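To put the "3 or 4 more points" claim against the title margins quoted earlier in this article, here is a small sketch. It uses only the four Barcelona winning margins mentioned above (3, 4, 2 and 1 points), treats "drawing level or better" as the threshold, and ignores tie-breakers; the function name seasons_within is hypothetical.

```python
# Barcelona's winning margins as quoted earlier in this article, in points.
title_margins = {"2009/10": 3, "2010/11": 4, "2014/15": 2, "2015/16": 1}


def seasons_within(margins, extra_points):
    """Seasons where extra_points more would have put the runner-up level or ahead."""
    return [season for season, gap in margins.items() if gap <= extra_points]


for extra in (1, 2, 3, 4):
    seasons = seasons_within(title_margins, extra)
    print(f"+{extra} points: {len(seasons)} of {len(title_margins)} seasons -> {seasons}")
```

With 4 extra points the runner-up would have been level or ahead in all four of these seasons, and with 3 extra points in three of them, which is the sense in which a handful of points separates 1st from 2nd.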