
Estimating the average duration of each game state in a football match, given implied goals for the home and away teams

Code Review user
Asked on 2022-09-19 23:03:09
1 answer, 66 views, 0 followers, score 2

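What my program does

I am trying to estimate how many minutes, on average, a football match spends in each game state, given the implied goals of the two teams.

Problem domain

For our purposes, there are three possible game states:

  • Home team is ahead, e.g. 1-0.
  • Teams are drawing, e.g. 2-2.
  • Away team is ahead, e.g. 0-2.

Implied goals are the number of goals a team is expected to score on average, e.g. 1.80 for the home team and 1.45 for the away team.

For simplicity, we can assume that:

  • A team's scoring rate does not depend on the current score.
  • The probability of scoring is the same in every part of the match.
  • If one team scores in a given minute, it does not affect the other team's chance of scoring in that same minute.

A typical football match consists of 90 minutes of regular time, plus typically 1 minute of injury time in the first half and 4 minutes in the second half, for a total of 95 minutes.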
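
As a rough illustration of these assumptions, the per-minute scoring probability is simply the implied goals spread evenly over the 95 minutes. The helper name `per_minute_probability` below is made up for this sketch and is not part of the original program.

```python
MATCH_LENGTH = 95  # 90 regular minutes + 1 + 4 minutes of injury time


def per_minute_probability(implied_goals: float,
                           match_length: int = MATCH_LENGTH) -> float:
    """Hypothetical helper: chance that a team scores in any single minute,
    assuming a constant scoring rate and at most one goal per minute."""
    return implied_goals / match_length


home_p = per_minute_probability(1.80)
away_p = per_minute_probability(1.45)
print(f"{home_p:.4f} {away_p:.4f}")  # prints "0.0189 0.0153"
```

Note that this model caps each team at one goal per minute, which is a small approximation at these rates.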

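Chosen approach

I use the following algorithm for the task:

  1. For each team, calculate the probability of scoring in a given minute.
  2. For each team, build a bitmap covering the match minute by minute (95 minutes in our case), where 1 means the team scored in that minute and 0 means it did not.
  3. Calculate each team's cumulative score as of each minute of the match.
  4. Calculate the duration of each game state by comparing the two teams' cumulative scores.
  5. Repeat the simulation for the desired number of trials.
  6. Calculate the mean duration of each game state.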
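
To make steps 2 to 4 concrete, here is a small hand-written example on an 8-minute "match" with fixed bitmaps instead of random ones. The bitmaps and the H/D/A labels are invented for illustration only.

```python
from collections import Counter
from itertools import accumulate

# Step 2: one bitmap per team (1 = scored in that minute), hand-picked here
home_bitmap = [0, 0, 1, 0, 0, 0, 0, 0]
away_bitmap = [0, 0, 0, 0, 1, 0, 1, 0]

# Step 3: cumulative score as of each minute
home_score = list(accumulate(home_bitmap))  # [0, 0, 1, 1, 1, 1, 1, 1]
away_score = list(accumulate(away_bitmap))  # [0, 0, 0, 0, 1, 1, 2, 2]

# Step 4: label every minute with its game state, then count the labels
timeline = "".join(
    "H" if h > a else "A" if a > h else "D"
    for h, a in zip(home_score, away_score)
)
durations = Counter(timeline)

print(timeline)  # prints "DDHHDDAA"
print(durations["H"], durations["D"], durations["A"])  # prints "2 4 2"
```

Steps 5 and 6 then amount to repeating this with random bitmaps and averaging the three counts.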

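My question

I am interested in what alternatives there are to my solution, both conceptually and in terms of implementation. Although I know my program could be sped up by orders of magnitude by making fuller use of numpy, that is not so important in this particular case, because it runs in a few seconds and is not meant to be called many times in quick succession.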

from dataclasses import dataclass
import random
from statistics import mean

from numpy import cumsum

MATCH_LENGTH = 95
TRIALS = 100_000

@dataclass
class MatchGameState:
    """
    Represents for how many minutes of a particular game:
      - home team was ahead
      - teams were drawing
      - away team was ahead
    """
    home_ahead: int
    draw: int
    away_ahead: int
    
MatchGameStates = list[MatchGameState]

def mean_game_state(home_implied_goals: float, 
                    away_implied_goals: float, 
                    match_length: int=MATCH_LENGTH, 
                    trials: int=TRIALS) -> tuple[float, float, float]:
    """
    Given match length in minutes and implied goals for home and away teams, 
    calculates for how many minutes per match in average home team will be ahead,
    there will be a draw and away team will be ahead.
    """
    random.seed()
    sims = [single_match_game_state(home_implied_goals, away_implied_goals, match_length) for _ in range(trials)]
    home_ahead_mean = round(mean(s.home_ahead for s in sims), 2)
    draw_mean = round(mean(s.draw for s in sims), 2)
    away_ahead_mean = round(mean(s.away_ahead for s in sims), 2)
    return home_ahead_mean, draw_mean, away_ahead_mean

def single_match_game_state(home_implied_goals: float, 
                            away_implied_goals: float, 
                            match_length: int=95) -> MatchGameState:
    """
    Given match length in minutes and implied goals for home and away teams,
    simulates teams scoring minute by minute for a particular game.
    Returns MatchGameState for the game.
    """
    # Probability to score in a given minute
    home_goals_per_min = home_implied_goals / match_length
    away_goals_per_min = away_implied_goals / match_length
    # For every minute in a game, 1 if a team scored in that minute, 0 otherwise
    home_outcomes = [random.random() for _ in range(match_length)]
    away_outcomes = [random.random() for _ in range(match_length)]
    home_goals_by_minute = [int(home_outcome < home_goals_per_min) for home_outcome in home_outcomes]
    away_goals_by_minute = [int(away_outcome < away_goals_per_min) for away_outcome in away_outcomes]
    # How many goals a team scored by a particular minute of the game
    home_cumulative_goals = cumsum(home_goals_by_minute)
    away_cumulative_goals = cumsum(away_goals_by_minute)
    home_ahead, draw, away_ahead = 0, 0, 0
    for home_cumulative_score, away_cumulative_score in zip(home_cumulative_goals, away_cumulative_goals):
        if home_cumulative_score > away_cumulative_score:
            home_ahead += 1
        elif home_cumulative_score == away_cumulative_score:
            draw += 1
        else:
            away_ahead += 1
    assert home_ahead + draw + away_ahead == match_length 
    return MatchGameState(home_ahead, draw, away_ahead)

if __name__ == '__main__':
    print(mean_game_state(2.15, 1.20))

1 Answer

Code Review user

Accepted answer

Posted on 2022-09-20 00:01:26

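It's evident that you've been working on general Python quality, and from that perspective this code is quite good. Where it falls down is its use of Numpy.

In a sense, good Numpy code doesn't look like what we usually think of as Pythonic code. Your dataclass has to go, and your "single_" function has to go. The operations in "single_" should be pulled up into "mean_" and vectorised over the number of trials. References to the built-in math and random libraries go away. That way you can run the code in less time and/or run more trials.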
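
As a toy-scale sketch of what "vectorised over the number of trials" can look like, the snippet below uses deliberately tiny sizes and exaggerated probabilities so the array shapes are easy to follow. The sizes, seed and probabilities are illustrative assumptions, not values from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
trials, minutes = 4, 5                  # toy sizes, far smaller than a real run
goals_per_min = np.array([0.5, 0.25])   # exaggerated (home, away) probabilities

# One uniform draw per trial, per minute, per team
outcomes = rng.uniform(size=(trials, minutes, 2))

# The length-2 probability array broadcasts across the last axis
goals_by_minute = outcomes < goals_per_min

# Running score along the minutes axis, for all trials at once
cumulative = goals_by_minute.cumsum(axis=1)

# Transposing puts the team axis first, so it unpacks into two arrays
home, away = cumulative.T

print(goals_by_minute.shape, cumulative.shape, home.shape)
# prints "(4, 5, 2) (4, 5, 2) (5, 4)"
```

No Python-level loop over trials or minutes remains; every step operates on the whole array.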

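Even if you were to skip vectorisation (which you shouldn't), there are some other small things to change:

Your spacing between functions falls short of PEP8; you need two blank lines.

MatchGameStates is unused, so delete it.

int=MATCH_LENGTH is also not PEP8-compliant; it needs spaces around the equals sign.

random.seed() isn't the responsibility of mean_game_state; whether results should be repeatable is a concern of the top-level program. Similarly, round is a display concern and shouldn't be baked into your business-logic function.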
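
One possible way to separate these concerns, sketched without Numpy: the entry point owns the random generator and the formatting, while the simulation function returns an unrounded value. The names `mean_goals` and `main`, and the seed 42, are hypothetical choices for this sketch.

```python
import random
from statistics import mean


def mean_goals(rng: random.Random, implied_goals: float,
               match_length: int = 95, trials: int = 1_000) -> float:
    """Hypothetical helper: average simulated goals per match, unrounded."""
    goals_per_min = implied_goals / match_length
    return mean(
        sum(rng.random() < goals_per_min for _ in range(match_length))
        for _ in range(trials)
    )


def main() -> None:
    # The top level decides whether the run should be repeatable...
    rng = random.Random(42)
    result = mean_goals(rng, implied_goals=2.15)
    # ...and how many decimals to display.
    print(f"{result:.2f}")


if __name__ == "__main__":
    main()
```

Passing the generator in also makes the function easy to test with a fixed seed.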

MatchGameState would be simpler as a NamedTuple than as a dataclass.
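
A minimal sketch of that change, keeping the field names from the question; the minute values are made up for illustration.

```python
from typing import NamedTuple


class MatchGameState(NamedTuple):
    """Minutes of one match spent in each game state."""
    home_ahead: int
    draw: int
    away_ahead: int


state = MatchGameState(home_ahead=40, draw=45, away_ahead=10)

# Fields are available by name, and the value still unpacks like a tuple
home_ahead, draw, away_ahead = state
print(state.draw, sum(state))  # prints "45 95"
```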

Even without Numpy, the home_ahead, draw and away_ahead variables can be computed using sum() over a generator.
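
For instance, the counting loop could be replaced along these lines. The short goal lists are invented test data, and `itertools.accumulate` stands in for the cumulative sum so the sketch has no Numpy dependency.

```python
from itertools import accumulate

# Invented six-minute example: 1 = the team scored in that minute
home_goals_by_minute = [0, 1, 0, 0, 0, 0]
away_goals_by_minute = [0, 0, 0, 1, 1, 0]

# Pair up the two running scores minute by minute
scores = list(zip(accumulate(home_goals_by_minute),
                  accumulate(away_goals_by_minute)))

# Each comparison yields a bool, and sum() counts the True values
home_ahead = sum(h > a for h, a in scores)
draw = sum(h == a for h, a in scores)
away_ahead = sum(h < a for h, a in scores)

print(home_ahead, draw, away_ahead)  # prints "2 2 2"
```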

Suggested

import numpy as np
from numpy.random import default_rng

MATCH_LENGTH = 95
TRIALS = 100_000


rand = default_rng()


def mean_game_state(
    home_implied_goals: float,
    away_implied_goals: float,
    match_length_minutes: int = MATCH_LENGTH,
    trials: int = TRIALS,
) -> tuple[float, float, float]:
    """
    Given match length in minutes and implied goals for home and away teams,
    calculates for how many minutes per match in average home team will be ahead,
    there will be a draw and away team will be ahead.
    """

    implied_goals = np.array((home_implied_goals, away_implied_goals))

    # Probability to score in a given minute
    goals_per_min = implied_goals / match_length_minutes

    # For every minute in a game, 1 if a team scored in that minute, 0 otherwise
    outcomes = rand.uniform(size=(trials, match_length_minutes, 2))
    goals_by_minute = outcomes < goals_per_min

    # How many goals a team scored by a particular minute of the game
    home_cumulative_goals, away_cumulative_goals = goals_by_minute.cumsum(axis=1).T
    diff = home_cumulative_goals - away_cumulative_goals

    home_ahead = np.count_nonzero(diff > 0, axis=0)
    away_ahead = np.count_nonzero(diff < 0, axis=0)
    draw = np.count_nonzero(diff == 0, axis=0)

    return home_ahead.mean(), draw.mean(), away_ahead.mean()


if __name__ == '__main__':
    print(mean_game_state(home_implied_goals=2.15, away_implied_goals=1.20))
Score: 2
Original page content provided by Code Review.
Original link:

https://codereview.stackexchange.com/questions/279823
