Complex data extraction in Python

Stack Overflow user
Asked 2018-01-10 08:10:19
1 answer, 230 views, 0 followers, 1 vote

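I need some help getting started on a program. I play a few online poker tournaments every week. It turns out the site I use records hand histories and saves them to my hard drive as .txt files. Unfortunately, the format of the data is a bit rough. I'd like to create a program that goes through each hand and tells me how much I won or lost. I've pasted a sample of one hand below, and I'd like to extract the following information from each hand.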

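  1. Blinds and ante. In the example further down, you can see "Player Player8 has small blind (250)" and "Player Player1 has big blind (500)". Above that, each player posts an ante, e.g. "Player Hero ante (50)". In this case, small blind = 250, big blind = 500, ante = 50.
  2. My stack size. I've named my player "Hero". My stack size is on line 6 of the hand, which reads "Seat 3: Hero (17595).". In this case, my stack size is 17595.
  3. My hand. In this example, it says "Player Hero received card: [10c]" and "Player Hero received card: [7h]". So my hand is "10c7h".
  4. The number of players. In the sample, there are 8 players.
  5. My position. This one might be tricky. I've decided to start at the big blind and assign it a value of 0. Small blind = 1, button = 2, and so on. This somewhat goes against "poker logic", but it makes more sense to me from a programming standpoint, because there is always a big blind, while some of the other positions depend on how many players are at the table.
  6. Profit/loss. This is at the bottom of the text, under the "Summary" label: "Player Hero does not show cards.Bets: 50. Collects: 0. Loses: 50." In this case, my profit is -50 (i.e. a loss of 50), which means I paid the 50 ante and folded my hand.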

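Below is what the .txt file looks like. Note that this is a single hand. In the actual .txt file, this hand would be followed by dozens or hundreds of other hands. The first line of a hand always starts with "Game started" and the last line always starts with "Game ended".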

Game started at: 2018/1/9 10:14:10
Game ID: 1094127759 250/500 $5,000 GTD, Table 4 (Hold'em)
Seat 7 is the button
Seat 1: Player1 (9650).
Seat 2: Player2 (19433).
Seat 3: Hero (17595).
Seat 4: Player4 (8900).
Seat 5: Player5 (12670).
Seat 6: Player6 (11187).
Seat 7: Player7 (11300).
Seat 8: Player8 (17603).
Player Player8 ante (50)
Player Player1 ante (50)
Player Player2 ante (50)
Player Hero ante (50)
Player Player4 ante (50)
Player Player5 ante (50)
Player Player6 ante (50)
Player Player7 ante (50)
Player Player8 has small blind (250)
Player Player1 has big blind (500)
Player Player8 received a card.
Player Player8 received a card.
Player Player1 received a card.
Player Player1 received a card.
Player Player2 received a card.
Player Player2 received a card.
Player Hero received card: [10c]
Player Hero received card: [7h]
Player Player4 received a card.
Player Player4 received a card.
Player Player5 received a card.
Player Player5 received a card.
Player Player6 received a card.
Player Player6 received a card.
Player Player7 received a card.
Player Player7 received a card.
Player Player2 folds
Player Hero folds
Player Player4 raises (1000)
Player Player5 folds
Player Player6 folds
Player Player7 folds
Player Player8 folds
Player Player1 folds
Uncalled bet (500) returned to Player4
Player Player4 mucks cards
------ Summary ------
Pot: 1650
Player Player1 does not show cards.Bets: 550. Collects: 0. Loses: 550.
Player Player2 does not show cards.Bets: 50. Collects: 0. Loses: 50.
Player Hero does not show cards.Bets: 50. Collects: 0. Loses: 50.
*Player Player4 mucks (does not show cards). Bets: 550. Collects: 1650. Wins: 1100.
Player Player5 does not show cards.Bets: 50. Collects: 0. Loses: 50.
Player Player6 does not show cards.Bets: 50. Collects: 0. Loses: 50.
Player Player7 does not show cards.Bets: 50. Collects: 0. Loses: 50.
Player Player8 does not show cards.Bets: 300. Collects: 0. Loses: 300.
Game ended at: 2018/1/9 10:14:52

Any help is greatly appreciated, even just some ideas about how I might go about this or what I should learn. In my mind, the output should look like this:

HandNumber = 000001
BigBlind = 500
Ante = 50
Players = 8
StackSize = 17595
Hand = 10c7h
Position = 6    # small blind = 1; add 5 since I'm 5 positions removed
Profit = -50

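My experience level: I've been taking online courses in development, data science, and SQL for about 6 months. I'm somewhat familiar with classes, but I don't have much experience writing my own. I have written a few programs that use regular expressions to extract data from financial statements.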

1 Answer

Stack Overflow user (accepted answer)
Answered 2018-01-10 10:33:31

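This is easiest to solve by using a regular expression to split the file into individual games, then more regular expressions to extract the information from each game. I would make a class to hold all of this information. You can then use a database or JSON to store it.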
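For the storage step, here is a minimal sketch of the JSON option. It assumes each parsed hand ends up as a plain dict; the field names are borrowed from the output the question asks for and are not part of the code in this answer.

```python
import json

# Hypothetical parsed hand, using the values from the sample in the question.
hands = [
    {
        "HandNumber": 1,
        "BigBlind": 500,
        "Ante": 50,
        "Players": 8,
        "StackSize": 17595,
        "Hand": ["10c", "7h"],
        "Position": 6,
        "Profit": -50,
    },
]

# Serialize to a JSON string; json.dump(hands, file_obj) would write to a file instead.
serialized = json.dumps(hands, indent=2)

# Reading the data back gives ordinary lists and dicts again.
restored = json.loads(serialized)
total_profit = sum(hand["Profit"] for hand in restored)
```

Plain dicts keep this simple. If the hands live in class instances, they would need converting to dicts before being passed to `json.dumps`.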

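import datetime
import re
from io import StringIO


def split_file(file_handle):
    # The summary and game_end groups are lazy so that, with DOTALL, each match
    # stops at the first "Game ended at:" line instead of running to the end
    # of a file that contains many hands.
    pat_str = '''\
^Game started at: (?P<game_start>.*?)
(?P<game>.*?)
^------ Summary ------
(?P<summary>.*?)
^Game ended at: (?P<game_end>.*?)$\
'''
    pat = re.compile(pat_str, flags=re.MULTILINE|re.DOTALL)
    text = file_handle.read()
    for game in pat.finditer(text):
        yield game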

class Pokergame:
    def __init__(self, game_info, playername = 'Hero'):
        self.game_start = datetime.datetime.strptime(game_info['game_start'], "%Y/%m/%d %H:%M:%S")
        self.game_end = datetime.datetime.strptime(game_info['game_end'], "%Y/%m/%d %H:%M:%S")
        self.game_info = _parse_game(game_info['game'], playername)
        self.summary = _parse_summary(game_info['summary'], playername)

def _parse_game(game_str, playername):
    # Raw f-strings keep regex escapes such as \d and \( intact.
    name = re.escape(playername)
    pattern_seat = rf'Seat (\d+): {name} \((\d+)\)\.'
    seat_match = re.search(pattern=pattern_seat, string=game_str)
    # Fall back to None when the player is not seated in this hand.
    seat, stack = seat_match.groups() if seat_match else (None, None)
    pattern_cards = rf'Player {name} received card: \[(?P<card>\w+)\]'
    cards = tuple(i['card'] for i in re.finditer(pattern_cards, game_str))

    result = {
        'seat': seat,
        'stack': stack,
        'cards': cards,
        'text': game_str,
    }
    return result

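def _parse_summary(summary_str, playername):
    # Placeholder: returns the raw summary text until real parsing is added.
    return summary_str


# hand_text is assumed to hold the hand history text shown in the question.
games = []
with StringIO(hand_text) as file_handle:
    for game_info in split_file(file_handle):
        games.append(Pokergame(game_info))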

I used StringIO to simulate open(file). You will have to flesh out __init__ and the _parse_... functions, but this should get you on the right track.

If you have multiple files, you can use itertools.chain to combine the games from all of them.
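A rough sketch of that idea follows. The `iter_hands` helper and the two in-memory "files" are stand-ins invented for this example (the file contents are made up); with real history files, each `StringIO` would be replaced by an opened file.

```python
import itertools
import re
from io import StringIO

# Simplified splitter: one match per hand, from "Game started" to the end
# of the first following "Game ended" line.
HAND_PATTERN = re.compile(
    r"^Game started at: .*?^Game ended at: [^\n]*",
    flags=re.MULTILINE | re.DOTALL,
)


def iter_hands(file_handle):
    """Yield the raw text of each hand found in one file."""
    for match in HAND_PATTERN.finditer(file_handle.read()):
        yield match.group(0)


# Two in-memory files stand in for real hand history files.
file_a = StringIO(
    "Game started at: 2018/1/9 10:14:10\nGame ID: 1\nGame ended at: 2018/1/9 10:14:52\n"
    "Game started at: 2018/1/9 10:15:00\nGame ID: 2\nGame ended at: 2018/1/9 10:15:40\n"
)
file_b = StringIO(
    "Game started at: 2018/1/10 09:00:00\nGame ID: 3\nGame ended at: 2018/1/10 09:01:00\n"
)

# chain.from_iterable flattens the per-file generators into one stream of hands.
all_hands = list(
    itertools.chain.from_iterable(iter_hands(f) for f in (file_a, file_b))
)
```

Because the generators are consumed lazily, each file is only read when the chain reaches it.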

games[0].game_info

{'cards': ('10c', '7h'),
 'seat': '3',
 'stack': '17595',
 'text': "Game ID: 1094127759 250/500 $5,000 GTD, ...\nPlayer Player4 mucks cards"}
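The remaining fields from the question (blinds, ante, number of players, position) could be pulled from the same game text in a similar way. The sketch below is only an illustration: `parse_table_info` and its dict keys are invented names, it assumes player names contain no spaces, and it assumes the action moves in ascending seat order, so positions are counted backwards from the big blind (big blind = 0, small blind = 1, button = 2, and so on).

```python
import re


def parse_table_info(game_str, playername="Hero"):
    """Extract blinds, ante, player count and the player's position."""
    flags = re.MULTILINE
    # Occupied seats in seat order. "Seat 7 is the button" has no colon
    # after the number, so it is not picked up here.
    seats = sorted(
        (int(num), name)
        for num, name in re.findall(r"^Seat (\d+): (\S+) \(\d+\)", game_str, flags)
    )
    names = [name for _, name in seats]

    small = re.search(r"^Player \S+ has small blind \((\d+)\)", game_str, flags)
    big = re.search(r"^Player (\S+) has big blind \((\d+)\)", game_str, flags)
    ante = re.search(
        rf"^Player {re.escape(playername)} ante \((\d+)\)", game_str, flags
    )

    # Count how many occupied seats the player sits before the big blind.
    position = None
    if big and big.group(1) in names and playername in names:
        position = (names.index(big.group(1)) - names.index(playername)) % len(names)

    return {
        "small_blind": int(small.group(1)) if small else None,
        "big_blind": int(big.group(2)) if big else None,
        "ante": int(ante.group(1)) if ante else None,
        "players": len(names),
        "position": position,
    }


# Lines taken from the sample hand in the question.
sample = """\
Seat 7 is the button
Seat 1: Player1 (9650).
Seat 2: Player2 (19433).
Seat 3: Hero (17595).
Seat 4: Player4 (8900).
Seat 5: Player5 (12670).
Seat 6: Player6 (11187).
Seat 7: Player7 (11300).
Seat 8: Player8 (17603).
Player Hero ante (50)
Player Player8 has small blind (250)
Player Player1 has big blind (500)
"""

info = parse_table_info(sample)
```

For the sample hand this gives position 6 for Hero, which matches the value in the question's expected output.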
1 vote

Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/48182881
