Learning Python by Using It -- Implementing a BT Client

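A BitTorrent (.torrent) file is bencoded. Bencode defines four data types:

A dict starts with 'd' and ends with 'e'.

A list starts with 'l' (lowercase L) and ends with 'e'.

An integer starts with 'i' and ends with 'e'; negative values are allowed.

A string starts with a decimal number giving its length, followed by ':' and then the string content.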
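
To make the four types concrete, here is a minimal encoder sketch. It is written in Python 3, whereas the listings in this post are Python 2, and the function name `bencode` is my own choice for illustration, not part of the code below. It assumes dict keys are strings and that text is encoded as UTF-8.

```python
def bencode(value):
    """Encode an int, bytes/str, list, or dict as bencode (illustrative sketch)."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        # Integer: 'i' + decimal digits + 'e'
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # String: length + ':' + content
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # BEP 3 requires dict keys to be strings, sorted as raw bytes.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


print(bencode(-3))                          # b'i-3e'
print(bencode("spam"))                      # b'4:spam'
print(bencode(["spam", 42]))                # b'l4:spami42ee'
print(bencode({"foo": 42, "bar": "spam"}))  # b'd3:bar4:spam3:fooi42ee'
```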

By default, all text attributes are UTF-8 encoded, but most BitTorrent files also carry codepage and encoding attributes that specify the text encoding.
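
The lookup order described above (encoding first, then codepage, then UTF-8) can be sketched as follows. This is a Python 3 illustration under my own assumptions: the helper name `decode_text` and its parameters are hypothetical, and falling back to UTF-8 with replacement characters is one possible choice, not something the specification prescribes.

```python
def decode_text(raw, encoding=None, codepage=None):
    """Decode raw bytes using 'encoding', else 'cp<codepage>', else UTF-8."""
    if encoding:
        if isinstance(encoding, bytes):
            # A bencoded 'encoding' value arrives as a byte string.
            encoding = encoding.decode("ascii", errors="replace")
        codec = encoding
    elif codepage:
        # A Windows code page number, e.g. 936, maps to the codec 'cp936'.
        codec = "cp%s" % codepage
    else:
        codec = "utf-8"
    try:
        return raw.decode(codec)
    except (UnicodeDecodeError, LookupError):
        # Wrong or unknown codec: keep whatever characters survive.
        return raw.decode("utf-8", errors="replace")


print(decode_text("中文".encode("gbk"), codepage=936))
print(decode_text(b"plain ascii"))
```

Catching `LookupError` matters because a torrent file can name a codec that Python does not know.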

The BitTorrent specification is at: http://www.bittorrent.org/beps/bep_0003.html

Below is my own Python implementation. I am new to Python, so the code still reads like C/C++; I will improve it over time.

torrent_file.py

from datetime import datetime

import bcodec

_READ_MAX_LEN = -1  # -1 makes file.read() return the whole file

class BTFormatError(Exception):
    """Raised when the file is not a valid bencoded metainfo dict."""
    pass
    
class TorrentFile(object):
    
    __metainfo = {}
    __file_name = ''
    
    def read_file(self, filename):
        
        # The context manager closes the file even if read() raises.
        with open(filename, 'rb') as torrent_file:
            data = torrent_file.read(_READ_MAX_LEN)
        
        # bdcode() copies the data into a list itself.
        metainfo = bcodec.bdcode(data)
        if metainfo and isinstance(metainfo, dict):
            self.__file_name = filename
            self.__metainfo = metainfo
        else:
            raise BTFormatError()
           
    def __is_singlefile(self):
        # In a single-file torrent, 'length' lives inside the 'info' dict.
        return self.__get_meta_info('length') is not None
    
    def __decode_text(self, text):
        encoding = 'utf-8'
        resultstr = ''
        if self.get_encoding():
            encoding = self.get_encoding()
        elif self.get_codepage():
            encoding = 'cp' + str(self.get_codepage())
        if text:
            try:
                resultstr = text.decode(encoding=encoding)
            except (ValueError, LookupError):
                # Undecodable bytes or an unknown codec name: fall back to the raw text.
                return text
        else:
            return None
        return resultstr
    
    def __get_meta_top(self, key):
        if key in self.__metainfo.keys():
            return self.__metainfo[key]
        else:
            return None
    def __get_meta_info(self,key):
        meta_info = self.__get_meta_top('info')
        if meta_info and key in meta_info.keys():
                return meta_info[key]
        return None
    
    def get_codepage(self):
        return self.__get_meta_top('codepage')
    def get_encoding(self):
        return self.__get_meta_top('encoding')
    
    def get_announces(self):
        # Returns a list of tiers; each tier is a list of tracker URLs.
        announces = []
        ann = self.__get_meta_top('announce')
        if ann:
            announces.append([ann])
        ann_list = self.__get_meta_top('announce-list')
        if ann_list:
            announces.extend(ann_list)
        return announces
    
    def get_publisher(self):
        return self.__decode_text(self.__get_meta_top('publisher'))
    def get_publisher_url(self):
        return self.__decode_text(self.__get_meta_top('publisher-url'))
    
    def get_creater(self):
        return self.__decode_text(self.__get_meta_top('created by'))
    def get_creation_date(self):
        utc_date = self.__get_meta_top('creation date')
        if utc_date is None:
            return utc_date
        creationdate = datetime.utcfromtimestamp(utc_date)
        return creationdate
    def get_comment(self):
        return self.__get_meta_top('comment')
          
    def get_nodes(self):
        return self.__get_meta_top('nodes')
    
    def get_piece_length(self):
        return self.__get_meta_info('piece length')
    
    def get_files(self):
        
        files = []
        pieces = self.__get_meta_info('pieces')
        name = self.__decode_text(self.__get_meta_info('name'))
        piece_length = self.get_piece_length()
        
        if not pieces or not name:
            return files
        
        if self.__is_singlefile():
            file_name = name
            file_length = self.__get_meta_info('length')
            if not file_length:
                return files
            
            # Round up: a trailing partial piece still has its own hash.
            pieces_num = (file_length + piece_length - 1) // piece_length
            if 20*pieces_num > len(pieces):
                return  files
                           
            # 'pieces' is a concatenation of 20-byte SHA-1 hashes.
            file_pieces = []
            for pn in range(pieces_num):
                file_pieces.append(pieces[pn*20:pn*20+20])
            
            files.append({'name':[file_name], 'length':file_length, 'pieces':file_pieces})
            return files
        

        folder = name
        meta_files = self.__get_meta_info('files')
        if not meta_files:
            return files
        
        total_length = 0
        for one_file in self.__get_meta_info('files'):
            
            file_info = {}
            path_list = []
            path_list.append(folder)
                        
            if 'path' not in one_file.keys():
                break
            for path in one_file['path']:
                path_list.append(self.__decode_text(path))
            file_info['name'] = path_list
            
            if 'length' not in one_file.keys():
                break
            
            file_info['length'] =  one_file['length']
            
            # Pieces are hashed over the concatenation of all files, so a
            # file covers every piece from the one holding its first byte
            # to the one holding its last byte.
            piece_index = total_length // piece_length
            total_length += one_file['length']
            if one_file['length'] > 0:
                last_index = (total_length - 1) // piece_length
                pieces_num = last_index - piece_index + 1
            else:
                pieces_num = 0
            
            if (piece_index+pieces_num)*20 > len(pieces):
                break
            
            file_info['pieces'] = []
            
            for pn in range(pieces_num):
                start = (piece_index + pn) * 20
                file_info['pieces'].append(pieces[start:start+20])

            files.append(file_info)
            
        return files
    
if __name__ == '__main__':
    #filename = r".\huapi2.torrent"
    #filename = r".\mh5t3tJ0EC.torrent"
    filename = r".\huapi2.1.torrent"   
    torrent = TorrentFile()

    print "begin to read file"
    try:
        torrent.read_file(filename)
    except (IOError, BTFormatError) as reason:
        print "Read bittorrent file error! Error:%s" % reason
     
    print "end to read file"

    print "announces: " , torrent.get_announces() 
    print "piece length:", torrent.get_piece_length()
    print "code page:" , torrent.get_codepage()
    print "encoding:" , torrent.get_encoding()
    print "publisher:" ,torrent.get_publisher()
    print "publisher url:", torrent.get_publisher_url()
    print "creator:" , torrent.get_creater()
    print "creation date:", torrent.get_creation_date()
    print "comment:", torrent.get_comment()
    print "nodes:", torrent.get_nodes()
    for one_file in torrent.get_files():
        print 'file name:', '\\'.join(one_file['name'])
        print 'file length:', one_file['length']
        print 'pieces:', list(one_file['pieces'])
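
The piece arithmetic in get_files() is easy to get wrong, so here is a small standalone sketch of it. It is Python 3 and the helper name `piece_ranges` is hypothetical. It assumes, as BEP 3 describes, that pieces are cut from the concatenation of all files in order, so neighbouring files can share a piece.

```python
def piece_ranges(file_lengths, piece_length):
    """Return (first_piece_index, piece_count) for each file, in order."""
    ranges = []
    offset = 0  # byte offset of the current file within the concatenated data
    for length in file_lengths:
        first = offset // piece_length
        if length > 0:
            # Index of the piece that holds the file's last byte.
            last = (offset + length - 1) // piece_length
            count = last - first + 1
        else:
            count = 0  # an empty file touches no piece
        ranges.append((first, count))
        offset += length
    return ranges


# Three files of 10, 5 and 20 bytes with 8-byte pieces (35 bytes, 5 pieces).
print(piece_ranges([10, 5, 20], piece_length=8))  # [(0, 2), (1, 1), (1, 4)]
```

Piece 1 appears in all three ranges because it contains bytes from each file.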

bcodec.py

'''
Created on 2012-9-30

@author: ddt
'''
def bdcode(data):
    # Work on a list copy so the readers can consume it from the front.
    data = list(data)
    return _read_chunk(data)
    
def _read_chunk(data):
    # Dispatch on the leading character; returns None on malformed input.
    chunk = None
    
    if len(data) == 0:
        return chunk
    
    leading_chr = data[0]
                     
    if leading_chr.isdigit():
        chunk = _read_string(data)
    elif leading_chr == 'd':
        chunk = _read_dict(data)
    elif leading_chr == 'i':
        chunk = _read_integer(data)
    elif leading_chr == 'l':
        chunk = _read_list(data)

    return chunk
                           
def _read_dict(data):
    
    if  len(data) == 0 or data.pop(0) != 'd': 
        return None
    
    chunk = {} 
    while len(data) > 0 and data[0] != 'e':
        
        key = _read_chunk(data)
        value = _read_chunk(data)
        
        # Compare with None: 0, '' and empty containers are valid values.
        if isinstance(key, str) and value is not None:
            chunk[key] = value
        else:
            return None
        
    if len(data) == 0 or data.pop(0) != 'e':
        return None
    
    return chunk

def _read_list(data):

    if  len(data) == 0 or data.pop(0) != 'l': 
        return None
    
    chunk = []
    while len(data) > 0 and data[0] != 'e':
        value = _read_chunk(data)
        # None signals a parse error; 0 and '' are legitimate items.
        if value is not None:
            chunk.append(value)
        else:
            return None
        
    if len(data) == 0 or data.pop(0) != 'e': 
        return None

    return chunk

def _read_string(data):
    
    str_len = ''
    while len(data) > 0 and data[0].isdigit():
        str_len +=  data.pop(0)
    
    if len(data) == 0 or data.pop(0) != ':':
        return None
    
    str_len = int(str_len)
    if str_len > len(data):
        return None
    
    value = data[0:str_len]
    del data[0:str_len]
    return ''.join(value)

def _read_integer(data):
   
    integer = ''
    if len(data) < len('i2e') or data.pop(0) != 'i': 
        return None
    
    sign = data.pop(0)
    if sign != '-' and not sign.isdigit():
        return None
    integer += sign
    
    while len(data) > 0 and data[0].isdigit():
        integer += data.pop(0)
    
    if len(data) == 0 or data.pop(0) != 'e':
        return None

    try:
        return int(integer)
    except ValueError:
        # A lone '-' with no digits, as in 'i-e'.
        return None
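
The decoder above consumes its input with `data.pop(0)`, which shifts the whole list on every call and gets slow on large torrent files. As an alternative sketch (not the code used in this post), the same logic can walk a bytes object with a cursor index. This version is Python 3, the names `bdecode` and `_decode` are my own, and it raises ValueError on malformed input instead of returning None.

```python
def bdecode(data):
    """Decode one complete bencoded bytes object; ValueError if malformed."""
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data, pos):
    """Decode the value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]  # a slice, so it is b"" at end of input
    if lead == b"i":
        end = data.index(b"e", pos)  # raises ValueError if 'e' is missing
        return int(data[pos + 1:end]), end + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("string runs past end of data")
        return data[start:start + length], start + length
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dict keys must be strings")
            value, pos = _decode(data, pos)
            result[key] = value
        return result, pos + 1
    raise ValueError("unexpected byte at offset %d" % pos)


print(bdecode(b"d3:bar4:spam3:fooi42ee"))  # {b'bar': b'spam', b'foo': 42}
```

Strings come back as bytes here, so text fields still need decoding with the torrent's declared encoding.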
