BitTorrent协议(二)之获取下载源

如何获取下载源

作为一个downloader开始下载资源首先要向tracker宣布自己的存在,同时获得其他downloader的地址。 这里的downloader我们统一称为对等节点(peer),因此接下来要做的事情就是与tracker通信,获取peers。

获取下载源的协议

有两种方法,

  • 方法一:基于HTTP的GET请求

官网文档 BEP3 中的trackers章节告诉我们可以使用HTTP /GET 来得到peer的信息。也就是发送一个GET请求。

  • 方法二:基于UDP的tracker协议

再对照如下tracker列表就可以明白了。

1
2
3
4
5
http://tracker.trackerfix.com:80/announce  # 方法一

udp://9.rarbg.me:2710/announce # 方法二

udp://9.rarbg.to:2710/announce # 方法二

方法一看起来是更简单的方法,但是在GFW保护下,这些采用方法一的服务端口都无法访问。因此只能尝试方法二。

基于UDP的Tracker协议

UDP协议本来也是一种更好的选择,因为传输的字节更少。官网文档UDP Tracker Protocol for BitTorrent 详细说明了原因。

Using HTTP introduces significant overhead. There’s overhead at the ethernet layer (14 bytes per packet), at the IP layer (20 bytes per packet), at the TCP layer (20 bytes per packet) and at the HTTP layer. About 10 packets are used for a request plus response containing 50 peers and the total number of bytes used is about 1206 [1]. This overhead can be reduced significantly by using a UDP based protocol. The protocol proposed here uses 4 packets and about 618 bytes, reducing traffic by 50%. For a client, saving 1 kbyte every hour isn’t significant, but for a tracker serving a million peers, reducing traffic by 50% matters a lot. An additional advantage is that a UDP based binary protocol doesn’t require a complex parser and no connection handling, reducing the complexity of tracker code and increasing it’s performance.

简单来说就是方法一基于TCP的,每个基于TCP的请求的头部自然要比基于UDP的请求大。

UDP协议包含三个过程,

  • CONNECT
  • ANNOUNCE
  • SCRAPE

我们关心的是发布(ANNOUNCE),接下来重点看如何实现。

实现发布

两个步骤,

  • CONNECT,获取connection_id
  • ANNOUNCE,拿着connection_id和info_hash等参数再请求。

参考了Bittorrent udp-tracker protocol extension 因为对入参和返回参数说明的更详细一些。用的tracker是一个热门的磁力链接。

ip = 'exodus.desync.com'  # get from tracker list
port = 6969  # get from tracker list

而info_hash是用uTorrent软件打开该磁力链接后获得的。

1aa4c13830b822c1375686d685a9fce23405f6ba

以下是代码实现。uTorrent软件打开磁力链接会拿到tracker列表。

announce的重要参数说明

  1. num_want : -1 # 希望获得的peer个数,则是默认的peer个数
  2. ip : 0 # 当前发送udp包的地址
  3. port : 监听的公网端口

一旦发布完成,该监听的端口会收到DHT协议的请求,例如FIND_NODE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
# -*- coding: utf-8 -*-

__author__ = 'ym'
"""
Date : '2019/1/8'
Description : udp tracker protocol

"""
import socket
import struct


def ip_me():
import psutil
import socket

def get_ip_addresses(family):
for interface, snics in psutil.net_if_addrs().items():
for snic in snics:
if snic.family == family:
yield (interface, snic.address)

addrs = list(get_ip_addresses(socket.AF_INET))
for tup in addrs:
if tup[0] == 'en0':
return tup[1]


SOURCE_IP = ip_me()
SOURCE_PORT = 59876


def to_byte(integer, length=4):
return integer.to_bytes(length, byteorder='big')


def udp_send(ip, port, data):
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) # UDP
sock.bind((SOURCE_IP, SOURCE_PORT))
sock.sendto(data, (ip, port))


def udp_recv(ip, port):
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) # UDP
sock.bind((ip, port))
while True:
data, addr = sock.recvfrom(2048)
print("received message:", data)
return data


class UdpAnnouceRequest(object):
def __init__(self):
self.connection_id = None
self.action = None
self.transaction_id = None
self.info_hash = None
self.peer_id = None
self.downloaded = None
self.left = None
self.uploaded = None
self.event = 0
self.ip = None
self.key = None
self.num_want = -1
self.port = None
self.extensions = None

def get_bytes(self):
result = to_byte(self.connection_id, 8)
result += to_byte(self.action)
result += to_byte(self.transaction_id)
result += self.info_hash
result += self.peer_id
result += to_byte(self.downloaded, 8)
result += to_byte(self.left, 8)
result += to_byte(self.uploaded, 8)
result += to_byte(self.event)
result += to_byte(self.ip)
result += to_byte(self.key)
result += struct.pack(">i", self.num_want)
result += to_byte(self.port, 2)
return result


def udp_announcing(request: UdpAnnouceRequest, dst_ip, dst_port):
data = request.get_bytes()
udp_send(dst_ip, dst_port, data)


def connect(tracker_ip, tracker_port):
protocol_id = 0x41727101980 # magic number
action = 0 # connect
transaction_id = 123 # random

data = struct.pack(">Qii", protocol_id, action, transaction_id)
udp_send(tracker_ip, tracker_port, data)

reply = udp_recv(SOURCE_IP, SOURCE_PORT)
action, transaction_id, connection_id = struct.unpack(">iiQ", reply)
fmt = "connect reply:\n\taction:{}, transaction_id:{}, connection_id:{}"
print(fmt.format(action, transaction_id, connection_id))
return connection_id


def announce(connection_id, dst_ip, dst_port):
request = UdpAnnouceRequest()
request.connection_id = connection_id
request.action = 1
request.transaction_id = 1234 # random
request.info_hash = bytearray.fromhex('1aa4c13830b822c1375686d685a9fce23405f6ba') # compute from meta_info
request.peer_id = bytearray.fromhex('ab' * 20) # random
request.downloaded = 0
request.left = 1509949440 # from meta_info
request.uploaded = 0
request.ip = 0 # default
request.key = 314
request.port = 59601 #
# print(request)
udp_announcing(request, dst_ip, dst_port)
return udp_recv(SOURCE_IP, SOURCE_PORT)


def udp_process():
ip = 'exodus.desync.com' # get from tracker list
port = 6969 # get from tracker list
connection_id = connect(ip, port)
reply = announce(connection_id, ip, port)
action, transaction_id, interval, leechers, seeders = struct.unpack(">iiiii", reply[0:20])
fmt = "announce_reply:\n\taction:{}, transaction_id:{}, interval:{}s, leechers:{}, seeders:{}"
print(fmt.format(action, transaction_id, interval, leechers, seeders))
peers = []
for i in range(20, len(reply), 6):
if i + 6 <= len(reply):
data = reply[i:i + 6]
n1, n2, n3, n4, peer_port = struct.unpack(">BBBBH", data)
peer_ip = "{}.{}.{}.{}".format(n1, n2, n3, n4)
peers.append((peer_ip, peer_port))
print("get {} peers:{}".format(len(peers), peers))


if __name__ == '__main__':
udp_process()

运行结果

运行日志如下,可以看到拿到了200个peers的ipv4地址和端口。

1
2
3
4
5
6
7
8
9
10
/usr/local/bin/python3.7 /Users/ym/charm/pytest/ym/bt/udp_announce.py
received message: b'\x00\x00\x00\x00\x00\x00\x00{\xd89\xa8\x9dv\xd3=]'
connect reply:
action:0, transaction_id:123, connection_id:15580669780121828701
received message: b'\x00\x00\x00\x01\x00\x00\x04\xd2\x00\x00\x07_\x00\x00\x00%\x00\x00\x01qi\xe8\xff\xd8XWl\x18-F\xbcd\x9e\xae}\xea\xaa[\xd41S\'\xcb\x0eb\x14\x8f\xcfbcg\xcd\xaf9\xe0\x1f-\xf7*\xa6\xa2\x86)P\xadl\xbc\xbf)\xb6{\xc4w\xaa)\xcd\xf0@cF--;M\xa6Wp\xce\xc7\xff\xfc\xcf\xbb\x9e\x18\x9c\xec\x96\xc4\xcf\x83PS\x06_\xb4zmc\xd3\x82i\xef\x0b\xeb\xd0\xaf ~d\xf3\x8c\xde\x9b?\xfc\xdb!1\x93W\x0b\xc3I\xad\xf4,)\xf0\x1c\xc5\xe0\x98\x80\xbbZ\xb4^UA#\'|\x95\x7fd\xc77p\xc8Z\xadN\xb5p\xc7K\xbc\xd31i\x9c\xe0\x7f0\xaeV\x02\xc2\xe3G.R\x92\xfc\x15\xc8\x05R\x18\x18I\x943H3~\r\xd3\x89=\x06\xe6\xbb86-y\xd1\xa6\xca\xcb)\xd7\r\x8aLo)\xcb\xda\x8e;K)Z<\x9c\xb0\x87);Q\xd0\xb2\x8b\x18\xa5\xc2T\xb8\xd7\x05\x90;\x83\xf1^\xca\x8eeB\xa1a\xb9\xe8\x15z\x83\xde\x93\n\xba\x91+\x9em\xb1\x9b\xd0V\x82Y\xf2\xb5\xad\xc7\rV\x06\xf8^k\x14R\x8eP\xb1\xfb\x15Rf\x1eV\xd0\x05R-\xd9\xf5\xc2"@\x86\xf0K\x00\x00@B\xdeFa*\x1f;\x02<\xa2\xa7\x18l\xed\xfa\xd2\xf4\x05k#\x81\xe8\xa5\x01\xban\x82\xcd\t\xd57\xb8\x8eQ\xdd\xd4\x1b\x13\xfalh\xd0\x83\xa5J\xa9\xb5\xce\xe7x:\xb8\xe1\xc9\xab\xc6\xf9\xa3:\xc5\xed\xc5\xe4\xd2c\xb6\xb9\x1cL\xc0\x19\x8b<\xbfn\x9a\xc2\x80\xcc\xf3\xc3?\xefn\x8dg\xa5=\x0cm\xb1^\xfcU\xf3[}\xdd\x90\x82yS\xe9o\xf3\xf0R@w\xc0\xf4\'\xae\xd0SR\xcc\xde@\xc5\xf8\xdb\x9e\x8a\x0f\xb3\x08_\x90\xb4\xc1\xa9=\xda$\x1a\xe1n\xe8V\r\xe5\xe1b\x8cr{1\xc3^\x05\xe8bR\x00Sn\xe5\x7fe\xe5G4V5I \x1bjQ\x9a\x80\x7f\xce\xe7y\xd5\xde]\xbc\x81D\xbd\xbf\xf1\xb8K\xdfZ\x8d\xe5\xa9=\xda#\x1a\xe1sE\xbb\xb8\xd4Oi\xa7\x10V\xe9\xffgG\x10\xf3\x8f\xde^\xea#S+%\\\xeeN_\xba\xbd3\x07g\xaf\xdb\xb3)\xa1:#\xe0G%\x98\xce\x84}\xe2\xd8ch\xeaN\xb2\xbe\xabp\xe3\xd2?\xb7\r\xcd\x9d\xf3\x8c\xa8\x01\x18I\xabH\x90\x84,A\x9a\x02}\xd1\\\xdd\x94bp\x8d:\xa7e:p\n\x17\x93\xb0\xb9h\x05\xa9)b\xdc\\\xed\x0f;]\x9dX\x10\x96\x98u\x92R\x18\x8a+\xc2LR\x0cH\xa7z\x06>\x96OI2\x9f.\xb6\xbax9\xee)\xbaS\x8b\x9b\x81)\x8b\xfa&m,)P\xe4\x95\xc8s%\x8e\x03+>\x1e\xc5\xf8\x92\xba\xb9\xca\xc4\xca\xa9\xba\xd3(\xc3Yxr\x07_\xbe:\x08\x08\xc8\x9c\xb2P\x14X\x7f\xcf\x9aIWU\xc6G\x80j\xa0\x9dRSx\xf4n\x07w\xe4m\xbe\xd4`\xc8\xd5g\xf4\xf2}\xe1\x01g5I%\xc1\xa1\\c@\x89\x99\x9bVSE\x95\xfb\xdeV\x12\xe0F\'gR\x17n\x90\x91\xf4O\xb7c\xa6\xd1\xfcOC9\xf0s\xee+\xf6\xc8u\xe1\xed+\xf1\x91DGD)Zo\x9e\xf3y\x18\xcb\x97\x9e=\'\xc4\xca\xa9\xba\xcbO\xa5\xe7(\x0e8\xc1\xa2\xf6k\x92\x14\x82q\x14b.\xe7oi\xe8X\x86\xe2\xbagF\xcc\xc2\xed\x83`+\xbdk\x99\xdfR\x07u\xa2\xbf\x8cBW\xcfc\x00\x00?\x8f\xecq\x00\x00)\xd2\n\xe3\xe9\xca\x1f-g\x8a\xe5y\xde\xb8\xe8\xad\xab\xc4\xd8\x97\xb8\x9f\xd0d\xd0g\xe9\xa2\xdb\xbd\xc4\xd1\xe9\xa4\x86\xc2\xc4\xcf\xb83\xe2\xaa\xc3\x93\xb5\xbe&\xfb\x9c\xde.\xdc\xd2\x14\x9aF9B\xa4{p\xcd\x9d\xf6\xe0\xcai\xb8\x99u\xf0+g{$\xb5\xb3LZ\xdbs\xab\x99~Qd\xff\xdf\xe8\xefP\'\x86\xc9:\x8aMD\xf3F\x88\xb8H\xe5%G\x84M1\x92\r~\xe9\x13)OB\xe1Vi\xdd\xcb\x16\xf53\xb3\xba\x96\x86\xaf\xba:\xb2\xa6!$\x7f\xdd|k\xc0+\x8a}{\xf5\x14\x8eAVy68&K\xb1tZ`\nY\xb8l\x04\xd7\xfb\x84\xb3h\xcfSb*\x93`\x17\xb5[\xebGV\x89\x15S\x91\xd1KA\xd7\xc4\xa3o?\xf5:\x9b\x83\x1d.\xd1,c\x05\x16\xb6\xba\xfd\xab97\xb2\x9d\xf9>g\xeb\x82i\xa4{G\xb7i\xe8V1\x91\xafg\xfc\xca\x96\xaf\x16g\x04\xf8\x8a\xda!_^\x9d\xaf\xd2-^\xaf\x94\x0c\xbc\x0f^L\x0f,\xc3\x00Y\xedbW\xf5\x84WFe\xc7\xaeAV\x19|\x9e\xb0\xcfH\xe6\xda\x88<\xdcFA\xe3\x15q\x08)\x8b\xf3:L\x1b)<\xe9\xf5\xaa[\x1f\xd7\x0e\x08Z\xb2\x01\t\x87,R\xd9\xcb\xd4\x9bSx"\xca\x8e[f`N\xc5\xed\x81\x96b\xdc\xc5\xe8\x1av#\'\xc5\xe8\x13n\xdbn\xc4\xf9g\xc4:\x15\xbbp|\xa6#\'\xb7W\xa9\x9b@5u\xc2\xdcQd\xef'
announce_reply:
action:1, transaction_id:1234, interval:1887s, leechers:37, seeders:369
get 200 peers:[('105.232.255.216', 22615), ('108.24.45.70', 48228), ('158.174.125.234', 43611), ('212.49.83.39', 51982), ('98.20.143.207', 25187), ('103.205.175.57', 57375), ('45.247.42.166', 41606), ('41.80.173.108', 48319), ('41.182.123.196', 30634), ('41.205.240.64', 25414), ('45.45.59.77', 42583), ('112.206.199.255', 64719), ('187.158.24.156', 60566), ('196.207.131.80', 21254), ('95.180.122.109', 25555), ('130.105.239.11', 60368), ('175.32.126.100', 62348), ('222.155.63.252', 56097), ('49.147.87.11', 49993), ('173.244.44.41', 61468), ('197.224.152.128', 47962), ('180.94.85.65', 8999), ('124.149.127.100', 50999), ('112.200.90.173', 20149), ('112.199.75.188', 54065), ('105.156.224.127', 12462), ('86.2.194.227', 18222), ('82.146.252.21', 51205), ('82.24.24.73', 37939), ('72.51.126.13', 54153), ('61.6.230.187', 14390), ('45.121.209.166', 51915), ('41.215.13.138', 19567), ('41.203.218.142', 15179), ('41.90.60.156', 45191), ('41.59.81.208', 45707), ('24.165.194.84', 47319), ('5.144.59.131', 61790), ('202.142.101.66', 41313), ('185.232.21.122', 33758), ('147.10.186.145', 11166), ('109.177.155.208', 22146), ('89.242.181.173', 50957), ('86.6.248.94', 27412), ('82.142.80.177', 64277), ('82.102.30.86', 53253), ('82.45.217.245', 49698), ('64.134.240.75', 0), ('64.66.222.70', 24874), ('31.59.2.60', 41639), ('24.108.237.250', 54004), ('5.107.35.129', 59557), ('1.186.110.130', 52489), ('213.55.184.142', 20957), ('212.27.19.250', 27752), ('208.131.165.74', 43445), ('206.231.120.58', 47329), ('201.171.198.249', 41786), ('197.237.197.228', 53859), ('182.185.28.76', 49177), ('139.60.191.110', 39618), ('128.204.243.195', 16367), ('110.141.103.165', 15628), ('109.177.94.252', 22003), ('91.125.221.144', 33401), ('83.233.111.243', 61522), ('64.119.192.244', 10158), ('208.83.82.204', 56896), ('197.248.219.158', 35343), ('179.8.95.144', 46273), ('169.61.218.36', 6881), ('110.232.86.13', 58849), ('98.140.114.123', 12739), ('94.5.232.98', 20992), ('83.110.229.127', 26085), ('71.52.86.53', 18720), ('27.106.81.154', 32895), ('206.231.121.213', 56925), ('188.129.68.189', 49137), ('184.75.223.90', 36325), ('169.61.218.35', 6881), ('115.69.187.184', 54351), ('105.167.16.86', 59903), ('103.71.16.243', 36830), ('94.234.35.83', 11045), ('92.238.78.95', 47805), ('51.7.103.175', 56243), ('41.161.58.35', 57415), ('37.152.206.132', 32226), ('216.99.104.234', 20146), ('190.171.112.227', 53823), ('183.13.205.157', 62348), ('168.1.24.73', 43848), ('144.132.44.65', 39426), ('125.209.92.221', 37986), ('112.141.58.167', 25914), ('112.10.23.147', 45241), ('104.5.169.41', 25308), ('92.237.15.59', 23965), ('88.16.150.152', 30098), ('82.24.138.43', 49740), ('82.12.72.167', 31238), ('62.150.79.73', 12959), ('46.182.186.120', 14830), ('41.186.83.139', 39809), ('41.139.250.38', 27948), ('41.80.228.149', 51315), ('37.142.3.43', 15902), ('197.248.146.186', 47562), ('196.202.169.186', 54056), ('195.89.120.114', 1887), ('190.58.8.8', 51356), ('178.80.20.88', 32719), ('154.73.87.85', 50759), ('128.106.160.157', 21075), ('120.244.110.7', 30692), ('109.190.212.96', 51413), ('103.244.242.125', 57601), ('103.53.73.37', 49569), ('92.99.64.137', 39323), ('86.83.69.149', 64478), ('86.18.224.70', 10087), ('82.23.110.144', 37364), ('79.183.99.166', 53756), ('79.67.57.240', 29678), ('43.246.200.117', 57837), ('43.241.145.68', 18244), ('41.90.111.158', 62329), ('24.203.151.158', 15655), ('196.202.169.186', 52047), ('165.231.40.14', 14529), ('162.246.107.146', 5250), ('113.20.98.46', 59247), ('105.232.88.134', 58042), ('103.70.204.194', 60803), ('96.43.189.107', 39391), ('82.7.117.162', 49036), ('66.87.207.99', 0), ('63.143.236.113', 0), ('41.210.10.227', 59850), ('31.45.103.138', 58745), ('222.184.232.173', 43972), ('216.151.184.159', 53348), ('208.103.233.162', 56253), ('196.209.233.164', 34498), ('196.207.184.51', 58026), ('195.147.181.190', 9979), ('156.222.46.220', 53780), ('154.70.57.66', 42107), ('112.205.157.246', 57546), ('105.184.153.117', 61483), ('103.123.36.181', 45900), ('90.219.115.171', 39294), ('81.100.255.223', 59631), ('80.39.134.201', 14986), ('77.68.243.70', 35000), ('72.229.37.71', 33869), ('49.146.13.126', 59667), ('41.79.66.225', 22121), ('221.203.22.245', 13235), ('186.150.134.175', 47674), ('178.166.33.36', 32733), ('124.107.192.43', 35453), ('123.245.20.142', 16726), ('121.54.56.38', 19377), ('116.90.96.10', 22968), ('108.4.215.251', 33971), ('104.207.83.98', 10899), ('96.23.181.91', 60231), ('86.137.21.83', 37329), ('75.65.215.196', 41839), ('63.245.58.155', 33565), ('46.209.44.99', 1302), ('182.186.253.171', 14647), ('178.157.249.62', 26603), ('130.105.164.123', 18359), ('105.232.86.49', 37295), ('103.252.202.150', 44822), ('103.4.248.138', 55841), ('95.94.157.175', 53805), ('94.175.148.12', 48143), ('94.76.15.44', 49920), ('89.237.98.87', 62852), ('87.70.101.199', 44609), ('86.25.124.158', 45263), ('72.230.218.136', 15580), ('70.65.227.21', 28936), ('41.139.243.58', 19483), ('41.60.233.245', 43611), ('31.215.14.8', 23218), ('1.9.135.44', 21209), ('203.212.155.83', 30754), ('202.142.91.102', 24654), ('197.237.129.150', 25308), ('197.232.26.118', 8999), ('197.232.19.110', 56174), ('196.249.103.196', 14869), ('187.112.124.166', 8999), ('183.87.169.155', 16437), ('117.194.220.81', 25839)]

Process finished with exit code 0