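This document explains the steps needed to create auto-tinder: analyzing the Tinder API (the calls can be replayed in Postman, linked below), wrapping it in Python, collecting and labeling images, training a classifier, and letting the script play Tinder automatically.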
https://www.getpostman.com/
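Step 0: Motivation and disclaimer
Auto Tinder is a concept project created purely for fun and educational purposes. It must never be abused to harm anyone or to spam the platform. The auto-tinder scripts should not be used with your own Tinder profile, since they surely violate Tinder's terms of service.
This software was written mainly for two reasons:
Step 1: Analyze the Tinder API
The first step is to find out how the Tinder app communicates with Tinder's backend server. Since Tinder offers a web version of its portal, this is as easy as going to tinder.com, opening Chrome DevTools, and taking a quick look at the network traffic.
The request captured here goes to the following URL and is made when the tinder.com landing page loads. Clearly, Tinder has some kind of internal API that it uses to communicate between the frontend and the backend.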
https://api.gotinder.com/v2/recs/core
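Analyzing the content of /recs/core makes it clear that this API endpoint returns a list of user profiles of people nearby.
The data includes (among many other fields) the following: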
{
"meta": {
"status": 200
},
"data": {
"results": [
{
"type": "user",
"user": {
"_id": "4adfwe547s8df64df",
"bio": "19y.",
"birth_date": "1997-17-06T18:21:44.654Z",
"name": "Anna",
"photos": [
{
"id": "879sdfert-lskdföj-8asdf879-987sdflkj",
"crop_info": {
"user": {
"width_pct": 1,
"x_offset_pct": 0,
"height_pct": 0.8,
"y_offset_pct": 0.08975463
},
"algo": {
"width_pct": 0.45674357,
"x_offset_pct": 0.984341657,
"height_pct": 0.234165403,
"y_offset_pct": 0.78902343
},
"processed_by_bullseye": true,
"user_customized": false
},
"url": "https://images-ssl.gotinder.com/4adfwe547s8df64df/original_879sdfert-lskdföj-8asdf879-987sdflkj.jpeg",
"processedFiles": [
{
"url": "https://images-ssl.gotinder.com/4adfwe547s8df64df/640x800_879sdfert-lskdföj-8asdf879-987sdflkj.jpg",
"height": 800,
"width": 640
},
{
"url": "https://images-ssl.gotinder.com/4adfwe547s8df64df/320x400_879sdfert-lskdföj-8asdf879-987sdflkj.jpg",
"height": 400,
"width": 320
},
{
"url": "https://images-ssl.gotinder.com/4adfwe547s8df64df/172x216_879sdfert-lskdföj-8asdf879-987sdflkj.jpg",
"height": 216,
"width": 172
},
{
"url": "https://images-ssl.gotinder.com/4adfwe547s8df64df/84x106_879sdfert-lskdföj-8asdf879-987sdflkj.jpg",
"height": 106,
"width": 84
}
],
"last_update_time": "2019-10-03T16:18:30.532Z",
"fileName": "879sdfert-lskdföj-8asdf879-987sdflkj.webp",
"extension": "jpg,webp",
"webp_qf": [
75
]
}
],
"gender": 1,
"jobs": [],
"schools": [],
"show_gender_on_profile": false
},
"facebook": {
"common_connections": [],
"connection_count": 0,
"common_interests": []
},
"spotify": {
"spotify_connected": false
},
"distance_mi": 1,
"content_hash": "slkadjfiuwejsdfuzkejhrsdbfskdzufiuerwer",
"s_number": 9876540657341,
"teaser": {
"string": ""
},
"teasers": [],
"snap": {
"snaps": []
}
}
]
}
}
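A few things are very interesting here (note that all data was changed so as not to violate this person's privacy):
By analyzing the request headers, we quickly find the private API key: X-Auth-Token.
By copying this token and going over to Postman, we can verify that we can indeed communicate freely with the Tinder API using nothing but the right URL and the auth token.
A bit of clicking through Tinder's web app quickly reveals all relevant API endpoints; the ones used below are /v2/recs/core, /v2/profile, /v2/matches, /like/{user_id}, and /pass/{user_id}.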
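As a minimal sketch of the verification step described above, the same request can be assembled with Python's standard library. The helper name build_request and the placeholder token are assumptions for illustration; only the base URL, the /v2/recs/core path, and the X-Auth-Token header come from the text. The request is built but deliberately not sent.

```python
from urllib.request import Request

TINDER_URL = "https://api.gotinder.com"


def build_request(path, token):
    """Hypothetical helper: build (but do not send) a GET request with the auth token."""
    return Request(TINDER_URL + path, headers={"X-Auth-Token": token}, method="GET")


req = build_request("/v2/recs/core", "YOUR-API-TOKEN")
print(req.get_method(), req.full_url)
# urllib stores header names via str.capitalize(), hence the "X-auth-token" spelling
print(req.get_header("X-auth-token"))
```

Passing the object to urllib.request.urlopen would actually send it; that step is left out so the sketch runs offline.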
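Step 2: Build an API wrapper in Python
So let's get into the code. For convenience, we use the Python Requests library to communicate with the API and write an API wrapper class around it.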
https://requests.kennethreitz.org/en/master/
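Similarly, we write a small Person class that takes the API response from Tinder representing one person and offers a few basic interfaces to the Tinder API.
Let's start with the Person class. It receives the API data and a tinder-api object, and saves all relevant data into instance variables. It also offers some basic features such as "like" and "dislike", which make a request to the tinder-api. This lets us conveniently call "some_person.like()" to like a profile we find interesting.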
import datetime
from geopy.geocoders import Nominatim

TINDER_URL = "https://api.gotinder.com"
geolocator = Nominatim(user_agent="auto-tinder")
PROF_FILE = "./images/unclassified/profiles.txt"


class Person(object):

    def __init__(self, data, api):
        self._api = api

        # Copy the relevant fields of the API response into instance variables
        self.id = data["_id"]
        self.name = data.get("name", "Unknown")
        self.bio = data.get("bio", "")
        # The API reports the distance in miles; convert it to kilometers
        self.distance = data.get("distance_mi", 0) * 1.60934

        self.birth_date = datetime.datetime.strptime(data["birth_date"], '%Y-%m-%dT%H:%M:%S.%fZ') if data.get(
            "birth_date", False) else None
        self.gender = ["Male", "Female", "Unknown"][data.get("gender", 2)]

        self.images = list(map(lambda photo: photo["url"], data.get("photos", [])))

        self.jobs = list(
            map(lambda job: {"title": job.get("title", {}).get("name"), "company": job.get("company", {}).get("name")}, data.get("jobs", [])))
        self.schools = list(map(lambda school: school["name"], data.get("schools", [])))

        # Resolve the coordinates to an address if the response contains a position
        self.location = None
        if data.get("pos", False):
            self.location = geolocator.reverse(f'{data["pos"]["lat"]}, {data["pos"]["lon"]}')

    def __repr__(self):
        # birth_date may be missing, so guard before formatting it
        birth = self.birth_date.strftime('%d.%m.%Y') if self.birth_date else "unknown"
        return f"{self.id} - {self.name} ({birth})"

    def like(self):
        return self._api.like(self.id)

    def dislike(self):
        return self._api.dislike(self.id)
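To make the field handling in the Person constructor easier to follow, here is a small offline sketch of the same conversions: parsing the ISO timestamp, mapping the numeric gender to a label, and converting miles to kilometers. The helper summarize_user and the sample dictionary are hypothetical and made up for illustration. Note that the sample uses a valid date, since the anonymized birth_date in the response shown earlier has month 17 and would not parse with this format.

```python
import datetime

GENDERS = ["Male", "Female", "Unknown"]
MILES_TO_KM = 1.60934


def summarize_user(user, distance_mi=0):
    """Hypothetical helper: reduce one user dict to a few readable fields."""
    birth = user.get("birth_date")
    return {
        "id": user["_id"],
        "name": user.get("name", "Unknown"),
        "gender": GENDERS[user.get("gender", 2)],
        "distance_km": distance_mi * MILES_TO_KM,
        "birth_date": datetime.datetime.strptime(birth, "%Y-%m-%dT%H:%M:%S.%fZ") if birth else None,
        "images": [photo["url"] for photo in user.get("photos", [])],
    }


# Made-up data, shaped like one "user" entry of the /v2/recs/core response
sample_user = {
    "_id": "4adfwe547s8df64df",
    "name": "Anna",
    "birth_date": "1997-06-17T18:21:44.654Z",
    "gender": 1,
    "photos": [{"url": "https://example.com/photo_0.jpeg"}],
}
summary = summarize_user(sample_user, distance_mi=1)
print(summary["name"], summary["gender"], summary["birth_date"].year)
```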
The API wrapper is nothing more than a fancy way of calling the Tinder API through a class:
import requests

TINDER_URL = "https://api.gotinder.com"


class tinderAPI():

    def __init__(self, token):
        # The X-Auth-Token is sent as a header with every request
        self._token = token

    def profile(self):
        # NOTE: the Profile class (the logged-in user's own profile) is not shown in this article
        data = requests.get(TINDER_URL + "/v2/profile?include=account%2Cuser", headers={"X-Auth-Token": self._token}).json()
        return Profile(data["data"], self)

    def matches(self, limit=10):
        data = requests.get(TINDER_URL + f"/v2/matches?count={limit}", headers={"X-Auth-Token": self._token}).json()
        return list(map(lambda match: Person(match["person"], self), data["data"]["matches"]))

    def like(self, user_id):
        data = requests.get(TINDER_URL + f"/like/{user_id}", headers={"X-Auth-Token": self._token}).json()
        return {
            "is_match": data["match"],
            "liked_remaining": data["likes_remaining"]
        }

    def dislike(self, user_id):
        requests.get(TINDER_URL + f"/pass/{user_id}", headers={"X-Auth-Token": self._token}).json()
        return True

    def nearby_persons(self):
        # Each entry in "results" wraps the actual profile in its "user" field
        data = requests.get(TINDER_URL + "/v2/recs/core", headers={"X-Auth-Token": self._token}).json()
        return list(map(lambda user: Person(user["user"], self), data["data"]["results"]))
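Now we can use the API to find people nearby, look at their profiles, or even like all of them. Replace YOUR-API-TOKEN with the X-Auth-Token you found in the Chrome developer console earlier.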
if __name__ == "__main__":
    token = "YOUR-API-TOKEN"
    api = tinderAPI(token)

    while True:
        persons = api.nearby_persons()
        for person in persons:
            print(person)
            # person.like()
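Step 3: Download images of people nearby
Next, we want to automatically download some images of people nearby that we can use to train our AI. By "some", I mean around 1500-2500 images.
First, let's extend the Person class with a function that allows us to download images.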
# At the top of auto_tinder.py
import os
from random import random
from time import sleep

PROF_FILE = "./images/unclassified/profiles.txt"

    # inside the Person-class
    def download_images(self, folder=".", sleep_max_for=0):
        # Skip profiles whose ID is already recorded in PROF_FILE
        if os.path.isfile(PROF_FILE):
            with open(PROF_FILE, "r") as f:
                seen_ids = {line.strip() for line in f}
            if self.id in seen_ids:
                return
        with open(PROF_FILE, "a") as f:
            f.write(self.id + "\n")

        for index, image_url in enumerate(self.images):
            req = requests.get(image_url, stream=True)
            if req.status_code == 200:
                with open(f"{folder}/{self.id}_{self.name}_{index}.jpeg", "wb") as f:
                    f.write(req.content)
            # Random pause so the image CDN is not hit in quick succession
            sleep(random()*sleep_max_for)
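Note that I added some random sleeps here and there, because we would likely get blocked if we spammed the Tinder CDN and downloaded many pictures within just a few seconds.
We write all profile IDs into a file called "profiles.txt". By first scanning that file for a particular person, we can skip people we have already encountered and make sure we do not classify anyone several times (you will see later why this is a risk).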
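The bookkeeping described above can be sketched in isolation. This is a minimal illustration, assuming one profile ID per line; the function name remember_profile is hypothetical, and a temporary directory stands in for the real "profiles.txt" location.

```python
import os
import tempfile


def remember_profile(profile_id, path):
    """Return True if profile_id is new (and record it), False if it was seen before."""
    seen = set()
    if os.path.exists(path):
        with open(path, "r") as f:
            # Strip newlines so the comparison is against the bare ID
            seen = {line.strip() for line in f}
    if profile_id in seen:
        return False
    with open(path, "a") as f:
        f.write(profile_id + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    prof_file = os.path.join(tmp, "profiles.txt")
    first = remember_profile("4adfwe547s8df64df", prof_file)
    second = remember_profile("4adfwe547s8df64df", prof_file)
    print(first, second)  # True False
```

Comparing against stripped lines matters: a raw line read from the file still carries its trailing newline and would never equal the bare ID.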
Now we can loop over the people nearby and download their images into the "unclassified" folder.
if __name__ == "__main__":
    token = "YOUR-API-TOKEN"
    api = tinderAPI(token)

    while True:
        persons = api.nearby_persons()
        for person in persons:
            person.download_images(folder="./images/unclassified", sleep_max_for=random()*3)
            sleep(random()*10)
        sleep(random()*10)
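Now we can simply start this script and let it run for a few hours to collect a few hundred profile images of people nearby. If you are a Tinder PRO user, update your location now and then to get new people.
Step 4: Classify the images manually
Now that we have plenty of images to work with, let's build a really simple and ugly classifier.
It simply loops over all images in the "unclassified" folder and opens each one in a GUI window. Right-clicking a person marks them as "dislike", while a left-click marks them as "like". The label is recorded in the filename: 4tz3kjldfj3482.jpg is renamed to 1_4tz3kjldfj3482.jpg if we mark the image as "like", or to 0_4tz3kjldfj3482.jpg otherwise. In other words, like/dislike is encoded as 1/0 at the beginning of the filename.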
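The naming convention can be captured in two tiny helpers. This is only a sketch of the encoding described above; the function names add_label and read_label are hypothetical and do not appear in the project.

```python
def add_label(filename, liked):
    """Prefix the filename with "1_" for like or "0_" for dislike."""
    return ("1_" if liked else "0_") + filename


def read_label(filename):
    """Return True for like, False for dislike, None if the file is unlabeled."""
    if filename.startswith("1_"):
        return True
    if filename.startswith("0_"):
        return False
    return None


liked_name = add_label("4tz3kjldfj3482.jpg", liked=True)
print(liked_name)  # 1_4tz3kjldfj3482.jpg
print(read_label(liked_name), read_label("4tz3kjldfj3482.jpg"))  # True None
```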
We use tkinter to write this GUI quickly:
from os import listdir, rename
from os.path import isfile, join
import tkinter as tk
from PIL import ImageTk, Image

IMAGE_FOLDER = "./images/unclassified"

images = [f for f in listdir(IMAGE_FOLDER) if isfile(join(IMAGE_FOLDER, f))]
# Lazily yield the downloaded .jpeg files that carry no "0_" or "1_" label yet
# (the extension check also skips profiles.txt, which lives in the same folder)
unclassified_images = filter(
    lambda image: image.endswith(".jpeg") and not (image.startswith("0_") or image.startswith("1_")), images)
current = None


def next_img():
    global current, unclassified_images
    try:
        current = next(unclassified_images)
    except StopIteration:
        # No images left: stop the main loop and do not try to load anything
        current = None
        root.quit()
        return
    print(current)
    pil_img = Image.open(IMAGE_FOLDER+"/"+current)
    width, height = pil_img.size
    # Scale tall images down so they fit on the screen
    max_height = 1000
    if height > max_height:
        resize_factor = max_height / height
        pil_img = pil_img.resize((int(width*resize_factor), int(height*resize_factor)), resample=Image.LANCZOS)
    img_tk = ImageTk.PhotoImage(pil_img)
    # Keep a reference on the label so the image is not garbage collected
    img_label.img = img_tk
    img_label.config(image=img_label.img)


def positive(arg):
    global current
    if current is None:
        return
    rename(IMAGE_FOLDER+"/"+current, IMAGE_FOLDER+"/1_"+current)
    next_img()


def negative(arg):
    global current
    if current is None:
        return
    rename(IMAGE_FOLDER + "/" + current, IMAGE_FOLDER + "/0_" + current)
    next_img()


if __name__ == "__main__":
    root = tk.Tk()

    img_label = tk.Label(root)
    img_label.pack()
    # Left click = like, right click = dislike
    img_label.bind("<Button-1>", positive)
    img_label.bind("<Button-3>", negative)

    # Skips the current image without labeling it
    btn = tk.Button(root, text='Next image', command=next_img)
    btn.pack()

    next_img()  # load first image

    root.mainloop()
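We collect all unclassified images in the "unclassified_images" iterator, open a tkinter window, pack the first image into it by calling next_img(), and resize the image to fit the screen. Then we register two clicks, left and right mouse button, which call the positive/negative functions. These rename the image according to its label and show the next image.
Ugly but effective.
Step 5: Develop a preprocessor to crop out only the person in the images
For the next step, we need to bring the image data into a format that allows us to do classification. Given our dataset, there are a few difficulties we have to consider.
We combat these challenges by:
The first part is as easy as using Pillow to open the image and convert it to greyscale. For the second part, we use the TensorFlow Object Detection API with the MobileNet network architecture, pretrained on the COCO dataset, which also contains a label for "Person".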
https://github.com/tensorflow/models/tree/master/research/object_detection
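The person detection script has four parts:
Part 1: Open the pretrained MobileNet COCO model as a TensorFlow graph
You can find the .pb file for the TensorFlow MobileNet COCO graph in the GitHub repository. Let's open it as a TensorFlow graph: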
import tensorflow as tf  # TensorFlow 1.x API (tf.GraphDef, tf.gfile, tf.Session)


def open_graph():
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        # Read the frozen graph from disk and import it into detection_graph
        with tf.gfile.GFile('ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb', 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')
    return detection_graph
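Part 2: Load images as numpy arrays
We use Pillow for image manipulation. Since TensorFlow needs raw numpy arrays to work with the data, let's write a small function that converts Pillow images to numpy arrays: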
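import numpy as np


def load_image_into_numpy_array(image):
    # Force three channels so the reshape below also works for greyscale or RGBA input
    image = image.convert("RGB")
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)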
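Part 3: Call the object detection API
The next function takes the image and a TensorFlow session, runs the detection graph with it, and returns all information about the detected classes (object types), the bounding boxes, and the scores (the certainty that an object was detected correctly).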
import numpy as np
from object_detection.utils import ops as utils_ops
import tensorflow as tf


def run_inference_for_single_image(image, sess):
    # Collect the output tensors that are available in the loaded graph
    ops = tf.get_default_graph().get_operations()
    all_tensor_names = {output.name for op in ops for output in op.outputs}
    tensor_dict = {}
    for key in [
        'num_detections', 'detection_boxes', 'detection_scores',
        'detection_classes', 'detection_masks'
    ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
            tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                tensor_name)
    if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[1], image.shape[2])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
    image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

    # Run inference
    output_dict = sess.run(tensor_dict,
                           feed_dict={image_tensor: image})

    # all outputs are float32 numpy arrays, so convert types as appropriate
    output_dict['num_detections'] = int(output_dict['num_detections'][0])
    output_dict['detection_classes'] = output_dict[
        'detection_classes'][0].astype(np.int64)
    output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
    output_dict['detection_scores'] = output_dict['detection_scores'][0]
    if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict
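Part 4: Bring it all together to find the person
The last step is to write a function that takes an image path, opens it using Pillow, calls the object detection API, and crops the image according to the bounding box of the detected person.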
import numpy as np
from PIL import Image

PERSON_CLASS = 1
SCORE_THRESHOLD = 0.5


def get_person(image_path, sess):
    img = Image.open(image_path)
    image_np = load_image_into_numpy_array(img)
    image_np_expanded = np.expand_dims(image_np, axis=0)
    output_dict = run_inference_for_single_image(image_np_expanded, sess)

    # Keep the boxes of all confidently detected persons
    persons_coordinates = []
    for i in range(len(output_dict["detection_boxes"])):
        score = output_dict["detection_scores"][i]
        classtype = output_dict["detection_classes"][i]
        if score > SCORE_THRESHOLD and classtype == PERSON_CLASS:
            persons_coordinates.append(output_dict["detection_boxes"][i])

    w, h = img.size
    # Boxes are normalized [ymin, xmin, ymax, xmax]; crop to the first detected person
    for person_coordinate in persons_coordinates:
        cropped_img = img.crop((
            int(w * person_coordinate[1]),
            int(h * person_coordinate[0]),
            int(w * person_coordinate[3]),
            int(h * person_coordinate[2]),
        ))
        return cropped_img
    return None
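The index shuffling in the crop call is easier to see in isolation. The detection boxes are normalized [ymin, xmin, ymax, xmax] values, while Pillow's crop expects pixel coordinates as (left, upper, right, lower). The following sketch performs just that conversion; the helper name box_to_pixels and the example box are made up for illustration.

```python
def box_to_pixels(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to (left, upper, right, lower) pixels."""
    ymin, xmin, ymax, xmax = box
    return (
        int(width * xmin),
        int(height * ymin),
        int(width * xmax),
        int(height * ymax),
    )


# A made-up detection covering the middle of a 640x800 image
crop_box = box_to_pixels([0.125, 0.25, 0.875, 0.75], width=640, height=800)
print(crop_box)  # (160, 100, 480, 700)
```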
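Part 5: Move all images into their classified folders
As a last step, we write a script that loops over all images in the "unclassified" folder, checks whether they carry an encoded label in their name, and saves a copy in the corresponding "classified" folder after applying the preprocessing steps developed above: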
import os
import person_detector
import tensorflow as tf

IMAGE_FOLDER = "./images/unclassified"
POS_FOLDER = "./images/classified/positive"
NEG_FOLDER = "./images/classified/negative"


if __name__ == "__main__":
    detection_graph = person_detector.open_graph()

    images = [f for f in os.listdir(IMAGE_FOLDER) if os.path.isfile(os.path.join(IMAGE_FOLDER, f))]
    # Map each label prefix to the folder its images are saved to
    targets = {"1_": POS_FOLDER, "0_": NEG_FOLDER}
    for folder in targets.values():
        os.makedirs(folder, exist_ok=True)

    with detection_graph.as_default():
        with tf.Session() as sess:
            for prefix, folder in targets.items():
                for name in filter(lambda image: image.startswith(prefix), images):
                    old_filename = IMAGE_FOLDER + "/" + name
                    new_filename = folder + "/" + os.path.splitext(name)[0] + ".jpg"
                    # Skip images that were already processed in an earlier run
                    if os.path.isfile(new_filename):
                        continue
                    # Crop to the detected person; skip images without one
                    img = person_detector.get_person(old_filename, sess)
                    if img is None:
                        continue
                    # Convert to greyscale before saving
                    img = img.convert('L')
                    img.save(new_filename, "jpeg")
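Whenever we run this script, all labeled images are processed and saved into the corresponding subfolder of the "classified" directory.
Step 6: Retrain InceptionV3 and write a classifier
For the retraining part, we simply use TensorFlow's retrain.py script with the InceptionV3 model.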
https://github.com/tensorflow/hub/blob/master/examples/image_retraining/retrain.py
Call the script in your project root directory with the following parameters:
python retrain.py --bottleneck_dir=tf/training_data/bottlenecks --model_dir=tf/training_data/inception --summaries_dir=tf/training_data/summaries/basic --output_graph=tf/training_output/retrained_graph.pb --output_labels=tf/training_output/retrained_labels.txt --image_dir=./images/classified --how_many_training_steps=50000 --testing_percentage=20 --learning_rate=0.001
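On a GTX 1080 Ti, training takes about 15 minutes and reaches a final accuracy of about 80% on my labeled dataset, but this depends heavily on the quality of the input data and the labeling.
The result of the training process is a retrained InceptionV3 model in the file "tf/training_output/retrained_graph.pb". We must now write a Classifier class that efficiently uses the new weights in the TensorFlow graph to make a classification prediction.
Let's write a Classifier class that opens the graph as a session and offers a "classify" method. The method takes an image file and returns a dict with certainty values for our labels "positive" and "negative".
The class takes both the path to the graph and the path to the label file as input; both are located in our "tf/training_output/" folder. We develop a helper function for converting an image file to a tensor that can be fed into the graph, helper functions for loading the graph and the labels, and an important little function to close the session once we are done using it.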
import numpy as np
import tensorflow as tf


class Classifier():
    def __init__(self, graph, labels):
        self._graph = self.load_graph(graph)
        self._labels = self.load_labels(labels)

        self._input_operation = self._graph.get_operation_by_name("import/Placeholder")
        self._output_operation = self._graph.get_operation_by_name("import/final_result")

        # One long-lived session that is reused for every classification
        self._session = tf.Session(graph=self._graph)

    def classify(self, file_name):
        t = self.read_tensor_from_image_file(file_name)

        # Run the image tensor through the retrained graph
        results = self._session.run(self._output_operation.outputs[0], {self._input_operation.outputs[0]: t})
        results = np.squeeze(results)

        # Sort the output predictions by certainty, highest first
        top_k = results.argsort()[-5:][::-1]
        result = {}
        for i in top_k:
            result[self._labels[i]] = results[i]

        # Return a dict mapping each label to its certainty
        return result

    def close(self):
        self._session.close()

    @staticmethod
    def load_graph(model_file):
        graph = tf.Graph()
        graph_def = tf.GraphDef()
        with open(model_file, "rb") as f:
            graph_def.ParseFromString(f.read())
        with graph.as_default():
            tf.import_graph_def(graph_def)
        return graph

    @staticmethod
    def load_labels(label_file):
        label = []
        proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
        for l in proto_as_ascii_lines:
            label.append(l.rstrip())
        return label

    @staticmethod
    def read_tensor_from_image_file(file_name,
                                    input_height=299,
                                    input_width=299,
                                    input_mean=0,
                                    input_std=255):
        input_name = "file_reader"
        file_reader = tf.read_file(file_name, input_name)
        image_reader = tf.image.decode_jpeg(
            file_reader, channels=3, name="jpeg_reader")
        float_caster = tf.cast(image_reader, tf.float32)
        dims_expander = tf.expand_dims(float_caster, 0)
        resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
        normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
        # Close the temporary session again so it does not leak on every call
        with tf.Session() as sess:
            result = sess.run(normalized)
        return result
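What the classify method does with the raw network output can be shown without TensorFlow: it ranks the scores and pairs each one with the label at the same index. The sketch below mirrors that step in plain Python. The function name top_labels, the scores, and the label order are assumptions for illustration, since the actual order depends on the generated retrained_labels.txt.

```python
def top_labels(scores, labels, k=5):
    """Pair each score with its label and return the k best, highest score first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return {labels[i]: scores[i] for i in ranked}


# Made-up output of the final layer for one image
result = top_labels([0.27, 0.73], ["negative", "positive"])
print(result)  # {'positive': 0.73, 'negative': 0.27}
```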
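Step 7: Use all of this to actually auto-play Tinder
Now that we have our classifier in place, let's extend the "Person" class from earlier with a "predict_likeliness" function that uses a classifier instance to decide whether a given person should be liked or not.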
    # In the Person class
    def predict_likeliness(self, classifier, sess):
        ratings = []
        for image in self.images:
            req = requests.get(image, stream=True)
            tmp_filename = "./images/tmp/run.jpg"
            if req.status_code == 200:
                with open(tmp_filename, "wb") as f:
                    f.write(req.content)
                # Crop to the detected person and classify the greyscale result
                img = person_detector.get_person(tmp_filename, sess)
                if img is not None:
                    img = img.convert('L')
                    img.save(tmp_filename, "jpeg")
                    certainty = classifier.classify(tmp_filename)
                    pos = certainty["positive"]
                    ratings.append(pos)
        # Only the five best-rated images count
        ratings.sort(reverse=True)
        ratings = ratings[:5]
        if len(ratings) == 0:
            return 0.001
        if len(ratings) == 1:
            # A single rating has no "rest" to average, so use it directly
            return ratings[0]
        # The best image counts 60%, the mean of the remaining ones 40%
        return ratings[0]*0.6 + sum(ratings[1:])/len(ratings[1:])*0.4
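The weighting at the end of predict_likeliness is the core of the scoring, so here it is as a standalone sketch with made-up ratings. The function name combine_ratings is hypothetical. Returning a single rating unchanged is an assumption of this sketch: the formula averages the remaining ratings and therefore needs at least two values.

```python
def combine_ratings(ratings, top_n=5, best_weight=0.6, fallback=0.001):
    """Weight the best rating against the mean of the next-best ones."""
    ranked = sorted(ratings, reverse=True)[:top_n]
    if not ranked:
        return fallback  # no usable image at all
    if len(ranked) == 1:
        return ranked[0]  # assumption: nothing to average, use the rating as is
    rest = ranked[1:]
    return ranked[0] * best_weight + sum(rest) / len(rest) * (1 - best_weight)


score = combine_ratings([0.9, 0.5, 0.7])
# 0.9 * 0.6 + ((0.7 + 0.5) / 2) * 0.4 = 0.54 + 0.24
print(round(score, 2))  # 0.78
```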
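Now we have to bring all the puzzle pieces together.
First, we initialize the Tinder API with our API token. Then we open the classification TensorFlow graph as a TensorFlow session, using our retrained graph and labels. Finally, we fetch the people nearby and make a likeliness prediction for each of them.
As a little bonus, I added a likeliness multiplier of 1.2 if the person on Tinder goes to the same university as I do, so that I am more likely to match with local students.
For everyone with a predicted likeliness score above 0.8, I call a like; for all others, a dislike.
I developed the script to auto-play for two hours after it is started.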
from likeliness_classifier import Classifier
import person_detector
import tensorflow as tf
from time import time

if __name__ == "__main__":
    token = "YOUR-API-TOKEN"
    api = tinderAPI(token)

    detection_graph = person_detector.open_graph()
    with detection_graph.as_default():
        with tf.Session() as sess:

            classifier = Classifier(graph="./tf/training_output/retrained_graph.pb",
                                    labels="./tf/training_output/retrained_labels.txt")

            # Auto-play for two hours
            end_time = time() + 60*60*2
            while time() < end_time:
                try:
                    persons = api.nearby_persons()
                    pos_schools = ["Universität Zürich", "University of Zurich", "UZH"]

                    for person in persons:
                        score = person.predict_likeliness(classifier, sess)

                        # Apply the school bonus once, even if several names match
                        for school in pos_schools:
                            if school in person.schools:
                                score *= 1.2
                                break

                        print("-------------------------")
                        print("ID: ", person.id)
                        print("Name: ", person.name)
                        print("Schools: ", person.schools)
                        print("Images: ", person.images)
                        print(score)

                        if score > 0.8:
                            res = person.like()
                            print("LIKE")
                        else:
                            res = person.dislike()
                            print("DISLIKE")
                except Exception as e:
                    # Keep playing, but report what went wrong
                    print("Error:", e)

    classifier.close()
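The decision rule inside the loop can be isolated into a small function, which makes the effect of the school bonus easy to check. This is a sketch with made-up scores; the function name decide and its parameters are hypothetical, and the bonus is applied once when any preferred school matches.

```python
def decide(score, schools, preferred_schools, multiplier=1.2, threshold=0.8):
    """Apply the school bonus once, then compare the score against the threshold."""
    if any(school in schools for school in preferred_schools):
        score *= multiplier
    action = "LIKE" if score > threshold else "DISLIKE"
    return action, score


preferred = ["Universität Zürich", "University of Zurich", "UZH"]
with_bonus = decide(0.7, ["UZH"], preferred)
without_bonus = decide(0.7, [], preferred)
print(with_bonus[0], without_bonus[0])  # LIKE DISLIKE
```

With the bonus, a score of 0.7 rises to about 0.84 and crosses the 0.8 threshold; without it, the same person is disliked.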
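Now we can let the script run for as long as we like and play Tinder without wearing out our thumbs!
If you have questions or find bugs, feel free to contribute to the GitHub repository.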
https://github.com/joelbarmettlerUZH/auto-tinder