I Won't Use AI or ML Models to Predict Prices. Let's Try Something Different!

Let's go straight to the code.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
from scipy import stats
pd.options.mode.chained_assignment = None  # default='warn'
def load_db(file):
    fix_price = lambda x: float(str(x).replace(',',''))
    df = pd.read_csv(file)
    # clean data
    if 'Price' in df.columns:
        df.rename(columns={"Price":"Close"}, inplace=True)
    cols = []
    if 'Change %' in df.columns:
        cols.append('Change %')
    if 'Vol.' in df.columns:
        cols.append('Vol.')
    df.drop(cols, axis=1, inplace=True)
    try:
        df['Date'] = pd.to_datetime(df['Date'], format="%b %d, %Y")
    except (ValueError, TypeError):
        # fall back to pandas' automatic date parsing
        df['Date'] = pd.to_datetime(df['Date'])
    df['Open'] = df['Open'].apply(fix_price)
    df['High'] = df['High'].apply(fix_price)
    df['Low'] = df['Low'].apply(fix_price)
    df['Close'] = df['Close'].apply(fix_price)
    # This is the tricky part!!!
    # Calculate how far the price moved from the Open each day, as a ratio to the Open!
    df['MaxHigh'] = df['High']/df['Open']
    df['MaxLow'] = df['Low']/df['Open']
    df['MaxClose'] = df['Close']/df['Open']
    # I will use prices beginning 2014
    return df[df['Date']>='2014-01-01']

Did you notice the MaxHigh, MaxLow and MaxClose columns? They are the key! We will use these columns to compute the probabilities.
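
To make the three ratio columns concrete, here is a minimal sketch on a made-up three-day sample. The `bars` frame and its prices are hypothetical and only illustrate the arithmetic; the column names follow `load_db`.

```python
import pandas as pd

# Hypothetical three-day OHLC sample (made-up prices, for illustration only).
bars = pd.DataFrame({
    "Open":  [1.2500, 1.2600, 1.2550],
    "High":  [1.2625, 1.2600, 1.2610],
    "Low":   [1.2450, 1.2520, 1.2500],
    "Close": [1.2600, 1.2540, 1.2550],
})

# Same ratios as load_db: each price relative to that day's open.
for name, src in [("MaxHigh", "High"), ("MaxLow", "Low"), ("MaxClose", "Close")]:
    bars[name] = bars[src] / bars["Open"]

# Convert the ratios to percent moves from the open, as show_min_max does.
pct = (bars[["MaxHigh", "MaxLow", "MaxClose"]] - 1) * 100
print(pct.round(2))
```

A ratio of 1.01 in MaxHigh means the high was 1% above the open. Because every day is expressed relative to its own open, days at different price levels become comparable.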

sym = "GBP_USD"
period = "1d"
file = "./{} {}.csv".format(period, sym)
db = load_db(file).reset_index(drop=True)
df = db  # the rest of the code refers to this DataFrame as df
db.tail()
def show_min_max(df):
    # find max and min values for High
    max_high = (df['MaxHigh'].max()-1)*100
    min_high = (df['MaxHigh'].min()-1)*100

    # find max and min values for Low
    max_low = (df['MaxLow'].max()-1)*100
    min_low = (df['MaxLow'].min()-1)*100

    # find max and min values for Close
    max_close = (df['MaxClose'].max()-1)*100
    min_close = (df['MaxClose'].min()-1)*100

    print("OpenToHigh\nMax: {:.2f}%\nMin: {:.2f}%\n".format(max_high, min_high))
    print("OpenToLow\nMax: {:.2f}%\nMin: {:.2f}%\n".format(max_low, min_low))
    print("OpenToClose\nMax: {:.2f}%\nMin: {:.2f}%\n".format(max_close, min_close))
show_min_max(db)
OpenToHigh
Max: 3.08%
Min: -0.05%

OpenToLow
Max: 0.00%
Min: -11.07%

OpenToClose
Max: 3.06%
Min: -8.02%

Now let's write a function to compute the range:

def prob(df, col, p):
    # p-th percentile of the column, ignoring missing values
    lp = np.percentile(df[col].dropna(), p)
    fig, ax = plt.subplots(figsize=(20,10))
    # plot the series against dates and mark the percentile with a red line
    ax.plot(df['Date'], df[col], ls='-')
    ax.xaxis.set_tick_params(rotation=30, labelsize=10)
    ax.axhline(lp, color='r')
    return lp

In this particular case, we want to win 95% of our trades.

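Define the probability we are interested in. I chose 95%, but you can change it. The lower the probability, the larger the profit per trade! Note that the percentile is computed on the data from the start up to, but not including, the last 100 days.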
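
The idea behind the threshold can be sketched on synthetic data. The `max_high` array below is a randomly generated stand-in for the MaxHigh column, not real prices; only the use of `np.percentile` mirrors the article.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the MaxHigh column: ratios at or slightly above 1.
max_high = 1 + np.abs(rng.normal(0, 0.004, size=1000))

prb = 95
# The (100 - prb)th percentile: about prb% of observations lie above it.
threshold = np.percentile(max_high, 100 - prb)
share_above = (max_high > threshold).mean()
print(f"threshold ratio: {threshold:.5f}, share above: {share_above:.1%}")
```

Asking for the 5th percentile of the High/Open ratio is the same as asking for a level that the daily high exceeded on roughly 95% of past days.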

prb = 95
col = 'MaxHigh' # Let's find prb% of High (100-prb)
days = 100 # This is the testing size. So we will test last 100 days and exclude them from the calculation.
p = prob(df[:-days], col, 100-prb)
# get the current bar's open price
open_price = df['Open'].iloc[-1]
prediction = open_price * p
ticks = abs(open_price-(p*open_price))
monthly = 22 * ticks # 22 trading days in a month
mp = int(monthly / 0.0001)
print("Open Price: {:.5f}\nPrediction: {:.5f}\nProbability: {}%\nTicks: {:.5f}\nMonthly Ticks: {:.5f} ({} pips)".format(open_price, prediction, prb, ticks, monthly, mp))
Open Price: 1.26940
Prediction: 1.26998
Probability: 95%
Ticks: 0.00058
Monthly Ticks: 0.01279 (127 pips)

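If you open a long position at each day's open with a profit target of 5-6 pips, you would have won about 95% of the trades on this data. 95% of the values lie above the red line!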
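
The 5-6 pip figure follows directly from the printed output. A small worked calculation, using the rounded values shown above (so the result is approximate) and the pip size of 0.0001 that the article's monthly calculation uses:

```python
PIP = 0.0001  # pip size used in the article's monthly calculation

open_price = 1.26940   # from the printed output
prediction = 1.26998   # open_price * p, from the printed output

ticks = abs(prediction - open_price)
pips_per_day = ticks / PIP
monthly_pips = 22 * pips_per_day  # 22 trading days, as in the article
print(f"{pips_per_day:.1f} pips/day, about {int(monthly_pips)} pips/month")
```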

Let's backtest on out-of-sample data!

# Create a prediction column for each day by multiplying the open with the value we got from our model.
d = df[-days:].copy() # get unseen data from our dataset (copy so we can add columns safely)
# p: value we got from our model
d['HighPredicts'] = d['Open'] * p
total = len(d)
# if the High price of the day cross over our prediction, we will win!!!
won = len(d[d['High']>d['HighPredicts']])
hit_rate = won*100/total
print("Hit rate {:.2f}% of {} trades!".format(hit_rate, total))
Hit rate 96.00% of 100 trades!

NB! The hit rate is above 95%!
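
The fit-then-score pattern above can be wrapped in a small helper. This is a sketch under stated assumptions: `hit_rate` is a hypothetical function name, and the prices are synthetic (a constant open with random highs), chosen only so the block runs on its own.

```python
import numpy as np

def hit_rate(open_, extreme, ratio, side="long"):
    """Percent of days on which the extreme (High for long, Low for short)
    crossed the target open * ratio."""
    target = np.asarray(open_) * ratio
    extreme = np.asarray(extreme)
    hits = extreme > target if side == "long" else extreme < target
    return 100.0 * hits.mean()

rng = np.random.default_rng(1)
n, days = 1000, 100
open_ = np.full(n, 1.2700)
high = open_ * (1 + np.abs(rng.normal(0, 0.004, size=n)))

# Fit the threshold on everything except the last `days` rows...
ratio = np.percentile(high[:-days] / open_[:-days], 5)
# ...and score it only on the held-out rows.
oos = hit_rate(open_[-days:], high[-days:], ratio)
print(f"out-of-sample hit rate: {oos:.1f}%")
```

Keeping the threshold calculation and the scoring on disjoint slices is what makes the reported hit rate an out-of-sample number.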

Note that as you raise the win rate, your profit per trade falls.
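
This trade-off can be seen by sweeping the target win rate. The sketch below uses a synthetic `max_high` array rather than real data; the open price is the one from the article's output, and the pip size of 0.0001 is the one used in its monthly calculation.

```python
import numpy as np

rng = np.random.default_rng(2)
max_high = 1 + np.abs(rng.normal(0, 0.004, size=2000))  # synthetic ratios
open_price = 1.26940  # open price from the article's output

rows = []
for prb in (50, 75, 90, 95, 99):
    ratio = np.percentile(max_high, 100 - prb)
    pips = (ratio - 1) * open_price / 0.0001
    rows.append((prb, pips))
    print(f"win rate {prb:2d}% -> target {pips:5.1f} pips above the open")
```

A higher win rate means a lower percentile, which puts the target closer to the open and so shrinks the profit per winning trade.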

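Now let's try the same approach on short positions. We use the same function to find a target for the day's low, but this time with the prb percentile instead of 100-prb!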

col = 'MaxLow' 
p = prob(df[:-days], col, prb)
# get the current bar's open price
open_price = df['Open'].iloc[-1]
prediction = open_price * p
ticks = abs(open_price-(p*open_price))
monthly = 22 * ticks
mp = int(monthly / 0.0001)
print("Open Price: {:.5f}\nPrediction: {:.5f}\nProbability: {}%\nTicks: {:.5f}\nMonthly Ticks: {:.5f} ({} pips)".format(open_price, prediction, prb, ticks, monthly, mp))
Open Price: 1.26940
Prediction: 1.26880
Probability: 95%
Ticks: 0.00060
Monthly Ticks: 0.01316 (131 pips)

The chart shows that 95% of the values fall below the red line. The profit is 60 ticks (6 pips) per day.

Out of sample:

d = df[-days:]
d['LowPredicts'] = d['Open'] * p
total = len(d)
# if Low of the day cross under our prediction, we will win!!!
won = len(d[d['Low']<d['LowPredicts']])
hit_rate = won*100/total
print("Hit rate {:.2f}% of {} trades!".format(hit_rate, total))
Hit rate 92.00% of 100 trades!
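
How far apart are 96% and 92% from the 95% target when only 100 trades are tested? A rough check, assuming a simple binomial model with independent trades (an assumption that real price series may violate):

```python
import math

p_target = 0.95   # in-sample win rate used to pick the threshold
n = 100           # number of out-of-sample trades

# Standard error of a proportion under a simple binomial model
# (assumes independent trades, which real price series may violate).
se = math.sqrt(p_target * (1 - p_target) / n)

for observed in (0.96, 0.92):  # hit rates printed in the article
    z = (observed - p_target) / se
    print(f"observed {observed:.0%}: {z:+.2f} standard errors from 95%")
```

With 100 trades, the standard error is a little over two percentage points, so differences of a few points between the long and short hit rates are about the size one would expect from sampling noise alone.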

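We have now established the 95% range for the high and the low. Finally, let's compute the range for the closing price instead of the high/low and see whether there is a difference!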

col = 'MaxClose' # Close range: 100-prb percentile of up days, prb percentile of down days
# Note: unlike the earlier steps, these two calls use the full df, including the last `days` rows
max_close = prob(df[df['MaxClose']>1], col, 100-prb) # only for green bars
min_close = prob(df[df['MaxClose']<1], col, prb) # only for red bars
# get the current bar's open price
open_price = df['Open'].iloc[-1]
prediction_max = open_price * max_close
prediction_min = open_price * min_close
print("Open Price: {:.5f}\nPrediction: {:.5f} - {:.5f}\nProbability: {}%".format(open_price, prediction_min, prediction_max, prb))                                             
print("Low ticks: {:.5f}".format(open_price-prediction_min))
print("High ticks: {:.5f}".format(prediction_max-open_price))
Open Price: 1.26940
Prediction: 1.26901 - 1.26973
Probability: 95%
Low ticks: 0.00039
High ticks: 0.00033

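This is interesting! When we targeted the 95% levels for the high and the low, we got larger moves (about 6 pips each) than the 3-4 pips we get here. So why would we need to predict the closing price at all?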
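
The close band is built by splitting days into green and red bars before taking percentiles. A minimal sketch of that conditioning on synthetic Close/Open ratios (the `max_close` array is randomly generated, not real data):

```python
import numpy as np

rng = np.random.default_rng(3)
max_close = 1 + rng.normal(0, 0.004, size=2000)  # synthetic close/open ratios

prb = 95
green = max_close[max_close > 1]   # days that closed above the open
red = max_close[max_close < 1]     # days that closed below the open

upper = np.percentile(green, 100 - prb)  # ~95% of green closes end above this
lower = np.percentile(red, prb)          # ~95% of red closes end below this
print(f"band around the open: {lower:.5f} .. {upper:.5f}")
```

Each level only describes days of its own colour, so the 95% figure here is conditional on the bar's direction, which is not known at the open.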

Run a backtest and see whether we get the same probabilities!

# Create a prediction column for each day
d = df[-days:]
d['CloseLowPredicts'] = d['Open'] * min_close
d['CloseHighPredicts'] = d['Open'] * max_close
total = len(d)

# We are predicting a level we can make money. In this article, we are not trying to predict actual closing price
# That's why we are going to try to take profit!

# if Low of the day cross under our prediction, we will win!!!
low_won = len(d[d['Low']<d['CloseLowPredicts']])

# if High of the day cross over our prediction, we will win!!!
high_won = len(d[d['High']>d['CloseHighPredicts']])
low_hit_rate = low_won*100/total
high_hit_rate = high_won*100/total
print("Hit rate {:.2f}% for Low - {:.2f}% for High".format(low_hit_rate, high_hit_rate))
Hit rate 94.00% for Low - 98.00% for High

Almost the same.

Add two moving averages:

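def prob_ma(df, col, p, is_low=False):
    df = df.copy()  # avoid adding columns to the caller's DataFrame
    # fast (5-day) and slow (220-day) moving averages of the close
    df['MA1'] = df['Close'].rolling(5).mean()
    df['MA2'] = df['Close'].rolling(220).mean()
    if is_low:
        cond = df['MA2']>df['MA1']  # downtrend: use for short targets
    else:
        cond = df['MA1']>df['MA2']  # uptrend: use for long targets
    # percentile over the days that match the trend condition
    values = df[col][cond].dropna()
    return np.percentile(values, p), len(values)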
col = 'MaxHigh' # Let's find the (100-prb)th percentile of High
days = 100 # This is the testing size. So we will test last 100 days and exclude them from the calculation.
p, trades = prob_ma(df[:-days], col, 100-prb)
# get the current bar's open price
open_price = df['Open'].iloc[-1]
prediction = open_price * p
ticks = abs(open_price-(p*open_price))
monthly = 22 * ticks # 22 trading days in a month
mp = int(monthly / 0.0001)
print("Trades: {}/{}\nOpen Price: {:.5f}\nPrediction: {:.5f}\nProbability: {}%\nTicks: {:.5f}\nMonthly Ticks: {:.5f} ({} pips)".format(trades, len(df[:-days]), open_price, prediction, prb, ticks, monthly, mp))
Trades: 286/1025
Open Price: 1.26940
Prediction: 1.27023
Probability: 95%
Ticks: 0.00083
Monthly Ticks: 0.01833 (183 pips)
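
The effect of the trend filter can be sketched on synthetic data. Everything in `df_toy` below is generated (a slow cycle plus noise) purely so the block runs on its own; the 5-day and 220-day windows are the ones used in `prob_ma`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 1200
t = np.arange(n)
# Synthetic close: a slow cycle plus noise, so both up- and downtrends occur.
close = pd.Series(1.30 + 0.05 * np.sin(2 * np.pi * t / 400)
                  + rng.normal(0, 0.002, size=n))
open_ = close.shift(1).fillna(close.iloc[0])
high = np.maximum(open_, close) * (1 + np.abs(rng.normal(0, 0.002, size=n)))
df_toy = pd.DataFrame({"Open": open_, "High": high, "Close": close})
df_toy["MaxHigh"] = df_toy["High"] / df_toy["Open"]

fast = df_toy["Close"].rolling(5).mean()
slow = df_toy["Close"].rolling(220).mean()
uptrend = fast > slow  # False wherever either average is still NaN

all_days = np.percentile(df_toy["MaxHigh"], 5)
trend_days = np.percentile(df_toy.loc[uptrend, "MaxHigh"], 5)
print(f"uptrend days: {uptrend.sum()}/{n}")
print(f"5th percentile, all days: {all_days:.5f}, uptrend only: {trend_days:.5f}")
```

The filter trades only on the subset of days that match the trend, which is why the article's output reports 286 qualifying days out of 1025.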

Summary

It looks like we are better off predicting the high/low than the closing price, because they offer more ticks at a higher hit rate!

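We have used the word "prediction" most of the time. But this is not prediction. It is simply applying statistical methods and computing probabilities.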

Author: Atilla Yurtseven

Source: https://www.tradingview.com/u/Dumani/

This article was shared from the WeChat official account 量化投资与机器学习 (Lhtz_Jqxx); author: QIML editorial team

Originally published: 2018-08-23
