
Python: Common Commands for Quantitative Analysis (Part 4)

This is the 109th article by 奔跑的键盘侠.

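MongoDB, mentioned in the last post, is actually very easy to install. The download page would not open for a few days, which made the whole thing feel more daunting than it is.

With the few commands below, everything is up and running in a couple of minutes.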

1

Basic MongoDB operations

Type mongo to enter the shell, then:

> use my_quant

switched to db my_quant

> db

my_quant

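> db.my_quant.insert({"code":"000001","date":"2015-01-05","index":true,"close":3350.52,"high":3369.28,"low":3253.88,"open":3258.63,"volume":531352391})

Three commands are enough to create a new database. MongoDB only creates the database (and the collection, here also named my_quant) once the first document is inserted.

To delete a database, use the commands below: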

> show databases

admin 0.000GB

config 0.000GB

daily 0.000GB

local 0.000GB

my_quant 0.000GB

> use daily

switched to db daily

> show collections

daily

> db

daily

> db.dropDatabase()

{ "dropped" : "daily", "ok" : 1 }

> show databases

admin 0.000GB

config 0.000GB

local 0.000GB

my_quant 0.000GB

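db.daily.drop() works here too. Strictly speaking it drops only the daily collection, but since that is the only collection in the daily database, the database no longer shows up in the list afterwards either.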

2

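Building our own database

The previous post covered the basic functions for crawling stock data. This time we store the crawled data in our own database. First, here is the rough layout of the quantitative analysis project.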

├── README
└── MyQuant_v1                  # quantitative analysis program directory
    ├── __init__.py
    ├── data                    # data processing
    │   ├── __init__.py
    │   └── data_crawler.py     # crawls index and stock data
    ├── util                    # shared utilities
    │   ├── __init__.py
    │   └── database.py         # database connection
    ├── backtest                # backtesting
    │   ├── __init__.py
    │   └── _backtest_          # planned: plot backtest curves
    ├── factor                  # factors
    │   ├── __init__.py
    │   └── _factor_.py         # not planned
    ├── strategy                # strategies
    │   ├── __init__.py
    │   └── _strategy_          # planned: a simple one, mainly for backtesting
    ├── trading                 # trading
    │   ├── __init__.py
    │   └── _trading_           # not planned
    └── log                     # log directory
        ├── __init__.py
        ├── backtest.log        # not planned
        └── transactions.log    # not planned

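As for why bother with a project layout at all: as explained before, once a program gets complicated, keeping everything in one file means that even a small change can break things badly.

Let's start with an appetizer: the database connection, which takes only 2 lines of code.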

#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
# @Time    : 2019-07-06 20:14
# @Author  : Ed Frey
# @File    : database.py
# @Software: PyCharm
from pymongo import MongoClient

DB_CONN = MongoClient('mongodb://127.0.0.1:27017')['my_quant']

Now brace yourself, here comes the heavy part!

#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
# @Time    : 2019-07-01 22:11
# @Author  : Ed Frey
# @File    : data_crawler.py
# @Software: PyCharm
import tushare as ts
from util.database import DB_CONN
from pymongo import UpdateOne
from datetime import datetime

class DataCrawler():
    def __init__(self):
        self.daily = DB_CONN['daily']
        self.daily_hfq = DB_CONN['daily_hfq']
        self.daily_qfq = DB_CONN['daily_qfq']


    def crawl_index(self,begin_date=None,end_date=None):
        '''
        to crawl the index's trading information and set up our own database
        :param begin_date: YYYY-MM-DD
        :param end_date: YYYY-MM-DD
        :return:
        '''
        codes = ['000001','000300','399001','399005','399006']

        if begin_date is None:
            begin_date = '2008-01-01'
        if end_date is None:
            end_date = datetime.now().strftime('%Y-%m-%d')

        for code in codes:
            df_daily = ts.get_k_data(code,index=True,start=begin_date,end=end_date)
            self.save_data(code, df_daily, self.daily, {'index': True})


    def save_data(self, code, df_daily, collection, extra_fields=None):
        """
        save the data into MongoDB
        :param code: stock's code
        :param df_daily: the DataFrame including k line
        :param collection: saving collection
        :param extra_fields: the other fields that will be used one day
        """
        update_requests = []
        for df_index in df_daily.index:
            daily_obj = df_daily.loc[df_index]
            doc = self.daily_obj_2_doc(code, daily_obj)

            if extra_fields is not None:
                doc.update(extra_fields)

            update_requests.append(
                UpdateOne(
                    # doc.get avoids a KeyError when extra_fields was not passed
                    {'code': doc['code'], 'date': doc['date'], 'index': doc.get('index', False)},
                    {'$set': doc},
                    upsert=True)
            )

        if len(update_requests) > 0:
            update_result = collection.bulk_write(update_requests, ordered =False)
            print('Saving index daily, code: %s, inserted: %4d, modified: %4d'
                  %(code, update_result.upserted_count, update_result.modified_count),
                  flush=True)


    def crawl(self, begin_date=None, end_date=None):
        '''
        to crawl the stock's trading information and set up our own database, including
        forward-adjusted (qfq), backward-adjusted (hfq) and unadjusted prices
        :param begin_date: YYYY-MM-DD
        :param end_date: YYYY-MM-DD
        :return:
        '''
        df_stock = ts.get_stock_basics()
        codes = list(df_stock.index)

        if begin_date is None:
            begin_date = '2008-01-01'
        if end_date is None:
            end_date = datetime.now().strftime('%Y-%m-%d')

        for code in codes:
            df_daily = ts.get_k_data(code, autype=None, start=begin_date, end=end_date)
            self.save_data(code, df_daily, self.daily, {'index': False})

            df_daily_hfq = ts.get_k_data(code, autype='hfq', start=begin_date, end=end_date)
            self.save_data(code, df_daily_hfq, self.daily_hfq, {'index': False})

            df_daily_qfq = ts.get_k_data(code, autype='qfq', start=begin_date, end=end_date)
            self.save_data(code, df_daily_qfq, self.daily_qfq, {'index': False})


    @staticmethod
    def daily_obj_2_doc(code, daily_obj):
        doc = dict(daily_obj)
        doc['code'] = code

        return doc


if __name__ == '__main__':

    dc = DataCrawler()
    dc.crawl_index(begin_date='2019-07-01',end_date='2019-07-03')
    dc.crawl(begin_date='2019-07-01', end_date='2019-07-03')

Here is a sample of the test output:

/Users/Ed_Frey/anaconda2/envs/python36/bin/python /Users/Ed_Frey/Desktop/MyQuant_v1/data/data_crawler.py

Saving index daily, code: 000001, inserted: 0, modified: 0

Saving index daily, code: 000300, inserted: 0, modified: 0

Saving index daily, code: 399001, inserted: 0, modified: 0

Saving index daily, code: 300776, inserted: 3, modified: 0

Saving index daily, code: 600230, inserted: 3, modified: 0

Saving index daily, code: 600230, inserted: 3, modified: 0

<urlopen error timed out>

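The first few lines are data that had already been crawled, so nothing was inserted or modified. The later lines are new data that was inserted. (The message says "index" for individual stocks as well, because the text printed by save_data is fixed.)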
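Why a second run reports "inserted: 0, modified: 0" can be illustrated without a database. The sketch below is only an in-memory imitation of the upsert logic used in save_data, not pymongo itself; the function name upsert_many and the dict-based store are made up for this illustration.

```python
def upsert_many(store, docs):
    """Imitate UpdateOne(filter, {'$set': doc}, upsert=True) on a plain dict.

    store maps (code, date, index) -> document.
    Returns (inserted, modified), like the counts printed by save_data.
    """
    inserted = modified = 0
    for doc in docs:
        key = (doc['code'], doc['date'], doc['index'])
        existing = store.get(key)
        if existing is None:
            # No matching document: the upsert inserts a new one.
            store[key] = dict(doc)
            inserted += 1
        else:
            # Matching document: $set only counts as a modification
            # if at least one field value actually changes.
            merged = {**existing, **doc}
            if merged != existing:
                store[key] = merged
                modified += 1
    return inserted, modified


store = {}
bars = [
    {'code': '000001', 'date': '2019-07-01', 'index': True, 'close': 3044.9},
    {'code': '000001', 'date': '2019-07-02', 'index': True, 'close': 3043.94},
]
first_run = upsert_many(store, bars)    # both documents are new
second_run = upsert_many(store, bars)   # identical data: nothing to do
print(first_run, second_run)
```

Because the filter is the (code, date, index) triple, re-crawling the same date range is safe: existing rows are matched rather than duplicated.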

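Crawling just 3 trading days of data was still not finished after half an hour.

Timeouts show up every now and then. Building a database from free online resources is bound to take some time, and the most detailed data feeds from the exchanges cost hundreds of thousands a year for VIP access. So let it crawl along slowly.

As for checking from the terminal whether the database has been updated, you can query with the appropriate parameters. Here is a sample (the output is long, so it has been trimmed):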
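One common way to live with occasional timeouts is to retry the failed request after a short, growing pause. The following is a minimal sketch and not part of the original crawler: fetch_with_retry and its parameters are hypothetical names, and it assumes the timeout surfaces as an OSError (urllib's URLError is a subclass of it). Which exception a given data library actually raises may differ.

```python
import time


def fetch_with_retry(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on OSError, doubling the wait each time.

    fetch      -- zero-argument callable that performs one request
    retries    -- number of extra attempts after the first failure
    base_delay -- seconds to wait before the first retry
    sleep      -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except OSError:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)


# Demo with a fake fetch that times out twice before succeeding.
attempts = {'count': 0}


def fake_fetch():
    attempts['count'] += 1
    if attempts['count'] < 3:
        raise OSError('timed out')
    return 'data'


result = fetch_with_retry(fake_fetch, sleep=lambda seconds: None)
print(result, attempts['count'])
```

In the crawler, the callable could wrap a single download, for example a lambda around one ts.get_k_data call, so that one slow stock does not abort the whole run.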

> db.daily.find({"code":"000001"})

{ "_id" : ObjectId("5d20a455f9174eb451727881"), "code" : "000001", "date" : "2019-07-01", "index" : true, "close" : 3044.9, "high" : 3045.37, "low" : 3014.69, "open" : 3024.62, "volume" : 250840433 }

{ "_id" : ObjectId("5d20a455f9174eb451727882"), "code" : "000001", "date" : "2019-07-02", "index" : true, "close" : 3043.94, "high" : 3048.48, "low" : 3033.78, "open" : 3042.58, "volume" : 214520624 }

{ "_id" : ObjectId("5d20a455f9174eb451727883"), "code" : "000001", "date" : "2019-07-03", "index" : true, "close" : 3015.26, "high" : 3031.83, "low" : 3006.32, "open" : 3031.83, "volume" : 212296173 }

> db.daily.find({"date":"2019-07-01"})

{ "_id" : ObjectId("5d20a455f9174eb451727881"), "code" : "000001", "date" : "2019-07-01", "index" : true, "close" : 3044.9, "high" : 3045.37, "low" : 3014.69, "open" : 3024.62, "volume" : 250840433 }

{ "_id" : ObjectId("5d20ab5df9174eb451727894"), "code" : "000300", "date" : "2019-07-01", "index" : true, "close" : 3935.81, "high" : 3936.67, "low" : 3886.91, "open" : 3899.33, "volume" : 158370340 }

{ "_id" : ObjectId("5d20ab5df9174eb451727898"), "code" : "399001", "date" : "2019-07-01", "index" : true, "close" : 9530.46, "high" : 9530.46, "low" : 9339.77, "open" : 9384.79, "volume" : 337747139 }

{ "_id" : ObjectId("5d20ab5df9174eb45172789c"), "code" : "399005", "date" : "2019-07-01", "index" : true, "close" : 5908.05, "high" : 5909.84, "low" : 5803.07, "open" : 5828.43, "volume" : 33168558 }

{ "_id" : ObjectId("5d20ab5ef9174eb4517278a0"), "code" : "399006", "date" : "2019-07-01", "index" : true, "close" : 1568.16, "high" : 1568.4, "low" : 1534.56, "open" : 1545.06, "volume" : 20994431 }

{ "_id" : ObjectId("5d20ab5ff9174eb4517278a4"), "code" : "600150", "date" : "2019-07-01", "index" : false, "close" : 24, "high" : 24.98, "low" : 23.65, "open" : 24, "volume" : 369836 }

{ "_id" : ObjectId("5d20ab61f9174eb4517278ad"), "code" : "600153", "date" : "2019-07-01", "index" : false, "close" : 8.94, "high" : 9.08, "low" : 8.91, "open" : 8.98, "volume" : 134979 }
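The same kind of query can be issued from Python. The sketch below only builds the filter document; daily_filter is a hypothetical helper, not something from the project. It relies on the fact that dates are stored as 'YYYY-MM-DD' strings, so comparing them as strings with $gte and $lte gives the chronological range.

```python
def daily_filter(code=None, begin_date=None, end_date=None, index=None):
    """Build a MongoDB filter document for the daily collections.

    Every argument is optional; only the ones given end up in the filter.
    Dates are 'YYYY-MM-DD' strings, matching how the crawler stores them.
    """
    query = {}
    if code is not None:
        query['code'] = code
    if index is not None:
        query['index'] = index
    date_range = {}
    if begin_date is not None:
        date_range['$gte'] = begin_date
    if end_date is not None:
        date_range['$lte'] = end_date
    if date_range:
        query['date'] = date_range
    return query


example = daily_filter(code='000001', begin_date='2019-07-01',
                       end_date='2019-07-03', index=True)
print(example)
```

With the connection from database.py, the resulting dict could be passed to DB_CONN['daily'].find(example), which corresponds to the shell queries shown above.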

That's it for this installment!
