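Yesterday I saw someone post a Python crawler that scrapes Double Color Ball (双色球) winning numbers, so I went to look at the China Welfare Lottery website and found that no scraping is needed at all: its data query API is open. So I decided to combine it with "big data" and build a neural model to predict the winning numbers of the next draw. Note: this article is intended only for learning how to train and use an LSTM neural network model. It offers no guidance whatsoever on buying lottery tickets, so do not rely on it for purchases!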
First, install the required dependencies:
    pip3 install pandas
    pip3 install keras
    pip3 install numpy
    pip3 install requests
    pip3 install tensorflow-cpu
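The following code fetches the historical draw data through the lottery site's API and saves it as a CSV file for the neural model to use.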
    import requests
    import pandas as pd


    def getdata():
        # Public draw-result query endpoint of the China Welfare Lottery site
        url = 'http://www.cwl.gov.cn/cwl_admin/front/cwlkj/search/kjxx/findDrawNotice'
        params = {
            'name': 'ssq',        # lottery type: Double Color Ball
            'issueCount': '',
            'issueStart': '',
            'issueEnd': '',
            'dayStart': '',
            'dayEnd': '',
            'pageNo': '1',
            'pageSize': '1555',   # request 1555 records in a single page
            'week': '',
            'systemType': 'PC'
        }
        response = requests.get(url, params=params)
        jsondata = response.json()
        if jsondata['state'] == 0:
            data = []
            for item in jsondata['result']:
                # 'red' is a comma-separated string holding the six red balls
                red_balls = [int(n) for n in item['red'].split(',')]
                # Keep all six red balls; the prediction code expects a red6 column
                data.append([item['code'], *red_balls[:6], int(item['blue'])])
            # '期号' is the issue number of the draw
            df = pd.DataFrame(data, columns=['期号', 'red1', 'red2', 'red3', 'red4', 'red5', 'red6', 'blue'])
            df.to_csv('data.csv', index=False)


    getdata()
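After running the program above, a data.csv file is generated in the current directory. It holds the Double Color Ball draw data since 2013, 1555 records in total.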
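Before training on the file, it can be worth checking that every row looks like a valid draw. The sketch below is not part of the original workflow: `invalid_rows` is a hypothetical helper, and it assumes the CSV has the columns red1 to red6 and blue that the prediction code reads. It relies on the Double Color Ball rules of six distinct red balls from 1 to 33 and one blue ball from 1 to 16. The sample rows are made up for illustration and are not real draw data.

```python
import pandas as pd

RED_COLS = ['red1', 'red2', 'red3', 'red4', 'red5', 'red6']


def invalid_rows(df):
    """Return the rows that break the Double Color Ball rules (hypothetical helper)."""
    reds = df[RED_COLS]
    # Every red ball must lie in 1..33
    red_in_range = reds.ge(1).all(axis=1) & reds.le(33).all(axis=1)
    # The six red balls of one draw must all differ
    red_distinct = reds.nunique(axis=1) == len(RED_COLS)
    # The blue ball must lie in 1..16
    blue_in_range = df['blue'].between(1, 16)
    return df[~(red_in_range & red_distinct & blue_in_range)]


# Hand-made sample, not real draw data: the second row repeats a red ball
sample = pd.DataFrame(
    [[1, 5, 9, 14, 22, 33, 7],
     [2, 2, 9, 14, 22, 33, 16]],
    columns=RED_COLS + ['blue'],
)
bad = invalid_rows(sample)
print(bad)
```

On the real file, `invalid_rows(pd.read_csv('data.csv'))` should come back empty; anything else suggests a parsing problem worth fixing before training.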
Prediction code
    import numpy as np
    import pandas as pd
    from keras.models import Sequential
    from keras.layers import Dense, LSTM

    cols = ['red1', 'red2', 'red3', 'red4', 'red5', 'red6', 'blue']
    scale = np.array([33, 33, 33, 33, 33, 33, 16])

    # Load the data and sort by issue number ('期号') so the series runs oldest to
    # newest (a no-op if the file is already in that order)
    data = pd.read_csv('data.csv').sort_values('期号').reset_index(drop=True)

    # Preprocessing: scale the red balls (1-33) and the blue ball (1-16) into [0, 1)
    data[cols] = (data[cols] - 1) / scale

    # Split the data into training and test sets
    train_data = data.iloc[:1500, :]
    test_data = data.iloc[1500:, :]

    # Build sliding windows: `lookback` consecutive draws as input, the next draw as target
    def generate_data(data, lookback):
        X, Y = [], []
        for i in range(len(data) - lookback):
            X.append(data[i:i+lookback, :])
            Y.append(data[i+lookback, :])
        return np.array(X), np.array(Y)

    # LSTM model parameters
    lookback = 10
    batch_size = 32
    epochs = 100

    # Generate the training and test samples
    train_X, train_Y = generate_data(train_data[cols].values, lookback)
    test_X, test_Y = generate_data(test_data[cols].values, lookback)

    # Build the LSTM model
    model = Sequential()
    model.add(LSTM(64, input_shape=(lookback, 7)))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(7, activation='sigmoid'))
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Train the model
    model.fit(train_X, train_Y, batch_size=batch_size, epochs=epochs, validation_data=(test_X, test_Y))

    # Predict the next draw. `data` is already normalized, so the last `lookback`
    # rows are fed in directly instead of being scaled a second time
    last_data = data[cols].tail(lookback).values.reshape(1, lookback, 7)
    prediction = model.predict(last_data)
    prediction = np.round(prediction * scale + 1).astype(int)
    print("Predicted numbers for the next draw:", prediction)
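To make the windowing step easier to follow, here is a small sketch that runs the same `generate_data` function on a toy array and then checks the scaling round trip used by the prediction code. The toy series and the example draw are placeholder values chosen for illustration, not lottery data.

```python
import numpy as np


def generate_data(data, lookback):
    """Slice a (draws, features) array into (window, next-draw) pairs."""
    X, Y = [], []
    for i in range(len(data) - lookback):
        X.append(data[i:i + lookback, :])
        Y.append(data[i + lookback, :])
    return np.array(X), np.array(Y)


# Toy series: 6 "draws" with 2 features each (placeholder values)
toy = np.arange(12).reshape(6, 2)
X, Y = generate_data(toy, lookback=3)
print(X.shape, Y.shape)  # (3, 3, 2) (3, 2)
print(Y[0])              # [6 7], the row right after the first window

# Scaling round trip: (n - 1) / scale, then back to ball numbers
scale = np.array([33, 33, 33, 33, 33, 33, 16])
draw = np.array([1, 5, 9, 14, 22, 33, 16])  # made-up example draw
normalized = (draw - 1) / scale
restored = np.round(normalized * scale + 1).astype(int)
print(restored)
```

Each sample therefore has shape (lookback, features), which is why the model declares `input_shape=(lookback, 7)`. Note that nothing in this setup forces the six predicted red balls to be distinct or sorted, so the rounded output is not guaranteed to be a valid ticket.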
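China Welfare Lottery draw-result API
The name parameter specifies the lottery type:
- ssq: Double Color Ball (双色球)
- kl8: Happy 8 (快乐8)
- 3d: 3D
- qlc: Seven Happy Lottery (七乐彩)
The API is fairly simple: you can query by date or by issue number, and you can also set how many records each request returns. I recommend fetching everything in a single request; otherwise you have to go to the trouble of constructing several paged requests.
Below is an example query for Double Color Ball that requests the first page with 1555 records per page: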
http://www.cwl.gov.cn/cwl_admin/front/cwlkj/search/kjxx/findDrawNotice?name=ssq&issueCount=&issueStart=&issueEnd=&dayStart=&dayEnd=&pageNo=1&pageSize=1555&week=&systemType=PC
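The same URL can be assembled in code, which makes it easier to switch the lottery type or page size. This is a minimal sketch: `build_query_url` is a hypothetical helper, its default page size of 30 is an arbitrary choice, and the only parameter names it knows are the ones visible in the example URL. The value formats the server expects for the date and issue filters are not documented in this post, so they are passed through unchanged.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.cwl.gov.cn/cwl_admin/front/cwlkj/search/kjxx/findDrawNotice'


def build_query_url(name, page_no=1, page_size=30, **filters):
    """Build a query URL (hypothetical helper); unset filters stay empty."""
    params = {
        'name': name,
        'issueCount': '', 'issueStart': '', 'issueEnd': '',
        'dayStart': '', 'dayEnd': '',
        'pageNo': str(page_no),
        'pageSize': str(page_size),
        'week': '',
        'systemType': 'PC',
    }
    # Reject parameter names that do not appear in the example URL
    unknown = set(filters) - set(params)
    if unknown:
        raise ValueError(f'unknown parameters: {sorted(unknown)}')
    params.update({k: str(v) for k, v in filters.items()})
    return BASE_URL + '?' + urlencode(params)


ssq_url = build_query_url('ssq', page_size=1555)  # same URL as the example above
kl8_url = build_query_url('kl8', page_size=100)   # Happy 8, 100 records per page
print(ssq_url)
print(kl8_url)
```

The resulting string can be handed to `requests.get` directly, or the parameter dictionary can be passed through `params=` as in the download script.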