你必須很努力

Day23 - 將臺灣證券交易所的每日收盤行情存入 DB

2021/10/06
字數統計: 611閱讀時間: 3 min

前言

前面已經知道如何抓「臺灣證券交易所」的每日收盤行情 CSV 檔,接下來要處理資料,並存入 DB

說明

在處理過程中,需要考量,可能會有新上市的公司,若有的話,要建立 Stock,至於已經下市的公司,則應該要軟刪除 (這部分還沒做,可自行研究)

同一天的資料,若執行多次,應該要判斷是否已在 DB 建立過,不應該重複建立

實作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
# app/features/twse/allbut_0999/save_to_db.rb

module Twse::Allbut0999
class SaveToDb

include Twse::Helpers

def execute
start_time = Time.current
puts "#{self.class}, start_time: #{start_time.to_s}"

end_transaction_date = find_latest_transaction_date
return puts "#{self.class}, 已經是最新的資料" if end_transaction_date == start_time

is_linux = `uname -a`[/Linux/].present?
file_paths = Dir["data/twse/ALLBUT0999/*/*/*"]
file_paths.each do |file_path|
rows = decode_data(file_path, is_linux)
next if not_process?(rows, end_transaction_date)

row_index = find_rows_index(rows)
all_rows = filter_rows(rows, row_index)
filtered_stocks = filter_by_stocks(all_rows)
Stock.import(filtered_stocks) if filtered_stocks.present?

import_daily_quotes(all_rows)
end
puts "#{self.class}, done_time:#{Time.current}, #{(Time.current - start_time).to_s} sec"
rescue StandardError => e
puts "errors: #{e.inspect}, #{e.backtrace}"
end

private

def find_latest_transaction_date
DailyQuote.latest_transaction_date
end

def not_process?(rows, end_transaction_date)
year, month, day = rows[0].scan(/\d+/)
year = "20" + (year.to_i + 11).to_s[1..2]
@file_date = year + month + day
end_transaction_date.present? && @file_date.to_date <= end_transaction_date
end

def find_rows_index(rows)
row_index = nil
rows.each_with_index { |row, index| row_index = index if row.include?("證券代號") }
row_index
end

def filter_rows(rows, row_index)
all_rows = []
rows[(row_index + 1)..-1].each do |row_string|
row_item = []
row_string.split(',"').each { |row| row_item << row.gsub(/[=|,|"]/, '') }
all_rows << row_item
end
all_rows
end

def filter_by_stocks(all_rows)
stock_infos = []
all_rows.each { |rows| stock_infos << { code: rows[0], name: rows[1] } }
stocks = Stock.all.select(:code).index_by(&:code)

need_create_stocks = []
stock_infos.each do |stock_info|
stock = stocks[stock_info[:code]]
next if stock

need_create_stocks << Stock.new(code: stock_info[:code], name: stock_info[:name])
end
need_create_stocks
end

def import_daily_quotes(all_rows)
stocks = Stock.all.select(:code).index_by(&:code)

need_create_daily_quotes = []
all_rows.each do |row|
stock = stocks[row[0]]

need_create_daily_quotes << stock.daily_quotes.new(
transaction_date: @file_date.to_date,
trade_volume: row[2],
number_of_transactions: row[3],
trade_price: row[4],
opening_price: row[5],
highest_price: row[6],
lowest_price: row[7],
closing_price: row[8],
ups_and_downs: row[9],
price_difference: row[10],
last_best_bid_price: row[11],
last_best_bid_volume: row[12],
last_best_ask_price: row[13],
last_best_ask_volume: row[14],
price_earning_ratio: row[15],
)
end
DailyQuote.import(need_create_daily_quotes) if need_create_daily_quotes.present?
end

end
end

檢查是否有存入 DB

小結

現在看別人寫好的 code ,應該會覺得蠻簡單的,當下自己在做的時候,並沒有範例的 code 可以參考,邊摸索邊做出來的,過程的酸甜苦辣只有自己知道,唯有自己做過,才會知道~


鐵人賽文章連結:https://ithelp.ithome.com.tw/articles/10272899
medium 文章連結:https://link.medium.com/HmmocOLuTjb
本文同步發布於 小菜的 Blog https://riverye.com/

備註:之後文章修改更新,以個人部落格為主

原文連結:https://riverye.com/2021/10/06/Day23-將臺灣證券交易所的每日收盤行情存入-DB/

發表日期:2021-10-06

更新日期:2022-12-21

CATALOG
  1. 1. 前言
  2. 2. 說明
  3. 3. 實作
  4. 4. 檢查是否有存入 DB
  5. 5. 小結