How to Make Stock Price Predictions Using Reinforcement Learning? (2024)

Kavika Roy

10 min read

Jul 3, 2023

a. Moving Averages:

The moving average of a stock is calculated to smooth the price data and constantly update the average price. In finance, the MA (moving average) is considered a stock indicator and is used in technical analysis. The short-term price fluctuations are mitigated in this process. The MA is further divided into the following:

i. Simple Moving Average (SMA)

SMA is calculated using the arithmetic mean for a given set of values over a specific period. Here, the set of values is the stock prices. These are then added and divided by the number of prices in the set.

Formula: SMA = (A1 + A2 + A3 + … + An) / n

Here, A = the price in each period; n = number of periods.
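The SMA calculation can be sketched in a few lines of plain Python. The function name `simple_moving_average` and the sample prices below are illustrative assumptions, not part of the original analysis.

```python
def simple_moving_average(prices, window):
    """Return the SMA for every full window of `window` consecutive prices."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


# Made-up closing prices, for illustration only.
closing_prices = [10.0, 11.0, 12.0, 13.0, 14.0]
sma_3 = simple_moving_average(closing_prices, window=3)
print(sma_3)  # [11.0, 12.0, 13.0]
```

Each output value is the arithmetic mean of the three most recent prices, so the list is shorter than the input by `window - 1` entries.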

b. ARIMA:

ARIMA (AutoRegressive Integrated Moving Average) is another approach to time series forecasting. ARIMA and exponential smoothing are the two most widely used methods, and they offer complementary approaches to the problem. ARIMA models describe the autocorrelations in the data, while exponential smoothing models are based on a description of the trend and seasonality in the data.
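To make the idea of modelling autocorrelation concrete, here is a deliberately simplified sketch in the spirit of an ARIMA(1,1,0) model: difference the series once, fit a single autoregressive coefficient by least squares, and forecast one step ahead. The function names and prices are assumptions for illustration. A real analysis would normally use a dedicated library such as statsmodels rather than this hand-rolled estimate.

```python
def fit_ar1_on_differences(series):
    """Estimate phi in d[t] = phi * d[t-1] + noise, where d is the first difference."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    numerator = sum(x * y for x, y in zip(diffs, diffs[1:]))
    denominator = sum(x * x for x in diffs[:-1])
    phi = numerator / denominator if denominator else 0.0
    return phi, diffs


def forecast_next(series):
    """One-step forecast: last value plus the predicted next difference."""
    phi, diffs = fit_ar1_on_differences(series)
    return series[-1] + phi * diffs[-1]


# Made-up prices whose day-to-day change halves each step.
prices = [100.0, 102.0, 103.0, 103.5, 103.75]
next_price = forecast_next(prices)
print(next_price)  # 103.875
```

Because each price change in the sample is half the previous one, the fitted coefficient is 0.5 and the forecast adds half of the last change.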

c. Box Jenkins Model:

This model can analyze different types of time series data for forecasting. It is a mathematical model that uses inputs from specified time series to forecast data ranges. The Box Jenkins model determines the outcomes based on the differences between data points. It identifies trends for forecasting stock prices using autoregression, moving averages, and seasonal differences.

d. Rescaled Range Analysis:

It is a statistical technique developed to assess the magnitude and nature of data variability over a certain period. The rescaled range analysis method is used to identify and evaluate persistence, randomness, and mean reversion based on the time series data from the stock markets. This insight is used to make proper investment strategies.
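A minimal sketch of the rescaled range (R/S) statistic for a single window follows. It divides the range of the cumulative deviations from the mean by the standard deviation. The function name and the two sample series are assumptions. Estimating the Hurst exponent would require repeating this over several window sizes, which is not shown here.

```python
import math


def rescaled_range(values):
    """Return R/S: range of cumulative mean-adjusted sums over the std deviation."""
    n = len(values)
    mean = sum(values) / n
    cumulative, running = [], 0.0
    for value in values:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std_dev = math.sqrt(sum((value - mean) ** 2 for value in values) / n)
    return spread / std_dev if std_dev else 0.0


# Made-up return series with the same values in a different order.
alternating = [1.0, -1.0, 1.0, -1.0]  # moves reverse immediately
persistent = [1.0, 1.0, -1.0, -1.0]   # moves continue before reversing
print(rescaled_range(alternating))  # 1.0
print(rescaled_range(persistent))   # 2.0
```

The persistent series produces the larger R/S value even though both series have the same mean and standard deviation, which is the property the technique relies on.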

Mode:

It is the most common value in the dataset, that is, the value that occurs most often.

Median:

It is the middle number in the dataset. For example, in 4, 6, 7, 9, and 11, the median is 7.

Arithmetic Mean:

It is the average value of the dataset.

Normal Distribution:

It is also called the Gaussian distribution or bell curve. It is charted along the horizontal axis, which represents the total value spectrum in the dataset. Half of the values in the dataset will be higher than the mean, while the other half will be lower than the mean. Most data points will cluster around the mean, with a few lying at the extreme (tail) ends on both sides.

Skewness:

It measures the asymmetry of the price/data point distribution. The skewness is zero in a normal distribution. Negative skewness stretches the tail of the bell curve to the left, while positive skewness stretches the tail to the right.
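The measures above can be computed with Python's standard `statistics` module. The skewness helper below uses the population (moment-based) definition, which is one of several conventions and is an assumption here. The datasets are illustrative.

```python
import statistics

prices = [4, 6, 7, 9, 11]  # the example dataset used for the median above

median_price = statistics.median(prices)      # middle value
mean_price = statistics.mean(prices)          # arithmetic mean
mode_price = statistics.mode([4, 6, 6, 9, 11])  # most frequent value


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / second_moment ** 1.5 if second_moment else 0.0


print(median_price, mean_price, mode_price)  # 7 7.4 6
print(skewness([1, 2, 3, 4, 5]))             # 0.0 for a symmetric dataset
print(skewness([1, 1, 1, 10]) > 0)           # True: long tail on the right
```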

Reinforcement learning (RL) is an area of machine learning in which an agent learns to take the appropriate action to maximize its reward in a given situation. Many software applications and machines use RL to identify the best behavior or path to arrive at the desired result for a specific situation.

Reinforcement learning is different from supervised learning. In the latter, the training data comes with an answer key, so the model is trained with the correct answer for each example. In RL, however, there is no such answer key: the reinforcement agent decides which action to perform and receives a reward or penalty in return. This allows machine learning developers to train the algorithm without a labeled dataset. The algorithm learns from experience and improves itself over time.
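The learn-from-experience loop can be sketched with tabular Q-learning, a simpler relative of the neural-network agent used later in this article. The state names, the learning rate `alpha`, and the helper names are assumptions for illustration. The discount factor of 0.95 matches the value used by the agent below.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


# Hypothetical two-state table with two actions per state.
q_table = {"s0": [0.0, 0.0], "s1": [0.0, 1.0]}
q_update(q_table, "s0", action=1, reward=1.0, next_state="s1")
print(q_table["s0"])  # second entry is now about 0.195
print(choose_action([0.1, 0.9, 0.3], epsilon=0.0))  # 1 (pure exploitation)
```

No labeled answer appears anywhere: the only training signal is the reward, and the estimate for an action improves each time the agent experiences its consequences.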

Fortunately, there are a few datasets available to train the algorithms. Developers can access the datasets from the following:

The data is available from 1st January 2000 to 30th April 2021. It provides the price history and trading volumes of the fifty stocks in the NIFTY 50 index from the National Stock Exchange (NSE) in India. The datasets have day-level pricing and trading values for each stock. They are provided as .csv files, with a metadata file for additional information.

Yahoo Finance has a vast collection of the latest information on trends, prices, and patterns in the global stock market.

NSE Data & Analytics Limited provides data and info-vending services. It was previously known as DotEx International Limited, a separate entity dedicated to providing datasets and analytics. Its datasets cover the Capital Market Segment data and market quotes, the Currency Derivative Market Segment (CDS), the Futures and Options Segment (F&O), Corporate Data, Corporate Bond Market data, the Securities Lending & Borrowing Market (SLBM), and the Wholesale Debt Market Segment (WDM).

Python is a popular programming language. It is used to develop models in different niches and has become a go-to choice for creating algorithms for stock trading. It is open source and easy to use when building complex statistical models. Python allows developers to build their own data connectors, execution mechanisms, backtesting, and optimization and testing modules.

# In a notebook, install yfinance first:
# !pip install yfinance --upgrade --no-cache-dir

import random
from collections import deque

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pandas_datareader import data as pdr
import yfinance as yf  # fix_yahoo_finance was renamed to yfinance
import tensorflow.compat.v1 as tf

sns.set()

# The agent below uses TF1-style graphs and sessions.
tf.disable_eager_execution()

In this analysis, we used publicly available data from Yahoo Finance:

# Optional: download the data instead of reading a saved CSV.
# yf.pdr_override()
# df_full = pdr.get_data_yahoo("FB", start="2018-01-01", end="2019-01-01").reset_index()
# df_full.to_csv('FB.csv', index=False)

df_full = pd.read_csv('/content/INFY.NS.csv')
df_full.head()

df = df_full.copy()

name = 'Q-learning agent'

class Agent:
    def __init__(self, state_size, window_size, trend, skip, batch_size):
        self.state_size = state_size
        self.window_size = window_size
        self.half_window = window_size // 2
        self.trend = trend
        self.skip = skip
        self.action_size = 3  # 0 = do nothing, 1 = buy, 2 = sell
        self.batch_size = batch_size
        self.memory = deque(maxlen = 1000)  # replay buffer
        self.inventory = []
        self.gamma = 0.95  # discount factor
        self.epsilon = 0.5  # initial exploration rate
        self.epsilon_min = 0.01
        self.epsilon_decay = 0.999

        tf.reset_default_graph()
        self.sess = tf.InteractiveSession()
        # A single hidden layer maps a state to one Q-value per action.
        self.X = tf.placeholder(tf.float32, [None, self.state_size])
        self.Y = tf.placeholder(tf.float32, [None, self.action_size])
        feed = tf.layers.dense(self.X, 256, activation = tf.nn.relu)
        self.logits = tf.layers.dense(feed, self.action_size)
        self.cost = tf.reduce_mean(tf.square(self.Y - self.logits))
        self.optimizer = tf.train.GradientDescentOptimizer(1e-5).minimize(
            self.cost
        )
        self.sess.run(tf.global_variables_initializer())

    def act(self, state):
        # Epsilon-greedy: explore at random, otherwise take the best-valued action.
        if random.random() <= self.epsilon:
            return random.randrange(self.action_size)
        return np.argmax(
            self.sess.run(self.logits, feed_dict = {self.X: state})[0]
        )

    def get_state(self, t):
        # The state is the last `window_size` day-to-day price changes up to day t.
        window_size = self.window_size + 1
        d = t - window_size + 1
        # Pad with the first price when there is not enough history yet.
        block = self.trend[d : t + 1] if d >= 0 else -d * [self.trend[0]] + self.trend[0 : t + 1]
        res = []
        for i in range(window_size - 1):
            res.append(block[i + 1] - block[i])
        return np.array([res])

    def replay(self, batch_size):
        # Train on the most recent `batch_size` transitions in memory.
        mini_batch = []
        l = len(self.memory)
        for i in range(l - batch_size, l):
            mini_batch.append(self.memory[i])
        replay_size = len(mini_batch)
        X = np.empty((replay_size, self.state_size))
        Y = np.empty((replay_size, self.action_size))
        states = np.array([a[0][0] for a in mini_batch])
        new_states = np.array([a[3][0] for a in mini_batch])
        Q = self.sess.run(self.logits, feed_dict = {self.X: states})
        Q_new = self.sess.run(self.logits, feed_dict = {self.X: new_states})
        for i in range(len(mini_batch)):
            state, action, reward, next_state, done = mini_batch[i]
            # Q-learning target: reward plus discounted best future value.
            target = Q[i]
            target[action] = reward
            if not done:
                target[action] += self.gamma * np.amax(Q_new[i])
            X[i] = state
            Y[i] = target
        cost, _ = self.sess.run(
            [self.cost, self.optimizer], feed_dict = {self.X: X, self.Y: Y}
        )
        # Gradually reduce exploration.
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay
        return cost

    def buy(self, initial_money):
        # Run the agent over the price series and record its buy/sell days.
        starting_money = initial_money
        states_sell = []
        states_buy = []
        inventory = []
        state = self.get_state(0)
        for t in range(0, len(self.trend) - 1, self.skip):
            action = self.act(state)
            next_state = self.get_state(t + 1)

            if action == 1 and initial_money >= self.trend[t] and t < (len(self.trend) - self.half_window):
                inventory.append(self.trend[t])
                initial_money -= self.trend[t]
                states_buy.append(t)
                print('day %d: buy 1 unit at price %f, total balance %f' % (t, self.trend[t], initial_money))

            elif action == 2 and len(inventory):
                bought_price = inventory.pop(0)
                initial_money += self.trend[t]
                states_sell.append(t)
                try:
                    # Use self.trend rather than the global `close` list.
                    invest = ((self.trend[t] - bought_price) / bought_price) * 100
                except ZeroDivisionError:
                    invest = 0
                print(
                    'day %d, sell 1 unit at price %f, investment %f %%, total balance %f,'
                    % (t, self.trend[t], invest, initial_money)
                )

            state = next_state

        invest = ((initial_money - starting_money) / starting_money) * 100
        total_gains = initial_money - starting_money
        return states_buy, states_sell, total_gains, invest

    def train(self, iterations, checkpoint, initial_money):
        for i in range(iterations):
            total_profit = 0
            inventory = []
            state = self.get_state(0)
            starting_money = initial_money
            for t in range(0, len(self.trend) - 1, self.skip):
                action = self.act(state)
                next_state = self.get_state(t + 1)

                if action == 1 and starting_money >= self.trend[t] and t < (len(self.trend) - self.half_window):
                    inventory.append(self.trend[t])
                    starting_money -= self.trend[t]

                elif action == 2 and len(inventory) > 0:
                    bought_price = inventory.pop(0)
                    total_profit += self.trend[t] - bought_price
                    starting_money += self.trend[t]

                # Reward is the relative change in cash since the start of the epoch.
                invest = ((starting_money - initial_money) / initial_money)
                self.memory.append((state, action, invest,
                                    next_state, starting_money < initial_money))
                state = next_state
                batch_size = min(self.batch_size, len(self.memory))
                cost = self.replay(batch_size)
            if (i+1) % checkpoint == 0:
                print('epoch: %d, total rewards: %.3f, cost: %f, total money: %f' % (i + 1, total_profit, cost,
                                                                                      starting_money))

close = df.Close.values.tolist()
initial_money = 10000
window_size = 30
skip = 1
batch_size = 32

agent = Agent(state_size = window_size,
              window_size = window_size,
              trend = close,
              skip = skip,
              batch_size = batch_size)
agent.train(iterations = 200, checkpoint = 10, initial_money = initial_money)


# Run the trained agent to collect the buy/sell signals used in the plot below.
states_buy, states_sell, total_gains, invest = agent.buy(initial_money = initial_money)


fig = plt.figure(figsize = (15,5))
plt.plot(close, color='r', lw=2.)
plt.plot(close, '^', markersize=10, color='m', label = 'buying signal', markevery = states_buy)
plt.plot(close, 'v', markersize=10, color='k', label = 'selling signal', markevery = states_sell)
plt.title('total gains %f, total investment %f%%'%(total_gains, invest))
plt.legend()
plt.savefig(name+'.png')
plt.show()
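The state that the agent's get_state method feeds to the network is a window of day-to-day price changes, padded with the first price when there is not yet enough history. That logic can be sketched on its own in plain Python. The function name `price_change_window` and the sample prices are assumptions for illustration.

```python
def price_change_window(prices, t, window_size):
    """Return the last `window_size` one-step price changes ending at index t."""
    start = t - window_size
    if start >= 0:
        block = prices[start : t + 1]
    else:
        # Not enough history: repeat the first price to fill the window.
        block = [prices[0]] * (-start) + prices[0 : t + 1]
    return [block[i + 1] - block[i] for i in range(window_size)]


# Made-up prices, for illustration only.
prices = [10, 12, 11, 15]
print(price_change_window(prices, t=3, window_size=2))  # [-1, 4]
print(price_change_window(prices, t=1, window_size=3))  # [0, 0, 2]
```

Feeding changes rather than raw prices means the agent sees the same pattern whether the stock trades at 10 or at 1,000, and the padding shows up as leading zeros early in the series.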


You can always hire artificial intelligence solution providers to develop stock price prediction models using reinforcement learning. The models will learn and improve themselves as you continue to feed data and use them for predictions.

The day is not far off when AI engineers will develop an accurate stock predictor to forecast upcoming changes in prices and help investors make better decisions. This would also help with risk management and with minimizing the after-effects of a market crash in case of global calamities.

Stock Market: A stock market contains several exchanges like the NASDAQ, the New York Stock Exchange, etc. Stocks are listed on the exchange and available for buyers and sellers to make a transaction.

The exchange is a place that brings buyers and sellers together to share, trade, sell, and buy stocks. It also monitors the demand and supply of each stock and the subsequent changes in the price of the stock. A stock is a share released by a company available to the public for purchase/ selling.

In simple terms, the stock market is a collection of exchanges that facilitate the buying and selling of shares listed by companies. The transactions happen on formal exchanges or over-the-counter (OTC) marketplaces that operate under a defined set of regulations. The term stock market is often used interchangeably with stock exchange. Traders can buy or sell stock on one or many exchanges.

Broker: A person who acts as a third party between the buyer and seller. Many traders rely on stock brokers to complete the transactions on the exchanges.

Bid: It is the amount you intend to/ are willing to pay for a stock listed on the exchange.

Buy: It implies the act of buying shares of a company or taking a position in it.

Sell: Handing over your stock in exchange for money. It also refers to cutting losses by getting rid of stock with decreasing prices.

Bull: It is a stock market trend/ condition where prices are rising or are expected to rise.

Bear: It is the opposite of bull and a condition of the stock market where the investors expect the prices to fall.

Portfolio: A collection of financial investments a person owns.

Exchange: A marketplace where different stocks and investments are traded.

Going Long: It refers to placing a bet that the stock price will increase. This allows you to buy for less (low) and sell for a higher price.

Volatility: It measures how quickly and by how much stock prices move up or down. Stock markets have high volatility as the prices change fast.
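One common way to put a number on volatility is the standard deviation of period-to-period returns. The sketch below uses that definition. The function names and price series are assumptions for illustration.

```python
import statistics


def simple_returns(prices):
    """Fractional change from each price to the next."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def volatility(prices):
    """Sample standard deviation of the simple returns."""
    return statistics.stdev(simple_returns(prices))


# Made-up price series, for illustration only.
calm = [100.0, 101.0, 100.0, 101.0]
wild = [100.0, 120.0, 90.0, 130.0]
print(volatility(wild) > volatility(calm))  # True
```

A series that grows by the same percentage every period has zero volatility under this measure, no matter how fast it grows.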

Originally Published on DataToBiz


FAQs

Is reinforcement learning good for stock trading?

Deep Reinforcement Learning (DRL) algorithms have been increasingly used to construct stock trading strategies, but they often face performance challenges when applied to financial data with low signal-to-noise ratios and unevenness, as these methods were originally designed for the gaming community.

Can we use reinforcement learning for prediction?

Due to the frequency and duration of stock price fluctuations, reinforcement learning (RL) is a more suitable predictive tool for statistical analysis of data than supervised learning.

What is the best algorithm for predicting stock prices?

The LSTM algorithm has the ability to store historical information and is widely used in stock price prediction (Heaton et al. 2016). For stock price prediction, LSTM network performance has been greatly appreciated when combined with NLP, which uses news text data as input to predict price trends.

How to calculate stock price prediction?

This method of predicting the future price of a stock is based on a basic formula: P/E x EPS = Price. According to this formula, if we can accurately predict a stock's future P/E and EPS, we will know its future price.
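As a worked example of the formula, the snippet below multiplies a projected P/E ratio by projected earnings per share. The function name and the figures are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def implied_price(pe_ratio, earnings_per_share):
    """Price implied by P/E x EPS = Price."""
    return pe_ratio * earnings_per_share


# Hypothetical projections: a P/E of 20 and an EPS of 5.
projected_price = implied_price(pe_ratio=20.0, earnings_per_share=5.0)
print(projected_price)  # 100.0
```

The result is only as good as the two projections that go into it, which is the hard part of this approach.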

What is the best source to learn stock market?

The Best Online Stock Trading Classes of 2024

Best Overall: Investors Underground.
Best for Beginners: Udemy.
Best Value: Bullish Bears.
Best for Technical Analysis: Skillshare.
Best for Day Traders: Bear Bull Traders.
Best for Live Training: Bulls on Wall Street.
Best for University Education: Coursera.

Is learning technical analysis enough for trading?

Final Takeaway on technical analysis

Technical analysis is more an assortment of tools and theories that can be used by analysts and traders. There are traders who have been successful through it, and there are those who didn't.

How to build an AI stock predictor?

AI Stock Prediction Software Development: The Process

Step 1: Define Objectives and Scope. ...
Step 2: Data Collection and Preparation. ...
Step 3: Select a Machine Learning Model. ...
Step 4: Data Splitting and Model Training. ...
Step 5: Integration with Web Framework. ...
Step 6: User Interface Design. ...
Step 7: Connectivity to External APIs.

How do you predict stock prices using deep learning techniques?

To predict stock prices using deep learning, an appropriate model architecture is constructed. This typically involves stacking multiple layers of LSTM cells to create a deep LSTM network. The number of layers and LSTM cells per layer are hyperparameters that need to be carefully tuned to achieve optimal performance.

How to predict stock market for next day?

After-hours trading commonly helps indicate the next day's open. Extended-hours trading in stocks takes place on electronic markets known as ECNs before the financial markets open for the day, as well as after they close. This activity can help investors predict the open market direction.

What is the most accurate stock predictor?

1. AltIndex – Overall Most Accurate Stock Predictor with Claimed 72% Win Rate. From our research, AltIndex is the most accurate stock predictor to consider today. Unlike other predictor services, AltIndex doesn't rely on manual research or analysis.

Which model is best for stock price prediction?

LSTM (Long Short-Term Memory) is one of the most powerful algorithms for time series. It can capture historical trend patterns and use them to predict future values.

What is the best indicator to predict stocks?

Seven of the best indicators for day trading are:

On-balance volume (OBV)
Accumulation/distribution (A/D) line.
Average directional index.
Aroon oscillator.
Moving average convergence divergence (MACD)
Relative strength index (RSI)
Stochastic oscillator.

Can you mathematically predict the stock market?

Although we can use several metrics and technical analysis techniques, there is not a surefire way of predicting the behavior of a stock with an exact measure. In this sense, there is always an element of randomness that occurs in stock behavior.

What is the AI tool for share market prediction?

TrendSpider is an AI tool for stock trading and price prediction that uses an AI engine to analyze charts and technical signals. It then generates automated indicators and ideas tailored to your approach. Features: Examines custom indicators for buy/sell signals.

What are the techniques used in stock market prediction?

Alongside the patterns, techniques are used such as the exponential moving average (EMA), oscillators, support and resistance levels or momentum and volume indicators. Candle stick patterns, believed to have been first developed by Japanese rice merchants, are nowadays widely used by technical analysts.
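The exponential moving average mentioned above can be sketched as follows. The smoothing factor 2 / (span + 1) is the common convention. Seeding the average with the first price, the function name, and the sample prices are assumptions for illustration.

```python
def exponential_moving_average(prices, span):
    """EMA where each new price gets weight alpha = 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    ema = [prices[0]]  # seed with the first price (an assumption)
    for price in prices[1:]:
        ema.append(alpha * price + (1 - alpha) * ema[-1])
    return ema


# Made-up prices, for illustration only.
ema_3 = exponential_moving_average([10.0, 12.0, 14.0], span=3)
print(ema_3)  # [10.0, 11.0, 12.5]
```

Unlike the simple moving average, the EMA never fully drops old prices. It weights recent ones more heavily, so it reacts faster to new moves.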

Which education is best for stock market?

The following degrees are examples of the fields you can pursue to develop and advance in a stock trading career:

Business administration and finance. ...
Investment management. ...
Economics. ...
Statistics. ...
Computer science. ...
Data analytics. ...
Applied mathematics.

Which study is best for trading?

Professional Day Trader Degrees

Finance/ Business Administration.
Economics.
Computer Science and Statistics.
Applied Mathematics, Engineering, Physics.

Which is the best course for learning stock market?

Top 5 Share Market Courses For Beginners In India [2024]

“Trade Like A Pro” by Trendy Traders Academy. Trendy Traders gives its students an outstanding new way to learn the stock market. ...
Rachana Ranade Academy. ...
Nifty Trading Academy. ...
BSE Academy. ...
Online Courses by NSE Academy.
