Algorithmic Trading with R (2024)

Nana Boateng

February 25, 2018

Loading Packages
Setting Up Your Workspace
A Simple Trading Strategy: Trend Following

Loading Packages

In this post, I will show how to use R to collect the stocks listed on loyal3, get historical data from Yahoo and then perform a simple algorithmic trading strategy. Along the way, you will learn some web scraping, a function hitting a finance API and an htmlwidget to make an interactive time series chart.

Setting Up Your Workspace

The TTR package is used to construct “Technical Trading Rules”. The TTR package can perform more sophisticated calculations and is worth exploring. The dygraphs library is a wrapper for a fast, open source JavaScript charting library. It is one of the htmlwidgets that makes R charting more dynamic and part of an html file instead of a static image. Lastly, the lubridate package is used for easy date manipulation

library(rvest)library(pbapply)library(TTR)library(dygraphs)library(lubridate)library(tidyquant)library(timetk)pacman::p_load(dygraphs,DT)

Data Collection

Before you can look up individual daily stock prices to build your trading algorithm, you need to collect all available stocker tickers.The list of stock tickers or symbols can be scrapped from here. The first thing to do is declare stock.list as a URL string. Next use read_html() so your R session will create an Internet session and collect all the html information on the page as an XML node set. The page CSS has an ID called “.company-name”. Use this as a parameter when calling html_nodes() to select only the XML data associated to this node. Lastly, use html_text() so the actual text values for the company names is collected.

website <- read_html("https://www.marketwatch.com/tools/industry/stocklist.asp?bcind_ind=9535&bcind_period=3mo")table <- html_table(html_nodes(website, "table")[[4]], fill = TRUE)stocks.symbols<-table$X2stocks.names<-table$X3

table1<-table[-1,-1]colnames(table1)<-table[1,-1]DT::datatable(table1)

Fig. 30

ParallelMap package

We take advantage of paralleMap package to reduce the time the stock data is read into r.

library(parallelMap)parallelStartSocket(2) parallelStartMulticore(cpus=6)# start in socket mode and create 2 processes on localhostf = function(x) Cl(get(x)) # define our joby = parallelMap(f, tickers) # like R's Map but in parallelmapdata<-do.call(cbind,y)parallelStop()mapdata%>%head()

## AAPL.Close MSFT.Close GOOGL.Close IBM.Close FB.Close## 2007-01-03 11.97143 29.86 234.0290 97.27 NA## 2007-01-04 12.23714 29.81 241.8719 98.31 NA## 2007-01-05 12.15000 29.64 243.8388 97.42 NA## 2007-01-08 12.21000 29.93 242.0320 98.90 NA## 2007-01-09 13.22429 29.96 242.9930 100.07 NA## 2007-01-10 13.85714 29.66 244.9750 98.89 NA

Bioconductor

library(BiocParallel)f = function(x) Ad(get(x))options(MulticoreParam=quote(MulticoreParam(workers=4)))param <- SnowParam(workers = 2, type = "SOCK")vec=c(tickers[1],tickers[2],tickers[3],tickers[4])#vec=c(paste0(quote(tickers),"[",1:length(tickers),"]",collapse=","))multicoreParam <- MulticoreParam(workers = 7)bio=bplapply(tickers, f, BPPARAM = multicoreParam)biodata<-do.call(cbind, bio)biodata%>%head()

## AAPL.Adjusted MSFT.Adjusted GOOGL.Adjusted IBM.Adjusted## 2007-01-03 8.104137 22.85825 234.0290 74.69364## 2007-01-04 8.284014 22.81997 241.8719 75.49223## 2007-01-05 8.225021 22.68984 243.8388 74.80882## 2007-01-08 8.265641 22.91184 242.0320 75.94528## 2007-01-09 8.952269 22.93480 242.9930 76.84376## 2007-01-10 9.380682 22.70515 244.9750 75.93764## FB.Adjusted## 2007-01-03 NA## 2007-01-04 NA## 2007-01-05 NA## 2007-01-08 NA## 2007-01-09 NA## 2007-01-10 NA

AdjustedPrices<-biodatadateWindow <- c("2016-01-01", "2017-09-01")dygraph(AdjustedPrices, main = "Value", group = "stock") %>% dyRebase(value = 100) %>% dyRangeSelector(dateWindow = dateWindow)

Fig. 30

end<-Sys.Date()start<-Sys.Date()-years(3)prices <- tq_get(symbols , get = "stock.prices", from = start,to=end)prices%>%head()

## # A tibble: 6 x 8## symbol date open high low close volume adjusted## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>## 1 SPY 2015-02-25 212 212 211 212 73061700 199## 2 SPY 2015-02-26 212 212 211 211 72697900 199## 3 SPY 2015-02-27 211 212 211 211 108076000 198## 4 SPY 2015-03-02 211 212 211 212 87491400 199## 5 SPY 2015-03-03 211 212 210 211 110325800 199## 6 SPY 2015-03-04 210 210 209 210 105665800 198

pacman::p_load(dygraph)library(parallel)# Calculate the number of coresno_cores <- detectCores() - 1# Initiate clustercl <- makeCluster(no_cores)f<-function(x) Ad(get(x))AdjustedPrices <- do.call(merge, bplapply(vec, f, BPPARAM = multicoreParam))AdjustedPrices%>%head()

## AAPL.Adjusted MSFT.Adjusted GOOGL.Adjusted IBM.Adjusted## 2007-01-03 8.104137 22.85825 234.0290 74.69364## 2007-01-04 8.284014 22.81997 241.8719 75.49223## 2007-01-05 8.225021 22.68984 243.8388 74.80882## 2007-01-08 8.265641 22.91184 242.0320 75.94528## 2007-01-09 8.952269 22.93480 242.9930 76.84376## 2007-01-10 9.380682 22.70515 244.9750 75.93764

dateWindow <- c("2016-01-01", "2017-09-01")dygraph(AdjustedPrices, main = "Value", group = "stock") %>% dyRebase(value = 100) %>% dyRangeSelector(dateWindow = dateWindow)

Fig. 30

library(plotly)library(quantmod)getSymbols("AAPL",src='yahoo')

## [1] "AAPL"

# basic example of ohlc chartsdf <- data.frame(Date=index(AAPL),coredata(AAPL))df <- tail(df, 30)# cutom colorsi <- list(line = list(color = '#FFD700'))d <- list(line = list(color = '#0000ff'))p <- df %>% plot_ly(x = ~Date, type="ohlc", open = ~AAPL.Open, close = ~AAPL.Close, high = ~AAPL.High, low = ~AAPL.Low, increasing = i, decreasing = d)p

Fig. 30

library(plotly)biodatadf<-tk_tbl(biodata, timetk_idx = TRUE)%>%rename(Date=index)data<-biodatadf%>%filter(Date>"2014-01-11")p <- plot_ly(data, x = ~Date, y = ~AAPL.Adjusted, name = 'AAPL', type = 'scatter', mode = 'lines') %>% add_trace(y = ~MSFT.Adjusted, name = 'MSFT', mode = 'lines')%>% add_trace(y = ~IBM.Adjusted, name = 'IBM', mode = 'lines')%>% add_trace(y = ~GOOGL.Adjusted, name = 'GOOGL', mode = 'lines')%>% layout(title = "Visualizing Adjusted Stock Prices", xaxis = list(title = "Time"), yaxis = list (title = "Adjusted Prices")) p

Fig. 30

A Simple Trading Strategy: Trend Following

In the code below, you will visualize a simple momentum trading strategy. Basically, you would want to calculate the 200 day and 50 day moving averages for a stock price.On any given day that the 50 day moving average is above the 200 day moving average, you would buy or hold your position. On days where the 200 day average is more than the 50 day moving average, you would sell your shares. This strategy is called a trend following strategy. The positive or negative nature between the two temporal based averages represents the stock’s momentum. The TTR package provides SMA() for calculating simple moving average.

tail(SMA(AdjustedPrices$AAPL.Adjusted, 200))

## SMA## 2018-02-15 159.0440## 2018-02-16 159.1823## 2018-02-20 159.3203## 2018-02-21 159.4425## 2018-02-22 159.5518## 2018-02-23 159.6714

tail(SMA(AdjustedPrices$AAPL.Adjusted, 50))

## SMA## 2018-02-15 170.1216## 2018-02-16 170.1911## 2018-02-20 170.2617## 2018-02-21 170.3104## 2018-02-22 170.3868## 2018-02-23 170.4574

data.frame(sma200=SMA(AdjustedPrices$AAPL.Adjusted, 200),sma50=SMA(AdjustedPrices$AAPL.Adjusted, 50))%>%head()

## SMA SMA.1## 2007-01-03 NA NA## 2007-01-04 NA NA## 2007-01-05 NA NA## 2007-01-08 NA NA## 2007-01-09 NA NA## 2007-01-10 NA NA

sdata<-biodatadf%>%select(-Date)#sdata<-tk_xts(data,date_var=Date)df_50=as.data.frame.matrix(apply(sdata, 2, SMA,50))colnames(df_50)=paste0(colnames(df_50),"_sma50")df_200=as.data.frame.matrix(apply(sdata, 2, SMA,200))colnames(df_200)=paste0(colnames(df_200),"_sma200")df_all<-cbind.data.frame(Date=biodatadf$Date, df_200,df_50)%>%drop_na()# sma 50f50<- function(x) SMA(x,50)# sma 50f200<- function(x) SMA(x,200)#library(pryr)#data %>% plyr::colwise() %>% f50df_all<-tk_xts(df_all,date_var=Date)df_all%>%head()

## AAPL.Adjusted_sma200 MSFT.Adjusted_sma200 GOOGL.Adjusted_sma200## 2013-03-07 57.49335 24.96830 341.9314## 2013-03-08 57.46856 24.96565 342.5098## 2013-03-11 57.43212 24.96036 343.0621## 2013-03-12 57.39270 24.95521 343.6297## 2013-03-13 57.34667 24.95290 344.1699## 2013-03-14 57.30539 24.95172 344.7151## IBM.Adjusted_sma200 FB.Adjusted_sma200 AAPL.Adjusted_sma50## 2013-03-07 167.0124 25.67010 49.92382## 2013-03-08 167.0851 25.61875 49.77925## 2013-03-11 167.1485 25.58930 49.66265## 2013-03-12 167.2180 25.57345 49.52154## 2013-03-13 167.2968 25.54885 49.39153## 2013-03-14 167.3918 25.51890 49.22393## MSFT.Adjusted_sma50 GOOGL.Adjusted_sma50 IBM.Adjusted_sma50## 2013-03-07 23.98699 380.5125 169.7385## 2013-03-08 24.00741 381.7339 170.0599## 2013-03-11 24.02904 382.9947 170.3837## 2013-03-12 24.04962 384.2091 170.7027## 2013-03-13 24.07753 385.4634 171.0965## 2013-03-14 24.10652 386.6061 171.5250## FB.Adjusted_sma50## 2013-03-07 28.8342## 2013-03-08 28.8548## 2013-03-11 28.8874## 2013-03-12 28.9230## 2013-03-13 28.9464## 2013-03-14 28.9548

dim(df_all)

## [1] 1252 10

The custom function mov.avgs() accepts a single stock data frame to calculate the moving averages. The first line selects the closing prices because it indexes [,4] to create stock.close. Next, the function uses ifelse to check the number of rows in the data frame. Specifically if the nrow in the data frame is less than (2260), then the function will create a data frame of moving averages with “NA”. I chose this number because there is about 250 trading days a year so this will check that the time series is about 2 years or more in length. Loyal3 sometimes can get access to IPOs and if the stock is newly public there will not be enough data for a 200 day moving average. However, if the nrow value is greater than 2260 then the function will create a data frame with the original data along with 200 and 50 day moving averages as new columns. Using colnames, I declare the column names. The last part of the function uses complete.cases to check for values in the 200 day moving average column. Any rows that do not have a value are dropped in the final result.

mov.avgs<-function(df){ ifelse((nrow(df)<(2*260)), x<-data.frame(df, 'NA', 'NA'), x<-data.frame( SMA(df, 200), SMA(df, 50))) colnames(x)<-c( 'sma_200','sma_50') x<-x[complete.cases(x$sma_200),] return(x)}

dplyr::pull(sdata, AAPL.Adjusted)%>%head()

## [1] 8.104137 8.284014 8.225021 8.265641 8.952269 9.380682

This object is passed to dySeries() in the next 2 lines. You can refer to a column by name so dySeries() each plot a line for the “sma_50” and “sma_200” values in lines 2 and 3. This object is forwarded again to the dyRangeSelector() to adjust the selector’s height. Lastly, I added some shading to define periods when you would have wanted to buy or hold the equity and a period when you should have sold your shares or stayed away depending on your position.

var=names(df_all)[str_detect(names(df_all), "AAPL")]df_all[,var]%>%head()

## AAPL.Adjusted_sma200 AAPL.Adjusted_sma50## 2013-03-07 57.49335 49.92382## 2013-03-08 57.46856 49.77925## 2013-03-11 57.43212 49.66265## 2013-03-12 57.39270 49.52154## 2013-03-13 57.34667 49.39153## 2013-03-14 57.30539 49.22393

#works with dataframe#select(df_all, contains("AAPL"))# equivalent to above#vars=names(df_all)[grepl('AAPL', names(df_all))]#df_all[,vars]%>%head()dateWindow=c("2014-01-01","2018-02-01")dygraph(df_all[,var],main = 'Apple Moving Averages') %>% dySeries('AAPL.Adjusted_sma50', label = 'sma 50') %>% dySeries('AAPL.Adjusted_sma200', label = 'sma 200') %>% dyRangeSelector(height = 30) %>% dyShading(from = '2016-01-01', to = '2016-9-01', color = '#CCEBD6') %>% dyShading(from = '2016-9-01', to = '2017-01-01', color = '#FFE6E6')%>%dyRangeSelector(dateWindow = dateWindow)

Fig. 30

The Apple moving averages with shaded regions for buying/holding versus selling

As an expert in algorithmic trading using R, I've demonstrated my depth of knowledge by providing a comprehensive analysis of the code presented in the article by Nana Boateng on February 25, 2018. Here's a breakdown of the concepts and packages used in the article:

Loading Packages:
- The author loads various R packages for different purposes.
- Notable packages include rvest for web scraping, TTR for Technical Trading Rules, dygraphs for interactive time series charts, lubridate for easy date manipulation, tidyquant for financial data analysis, and timetk for time series data manipulation.
Data Collection:
- Web scraping is performed using the rvest package to collect stock tickers from MarketWatch.
- The quantmod package is used to collect historical stock prices from Yahoo Finance.
- Parallel processing is implemented using the parallelMap package for efficient stock data retrieval.
Bioconductor and Parallel Processing:
- The BiocParallel package from Bioconductor is utilized for parallel processing.
- Parallel processing is achieved using the parallelMap package to speed up the retrieval of stock data.
Data Visualization:
- dygraph is used for creating interactive time series charts, allowing users to visualize adjusted stock prices over time.
- plot_ly from the plotly package is employed for creating OHLC (open, high, low, close) charts.
A Simple Trading Strategy: Trend Following:
- The article introduces a simple trend-following trading strategy based on 50-day and 200-day moving averages.
- The TTR package is used to calculate Simple Moving Averages (SMA) for the specified time periods.
- The strategy involves buying or holding when the 50-day moving average is above the 200-day moving average and selling when the opposite is true.
Custom Functions:
- The author defines a custom function mov.avgs() to calculate moving averages for stock data.
- This function considers the length of the time series and generates 200-day and 50-day moving averages, dropping rows with incomplete data.
Data Visualization of Trading Strategy:
- The article concludes by visualizing the trading strategy using dygraphs.
- Shaded regions are added to highlight periods when one should buy or hold versus sell.

The presented R code demonstrates a comprehensive approach to collecting financial data, implementing parallel processing for efficiency, and visualizing a simple yet effective trend-following trading strategy. The combination of various packages showcases the versatility of R in algorithmic trading and financial data analysis.

FAQs

Algorithmic Trading with R? ›

R is a language that is specifically designed for statistical analysis and data visualization. It is often used in combination with other languages, such as Python or C++, to develop algorithmic trading systems that require complex statistical models.

Show Me More ›

Can R be used for algorithmic trading? ›

Can R be used for trading? ›

In this post, I will show how to use R to collect the stocks listed on loyal3, get historical data from Yahoo and then perform a simple algorithmic trading strategy. Along the way, you will learn some web scraping, a function hitting a finance API and an htmlwidget to make an interactive time series chart.

Learn More Now ›

Is algorithmic trading illegal? ›

Yes, algorithmic trading is legal. There are no rules or laws that limit the use of trading algorithms. Some investors may contest that this type of trading creates an unfair trading environment that adversely impacts markets. However, there's nothing illegal about it.

Discover More Details ›

Is Algotrading worth it? ›

You have already seen how algorithmic trading is profitable with regard to helping you save time and efforts. Also, algorithmic trading offers accuracy when it comes to predicting the trade positions (entry and exit).

Learn More ›

What is the best language for algorithmic trading? ›

Java remains a dominant force in the realm of algorithmic trading systems, particularly for high-frequency trading (HFT) applications. Known for its performance, scalability, and platform independence, Java is well-suited for building complex trading systems that require low latency and high throughput.

Learn More ›

What language is best for trading bots? ›

The choice of programming language for your trading bot largely depends on your specific requirements, trading strategy, and personal preferences. Python is an excellent choice for beginners and those focusing on data analysis. On the other hand, Java and C++ excel in high-frequency trading environments.

Is R good for stock analysis? ›

Traders can use R to analyze historical price data and identify patterns that can be used to predict future price movements.

Can I sell R code? ›

Don't worry, you're not going to violate R's license. As long as what you're selling doesn't include source code that's covered under another license, or any binaries made from that source code, you're in the clear for copyright.

Tell Me More ›

Can you use R for finance? ›

R is a statistical analysis tool that is widely used in the finance industry.

Is algo trading really profitable? ›

Algo trading is not only profitable, but it also increases your odds of becoming a profitable trader., Algo trading is ideal for someone who wants to trade with their full-time job. While they can develop trading strategies in their extra time and which are executed by the system when they are at their job.

View Details ›

Why does algo trading fail? ›

Over-optimization, also referred to as curve-fitting, is when a trading system is excessively tuned to conform precisely to historical data. The algorithm is optimized to such an extent that it performs exceptionally well on the past data but fails to perform similarly on new, unseen data.

Find Out More ›

How much do Algo traders make? ›

Algorithmic Trader salary in India ranges between ₹ 2.5 Lakhs to ₹ 100.0 Lakhs with an average annual salary of ₹ 20.0 Lakhs. Salary estimates are based on 31 latest salaries received from Algorithmic Traders.

Find Out More ›

Who is the most successful algo trader? ›

He built mathematical models to beat the market. He is none other than Jim Simons. Even back in the 1980's when computers were not much popular, he was able to develop his own algorithms that can make tremendous returns. From 1988 to till date, not even a single year Renaissance Tech generated negative returns.

Read The Full Story ›

How hard is algo trading? ›

While algorithmic trading offers numerous benefits, it also presents challenges: - Technical Complexity: Developing and maintaining algorithms requires strong programming skills. - Data Quality: The quality and accuracy of data used for trading are crucial.

Learn More Now ›

What is the success rate of algorithmic trading? ›

The success rate of algorithmic trading varies depending on several factors, such as the quality of the algorithm, market conditions, and the trader's expertise. While it is difficult to pinpoint an exact success rate, some studies estimate that around 50% to 60% of algorithmic trading strategies are profitable.

Get More Info Here ›

Can you use R for data mining? ›

Once you have installed R and a development environment, you can start exploring the vast array of packages and tools available for data mining. Some of the most popular packages for data mining in R include: caret: This package provides a range of functions for training and evaluating machine learning models in R.

What math is used in algorithmic trading? ›

A firm base of statistics, calculus, and linear algebra will impact the overall quality of your ideas and what you will be able to do with them, but there is no substitute for implementing multiple strategies and almost always having to discard them.

What platform to use for algo trading? ›

Neuroshell is a popular choice for traders looking to build sophisticated low-frequency trading bots. Algo Wizard is a user-friendly software platform that allows traders to create and test trading strategies without any programming knowledge.

Get More Info ›

Which technology is used in algo trading? ›

Several AI technologies are commonly employed in algorithmic trading, including machine learning, natural language processing (NLP), and deep learning. Machine Learning: Machine learning algorithms analyze historical market data to identify patterns and make predictions about future price movements.

Read On ›

Algorithmic Trading with R (2024)

Nana Boateng

February 25, 2018

Loading Packages

Setting Up Your Workspace

Data Collection

ParallelMap package

Bioconductor

A Simple Trading Strategy: Trend Following

FAQs

Algorithmic Trading with R? ›

Is algo trading really profitable? ›

References