Post

Portfolio Analysis: a beginning

This post is an exploration of R coding, applied to analyzing a financial portfolio. We will scrape ETF stock data from web pages and APIs (via quantmod), calculate financial metrics, and produce tables. In a future post we will create graphs and charts.

Librairies

The following libraries are used:

The rvest package in R is a popular and powerful tool designed for web scraping. It allows users to easily read and manipulate the data from web pages. The tydiverse package is a collection of packages useful for data science, including ggplot2 and dyplr which are necessary for the code used here. The package flextable is used to produce nice-looking HTML tables.

1
2
3
4
#Run the libraries
    library(rvest)
    library(tidyverse)
    library(flextable)

Scraping ticker data

The following code uses tools of the rvest library to create a function that scrapes data from finance.yahoo.com. This requires getting the XPATH to specific data in the web page. If the web page changes, it will break the script. It probably would be best in the future to change this for getting info from a database (via API or other means).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# This is a function that takes one argument: ticker. The function constructs a URL for the
# Yahoo Finance page of the given ticker, then uses read_html to download and parse the HTML
# content of that page. 
    get_financials <- function(ticker) {
      url <- paste0("https://finance.yahoo.com/quote/", ticker)
      page <- read_html(url)
      
      pe_ratio <- page %>%
        html_nodes(xpath = '/html/body/div[1]/div/div/div[1]/div/div[3]/div[1]/div/div[1]/div/div/div/div[2]/div[2]/table/tbody/tr[3]/td[2]') %>%
        html_text() %>%
        as.numeric()
      
#Earnings yield is the inverse of the P/E ratio. We can calculate this here.
      earnings_yield <- round(1 / pe_ratio, 4)
      

      expense_ratio_text <- page %>%
        html_nodes(xpath = '/html/body/div[1]/div/div/div[1]/div/div[3]/div[1]/div/div[1]/div/div/div/div[2]/div[2]/table/tbody/tr[7]/td[2]') %>%
        html_text()
      
      # Remove the '%' sign and convert to numeric
      expense_ratio <- as.numeric(gsub("%", "", expense_ratio_text))
      
      
      return(data.frame(Ticker = ticker, ExpenseRatio = expense_ratio, PE = pe_ratio, EarningsYield = earnings_yield))
    }

    tickers <- c("VUN.TO", "VCN.TO", "XEF.TO", "AVUV", "AVDV", "XEC.TO", "AVES")  # The tickers part of the portfolio

# Define the weights for each ticker in the portfolio. This will be used to calculate the weighted averages.
    portfolio_weights <- c("VUN.TO" = 0.315, "VCN.TO" = 0.23, "XEF.TO" = 0.165, 
                           "AVUV" = 0.115, "AVDV" = 0.075, "XEC.TO" = 0.05, "AVES" = 0.05)

# `lapply` is a function in R that applies a function over a list or vector. In this case,
# the function get_financials is applied to each element of the tickers vector. `bind_rows()`
# combines multiple data frames into one by binding them row-wise. This means that it takes
# data frames and stacks them on top of each other.
    financial_data <- lapply(tickers, get_financials) %>%
      bind_rows()

Since the Vanguard and Blackrock tickers’ Expense Ratios are missing, the following code adjusts for this by manually inserting values. I will eventually add code to scrape the data from their respective websites.

1
2
3
4
5
# Manually adjust 
    correct_expense_ratios <- c("VUN.TO" = 0.17, "VCN.TO" = 0.05, "XEF.TO" = 0.22, "XEC.TO" = 0.28)
    financial_data$ExpenseRatio <- ifelse(financial_data$ExpenseRatio == 0,
                                          correct_expense_ratios[financial_data$Ticker],
                                          financial_data$ExpenseRatio)
TickerExpenseRatioPEEarningsYield
VUN.TO0.1721.780.0459
VCN.TO0.0512.270.0815
XEF.TO0.2213.650.0733
AVUV0.257.150.1399
AVDV0.367.250.1379
XEC.TO0.3811.370.0880
AVES0.367.530.1328

Add weights and calculate averages

We manually add the weight of each tickers in the portfolio.

1
2
3
4
5
# Add the weights to the data frame
financial_data$Weight <- portfolio_weights[financial_data$Ticker]

# Rearrange the columns
financial_data <- financial_data[c("Ticker", "Weight", "PE", "EarningsYield", "ExpenseRatio")]

The following blocks are to calculate the weighted averages. Here are the average Expense Ratios.

1
2
3
4
5
6
7
# Calculate the weighted ER for each ticker
financial_data$WeightedER <- round(financial_data$ExpenseRatio * financial_data$Weight,2)

# Calculate the sum of the weighted ER  to get the weighted average ER
weighted_average_er <- sum(financial_data$WeightedER)

print(paste("Weighted Average ER:", weighted_average_er))

[1] "Weighted Average ER: 0.19"

Here are the weighted average PE ratios.

1
2
3
4
5
6
7
8
# Calculate the weighted PE for each ticker
financial_data$WeightedPE <- round(financial_data$PE * financial_data$Weight, 2)

# Calculate the sum of the weighted PEs to get the weighted average PE
weighted_average_pe <- sum(financial_data$WeightedPE)

print(paste("Weighted Average PE:", weighted_average_pe))

[1] "Weighted Average PE: 14.22"

Here are the sum of ER ratios. I want to print it out as a %.

1
2
3
4
5
6
7
8
9
10
# Calculate the weighted Earning Yields for each ticker
financial_data$WeightedEY <- round(financial_data$EarningsYield * financial_data$Weight, 4)

# Calculate the sum of the EarningYields
weighted_average_ey <- sum(financial_data$WeightedEY)

# Convert to percentage
weighted_average_ey_percent <- weighted_average_ey * 100

print(sprintf("Weighted Average Earning Yields: %.2f%%", weighted_average_ey_percent))

[1] "Weighted Average Earning Yields: 8.29%"

Here we create a summary data frame.

1
2
3
4
5
# Create a summary data frame
summary_data <- data.frame(
  Metric = c("Weighted Average PE", "Weighted Average ER", "Weighted Average EY"),
  Value = c(weighted_average_pe, weighted_average_er, weighted_average_ey_percent)
)

Now we print out the results in two flextables tables.

1
flextable(financial_data)

Ticker

Weight

PE

EarningsYield

ExpenseRatio

WeightedER

WeightedPE

WeightedEY

VUN.TO

0.315

21.76

0.0460

0.17

0.05

6.85

0.0145

VCN.TO

0.230

12.23

0.0818

0.05

0.01

2.81

0.0188

XEF.TO

0.165

13.65

0.0733

0.22

0.04

2.25

0.0121

AVUV

0.115

7.12

0.1404

0.25

0.03

0.82

0.0161

AVDV

0.075

7.26

0.1377

0.36

0.03

0.54

0.0103

XEC.TO

0.050

11.31

0.0884

0.28

0.01

0.57

0.0044

AVES

0.050

7.50

0.1333

0.36

0.02

0.38

0.0067

1
flextable(summary_data)

Metric

Value

Weighted Average PE

14.22

Weighted Average ER

0.19

Weighted Average EY

8.29

If you have large tables, you can use the DT package, which utilizes the JavaScript library DataTables to create interactive and dynamic outputs. This allows for functions like search and sort.

This post is licensed under CC BY 4.0 by the author.