R Studio - Cost Management Analysis Report
- Eighplus Admin
- Apr 8, 2022
- 2 min read
Updated: 3 days ago

Description
CostManagement.csv is a database of Funtern’s cloud team’s MS Azure Usage from February 3rd, 2022 to March 15th, 2022 (approximately one month). Information includes Date of Usage, cost in dollars and cost in CAD. Meanwhile, foreign exchange rate (USD:CAD or dollars per CAD) is also calculated.
Data Preparation
Firstly, I initialized the data types of each variable in the data frame to make it easier for further analysis. I turned ‘UsageDate’ into ‘date’ data type and then automatically filled in the missing dates. Then, I kept 0s in both USD and CAD cost for missing values to make the whole dataset works well.
```{r}
setwd("C:/Users/fangj/Desktop/funtern")
CMAfile <- read.csv("CostManagement_FUNTERN_2022-03-16.csv", head=TRUE)
#transfer table into dataframe
CMAdf <- data.frame(Date = as.Date(CMAfile$ï..UsageDate), USD = as.double(CMAfile$Cost.USD..K.), CAD = as.double(CMAfile$Cost.CAD..K.))
CMAdf <- pad(CMAdf, interval = "day", end_val = Sys.Date()+1)
CMAdf <- fill_by_value(CMAdf, value=0)
CMAdf <- CMAdf[CMAdf[["Date"]] <= "2022-03-15", ]
CMAdf
```
```{r}
install.packages("ggplot2")
library(ggplot2)
```
Process
- Task 1: I calculated FX for each day by the formula: Cost of USD / Cost of CAD, which is the ratio
between two currency. For example, on February 3rd, 2022, the cost was 3.84k USD and
4.87k CAD respectively. After calculation, the FX is 0.788 which means exchange rate
between US dollar and Canadian dollar was 0.788 dollars per CAD.
# Calculate the FX for each day.
```{r}
FX <- CMAdf$USD/CMAdf$CAD
CMAdf['FX_dollars_per_CAD'] <- FX
CMAdf[is.na(CMAdf)] = 0
CMAdf
```
- Task 2: To calculate mean, minima and maxima for Cost USD (K), Cost CAD (K) and FX, I used
aggregate function in excel to make a brief summary table (shown as below).

It’s worth noting that the FX remained steady during this period.
- Task 3: To make line charts for these 3 variables, I used ‘ggplot2’ package in RStudio to
make multiple-line chart, separated each variable in different colors.
```{r}
install.packages("reshape2")
library(reshape2)
```
# line charts for Cost USD (K), Cost CAD (K) and FX
```{r}
chartdf <- data.frame(x = CMAdf$Date, y = c(CMAdf$USD, CMAdf$CAD), line = c("USD cost", "CAD cost"))
ggplot(chartdf, aes(x = x, y = y, col = line)) + geom_line(size = 1) + ggtitle("Line chart for Cost USD (K), Cost CAD (K)") + labs(x = "Date", y = "Cost")
ggplot(CMAdf, aes(Date)) + geom_line(aes(y=FX_dollars_per_CAD), color = "red", size = 1)
```
1) Line chart for costs in USD and CAD in thousands.

2) Line chart for FX.

Summary
Obviously, it’s hard to find out the pattern in those charts and table above. Firstly, there are some missing dates in this small dataset which affected the statistical values at some extent. Second, we need a longer period to make a cost prediction. However, there is one thing is likely to be sure that we can know a cost range (between 1k and 4k approximately )after we got the mode from this dataset.
FUNTERN Associate: Junyu Fang
Comentários