After writing two blogs on the delights of working with the agile language Ruby, I decided to eat my own dog food and challenge myself to convert a significant piece of my portfolio returns simulation program from the statistical package R to Ruby.

The program does its work by resampling daily percentage change figures from the 46+ years of academic portfolio data available on the website of Dartmouth finance professor Ken French. The specific technique I use is the bootstrap, wherein I randomly sample with replacement many times from each of 8 lengthy portfolio vectors that house the 46+ years of daily returns data. As an illustration, consider the investigation of 1 year returns for Market, a portfolio that performs much like the Wilshire 5000, generally acknowledged as the closest proxy to the overall US stock market. There are approximately 252 trading days per year, so to look at the distribution of one year Market returns, I randomly sample with replacement 252 observations from the 46+ year vector of 11,665 daily figures and compute the growth of an initial $1 investment. I repeat the process a large number of times – in this case 100,000 – to get a feel for what the sampling distribution of the $1 growth over a year looks like. I then compute similar distributions for 5 year, 10 year and 20 year returns, randomly sampling 5*252, 10*252, and 20*252 daily returns respectively, repeating each calculation 100,000 times. I finally look at other French portfolios, including Risk Free, Small Growth, Small Neutral, Small Value, Large Growth, Large Neutral and Large Value, performing the same calculations on each. When all is said and done, I write a pretty large file (8 portfolios*4 horizons*100,000 samples, a total of 3.2M records) for subsequent computations and analysis in R.

The R language is wired to do this kind of work readily, with powerful array operations cutting the programming tedium. Ruby, though, can also handle much of the job; the challenge is translating tasks from one language to another. I was able to copy the internet files from the French site to my computer in Ruby with a standard library, but had to install a special module to handle Microsoft zip/unzip. I then designed a function to read the two files that held the daily data, merging them by the common key of date. I wrote the combined structure to a comma-delimited file, and also invoked R with a system command to create a data frame that stores the 8 portfolio values keyed by date. The created text file served as the data foundation for subsequent bootstrap computations as well.
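As a rough sketch of the merge step only, here is one way to join two date-keyed tables and produce comma-delimited text. The row layout, the column names and the numbers are made up for illustration; they are not the actual French file formats, and the download, unzip and R steps are left out.

```ruby
# Hypothetical sketch: merge two date-keyed tables and emit comma-delimited text.
# Each input row is [date, value, value, ...]. Layout and values are assumptions.

def merge_by_date(left_rows, right_rows)
  # Index the right-hand table by its date key.
  right_by_date = right_rows.to_h { |date, *values| [date, values] }
  # Keep only dates present in both inputs (an inner join on date).
  left_rows.map do |date, *values|
    other = right_by_date[date]
    other ? [date, *values, *other] : nil
  end.compact
end

def to_csv_text(header, rows)
  ([header] + rows).map { |row| row.join(",") }.join("\n") + "\n"
end

# Made-up daily percent changes, only to exercise the functions.
factors = [
  ["19630701", -0.67, 0.012],
  ["19630702",  0.79, 0.012],
  ["19630703",  0.63, 0.012]
]
size_value = [
  ["19630701", -0.55, -0.60],
  ["19630702",  0.70,  0.85]
]

merged = merge_by_date(factors, size_value)
header = %w[date market risk_free small_growth small_value]
csv_text = to_csv_text(header, merged)
puts csv_text
```

Indexing one table by date first keeps the merge to a single pass over the other table, which matters little at 11,665 rows but keeps the code short.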

The calculations for all the bootstrap samples are quite intensive. I first read the just-written data file into a “hash of arrays” keyed on portfolio name. Thus there's an array of 11,665 returns for “Market”, another for “Risk Free”, etc. I then perform the equivalent of a triply-nested loop of samples, within portfolios, within years, using Ruby iterators to perform the calculations. The computation at the heart of the loops looks like the following for a single growth of $1 figure:

save[h].push((DY*years).times.collect{french[h][rand(len)]}.inject(1) {|totret, pct| totret*(1+pct/100)})

Yikes! In words, take a random sample with replacement of 252*years observations from the vector of returns, compute the growth of $1 and store the result. Repeat that calculation 100,000 times for each year/portfolio combination, saving the calcs. Finally, using another set of iterators, loop through the accumulating Ruby data structure, writing a comma-delimited file with a record for each year, portfolio and repetition. Also, load the just-written file into an R data frame for subsequent statistical/graphical analysis.
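The one-liner and the loops around it can be unpacked roughly as follows. This is a sketch under assumptions, not the original program: the tiny `french` hash, the reduced `HORIZONS` and `REPS`, the nested `save` layout and the seeded generator are all stand-ins for illustration. It also divides by 100.0 so the arithmetic stays in floating point even if a return happens to be stored as an integer.

```ruby
# Sketch of the bootstrap loops. DY, HORIZONS, REPS and the tiny `french`
# hash are illustrative stand-ins; the real run uses 11,665 daily returns
# per portfolio and 100,000 repetitions.
DY = 252               # approximate trading days per year
HORIZONS = [1, 5]      # years; the post uses 1, 5, 10 and 20
REPS = 1_000           # repetitions per portfolio/horizon

# Growth of $1 over `years`, resampling daily percent changes with replacement.
def bootstrap_growth(returns, years, rng)
  len = returns.length
  sample = Array.new(DY * years) { returns[rng.rand(len)] }
  sample.inject(1.0) { |totret, pct| totret * (1 + pct / 100.0) }
end

rng = Random.new(42)   # seeded only so the sketch is repeatable
french = {
  "Market"    => [0.5, -0.4, 0.1, 0.0, -1.2, 0.9],     # made-up daily % changes
  "Risk Free" => [0.02, 0.02, 0.021, 0.019, 0.02, 0.02]
}

# save[years][portfolio] is an array of REPS growth-of-$1 figures.
save = Hash.new { |h, years| h[years] = Hash.new { |g, name| g[name] = [] } }
HORIZONS.each do |years|
  french.each do |name, returns|
    REPS.times { save[years][name] << bootstrap_growth(returns, years, rng) }
  end
end

# One comma-delimited record per year, portfolio and repetition.
records = save.flat_map do |years, by_portfolio|
  by_portfolio.flat_map do |name, growths|
    growths.each_with_index.map { |g, i| [years, name, i + 1, g].join(",") }
  end
end
puts records.first(3)
```

Pulling the sampling into a named method costs a few lines but makes the three levels of iteration (years, portfolios, repetitions) easier to see than in the chained one-liner.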

The results of the calculations are intriguing – and, for me at least, sobering. Consider the following table of 5 year growth of $1 returns based on the distribution of 100,000 bootstrap samples for each portfolio:


Portfolio        % in Red at 5 Years    Median 5 Year Return    Middle 95% 5 Year Returns
Risk Free
Small Growth
Small Neutral
Small Value
Large Growth
Large Neutral
Large Value

The table cites several statistics summarizing performance for each index. First, notice that it's quite possible to lose principal on each portfolio except Risk Free, which is as you'd expect with a safe T-Bill-like security. $1 invested in Risk Free for 5 years becomes $1.32 at the median, with a 95% “confidence interval” of $1.30 to $1.33. The 5 year bootstrap returns of this portfolio show no risk – but underwhelming upside as well.
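The three statistics in the table can be computed from a vector of bootstrap growth figures along these lines. This is a minimal sketch: the nearest-rank quantile rule is an assumption (R's `quantile` interpolates by default and can give slightly different values), and the input numbers are made up.

```ruby
# Summaries of a vector of growth-of-$1 figures. The nearest-rank quantile
# rule is an assumption, not necessarily what the original analysis used.
def quantile(sorted, p)
  idx = (p * (sorted.length - 1)).round
  sorted[idx]
end

def summarize(growths)
  sorted = growths.sort
  {
    # "In the red" means the $1 ended below $1.
    pct_in_red: 100.0 * growths.count { |g| g < 1.0 } / growths.length,
    median:     quantile(sorted, 0.5),
    middle_95:  [quantile(sorted, 0.025), quantile(sorted, 0.975)]
  }
end

# Made-up growth figures, only to exercise the functions.
growths = [1.3, 0.8, 2.5, 1.0, 1.7, 0.9, 1.1, 1.6, 1.2, 1.5, 1.4]
summary = summarize(growths)
puts summary
```

With 100,000 samples per portfolio the choice of quantile rule barely moves the results; it only matters for small vectors like the one above.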

In contrast, 5 years in Small Growth leads to a loss of principal in almost 20% of the simulations, with a $1.42 median and a $.63 to $3.12 95% range. You can win big – and lose big – with Small Growth. I'd certainly be wary of putting my savings in this portfolio with just a 5 year time horizon. Small Value, with only 1% in-the-red bootstrap results and a median growth of $1 of $2.07, seems a much better bet for a 5 year commitment. Of course, one could always build a portfolio of portfolios, combining, for example, Risk Free with some combination of the others. 50% Risk Free and 50% Small Value would dampen both the upside and downside returns.
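One way to try the portfolio-of-portfolios idea is to blend the two daily return vectors before bootstrapping. The sketch below assumes daily rebalancing, so each blended daily return is the weighted average of the two same-day returns; that is an assumption made here for illustration, and the numbers are made up.

```ruby
# Blend two same-length vectors of daily percent changes.
# Assumes the mix is rebalanced to the target weights every day.
def blend(returns_a, returns_b, weight_a)
  returns_a.zip(returns_b).map { |a, b| weight_a * a + (1 - weight_a) * b }
end

# Distance between the best and worst daily return.
def spread(returns)
  returns.max - returns.min
end

# Made-up daily percent changes for the two portfolios.
risk_free   = [0.02, 0.02, 0.02]
small_value = [1.0, -1.0, 0.5]

blended = blend(risk_free, small_value, 0.5)
puts "Small Value spread: #{spread(small_value)}"
puts "50/50 blend spread: #{spread(blended).round(4)}"
```

The blended vector can then be fed to the same bootstrap as any single portfolio. Because the rows are paired by date before sampling, the blend keeps whatever same-day relationship exists between the two portfolios.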

The Market portfolio, which looks much like the index that serves as the basis for many mutual funds and ETFs, performs better than Small Growth but not as well as Small Value. Many investors would be surprised to note that their 5 year investment in the most popular index fund could lose principal 10% of the time. The median growth of $1 is $1.58 for 5 years – at 9.6% per year compounded, in line with historical precedent.
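The 9.6% figure follows from taking the fifth root of the 5 year growth. A small check, with `annualized_rate` as a hypothetical helper name:

```ruby
# Compound annual rate implied by growing $1 to `growth` over `years`.
def annualized_rate(growth, years)
  growth**(1.0 / years) - 1
end

rate = annualized_rate(1.58, 5)
puts format("%.1f%%", rate * 100)   # prints 9.6%
```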

I really like the resampling approach to “forecasting” portfolio returns. With a bootstrap sample size of 100,000, I can look at the density and quantiles of hypothetical returns for a clear sense of the magnitude of risks and rewards. Seeing the extreme bootstrap calculations (the min and max 5 year growth of $1 figures for Small Growth are $.21 and $8.89, respectively!) provides perspective that might be lost with just forecasting models.

Investors must use bootstrap results with caution, however. An implicit assumption with this method is that returns in the future will look like those in the past. If you were to ask Warren Buffett or John Bogle, they'd probably tell you that assumption is too optimistic in 2010. In addition, the returns used in this analysis do not net out the costs associated with mutual funds or ETFs, so growth of $1 figures for “real” investors would be less than those cited here. Finally, aside from Market and Risk Free, the other portfolios are “academic” and hypothetical. The Small Value depicted with the French data would not be the same as the small value your broker peddles. Those interested in portfolios that look much like those discussed here should consult Dimensional Fund Advisors.

Steve Miller also blogs at