## Wednesday, April 27, 2016

### QuantLib : Least squares method implementation

A couple of years ago I published a blog post on using Microsoft Solver Foundation for curve fitting. In this post we go through that example again, this time using the corresponding QuantLib (QL) tools. To make everything a bit more concrete, the fitting scheme (as solved by Frontline Solver in Microsoft Excel) is presented in the screenshot below.

A second-degree polynomial has been fitted in order to find the set of coefficients that minimizes the sum of squared differences between observed rates and estimated rates. The same minimization scheme is used in the example program presented below.
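Written out, the minimization reads roughly as follows. The symbols are my own notation and are not taken from the screenshot: $r_i$ is the observed rate at maturity $t_i$, $a_0, a_1, a_2$ are the polynomial coefficients and $n$ is the number of observations.

$$
\min_{a_0,\,a_1,\,a_2} \; \sum_{i=1}^{n} \left( \hat{r}(t_i) - r_i \right)^2,
\qquad \hat{r}(t) = a_0 + a_1 t + a_2 t^2
$$

In the example program there are nine observations, and the three coefficients are stored in a single Array.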

## LeastSquares class

After some initial woodshedding with the QL optimization tools, I started to wonder whether it would be possible to create a kind of generic class that could handle all kinds of different curve fitting schemes.

When using the LeastSquares class, the client creates an instance by setting the desired optimization method (QuantLib::OptimizationMethod) and all the parameters needed for defining the desired end criteria (QuantLib::EndCriteria) for the optimization procedure. The actual fitting procedure is started by calling the Fit method of the LeastSquares class. To this method, the client gives the independent values (QuantLib::Array), a vector of function pointers (boost::function), the initial parameters (QuantLib::Array) and the parameter constraints (QuantLib::Constraint).

The objective function implementation (QuantLib::CostFunction) used by the QL optimizers is hidden as a nested class inside the LeastSquares class. Since the task performed by the least squares method is always to minimize a sum of squared errors, this class is not expected to change. The objective function uses the given independent values and function pointers (boost::function) to calculate the sum of squared errors between the independent (observed) values and the estimated values. Because the estimates come from function pointers (boost::function), the estimation algorithm is not hard-coded into a class method.
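The idea can be sketched in plain standard C++, without QuantLib or Boost. This is only an illustration under a few assumptions: std::function stands in for boost::function, std::vector<double> stands in for QuantLib::Array, and the names Estimator, residuals and sumOfSquaredErrors are mine rather than part of the program in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical alias: one estimator maps a parameter set to one estimated value.
using Estimator = std::function<double(const std::vector<double>&)>;

// Differences between estimated and observed values, one per observation.
std::vector<double> residuals(const std::vector<double>& observed,
                              const std::vector<Estimator>& estimators,
                              const std::vector<double>& parameters)
{
    // one estimator is expected for each observed value
    assert(observed.size() == estimators.size());
    std::vector<double> differences(observed.size());
    for (std::size_t i = 0; i < observed.size(); ++i)
    {
        differences[i] = estimators[i](parameters) - observed[i];
    }
    return differences;
}

// Sum of squared residuals: the single number an optimizer would minimize.
double sumOfSquaredErrors(const std::vector<double>& observed,
                          const std::vector<Estimator>& estimators,
                          const std::vector<double>& parameters)
{
    double sum = 0.0;
    for (double difference : residuals(observed, estimators, parameters))
    {
        sum += difference * difference;
    }
    return sum;
}
```

Nothing in these two functions knows what kind of curve is being fitted. Changing the model only means handing in a different set of estimators.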

Finally, the function pointers used by the objective function inside the LeastSquares class need an implementation class with the correct method signature. In the example program this is the CurvePoint class, which models a rate approximation for a given t (maturity) and a given set of external parameters.
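As a rough standalone illustration of such an implementation class, here is a sketch in standard C++ only. PolynomialPoint, Estimator and makeEstimator are hypothetical names, and a capturing lambda is used where the example program uses boost::bind.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for CurvePoint: evaluates a polynomial at a fixed t.
class PolynomialPoint
{
public:
    explicit PolynomialPoint(double t) : t_(t) { }

    // coefficients[i] multiplies t^i
    double operator()(const std::vector<double>& coefficients) const
    {
        double approximation = 0.0;
        double power = 1.0; // t^0
        for (double coefficient : coefficients)
        {
            approximation += coefficient * power;
            power *= t_;
        }
        return approximation;
    }

private:
    double t_;
};

// Callable with the signature expected by a least squares objective function.
using Estimator = std::function<double(const std::vector<double>&)>;

// Wrap a point object into a callable; the lambda shares ownership of the point.
Estimator makeEstimator(std::shared_ptr<PolynomialPoint> point)
{
    assert(point != nullptr);
    return [point](const std::vector<double>& coefficients)
    {
        return (*point)(coefficients);
    };
}
```

Any other class works equally well, as long as it can be wrapped into a callable that takes the parameter set and returns one estimated value.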

## Example program

Create a new C++ console project and copy-paste the following into the corresponding header and implementation files. Help is available here if needed. On the documentation page, there are three excellent slide presentations on the Boost and QuantLib libraries by Dimitri Reiswich. "Hello world" examples of QuantLib optimizers can be found within these files.

Finally, thanks again for reading my blog.
-Mike

```
// CurvePoint.h
#pragma once
#include <ql/quantlib.hpp>
using namespace QuantLib;
//
// class modeling rate approximation for a single
// curve point using a set of given parameters
class CurvePoint
{
private:
Real t;
public:
CurvePoint(Real t);
Real operator()(const Array& coefficients);
};
//
//
//
// CurvePoint.cpp
#include "CurvePoint.h"
//
CurvePoint::CurvePoint(Real t) : t(t) { }
Real CurvePoint::operator()(const Array& coefficients)
{
Real approximation = 0.0;
for (unsigned int i = 0; i < coefficients.size(); i++)
{
approximation += coefficients[i] * pow(t, i);
}
return approximation;
}
//
//
//
// LeastSquares.h
#pragma once
#include <ql/quantlib.hpp>
using namespace QuantLib;
//
class LeastSquares
{
public:
LeastSquares(boost::shared_ptr<OptimizationMethod> solver, Size maxIterations, Size maxStationaryStateIterations,
Real rootEpsilon, Real functionEpsilon, Real gradientNormEpsilon);
//
Array Fit(const Array& independentValues, std::vector<boost::function<Real(const Array&)>> dependentFunctions,
const Array& initialParameters, Constraint parametersConstraint);
private:
boost::shared_ptr<OptimizationMethod> solver;
Size maxIterations;
Size maxStationaryStateIterations;
Real rootEpsilon;
Real functionEpsilon;
Real gradientNormEpsilon;
//
// cost function implementation as private nested class
class ObjectiveFunction : public CostFunction
{
private:
std::vector<boost::function<Real(const Array&)>> dependentFunctions;
Array independentValues;
public:
ObjectiveFunction(std::vector<boost::function<Real(const Array&)>> dependentFunctions, const Array& independentValues);
Real value(const Array& x) const;
Disposable<Array> values(const Array& x) const;
};
};
//
//
//
// LeastSquares.cpp
#include "LeastSquares.h"
//
LeastSquares::LeastSquares(boost::shared_ptr<OptimizationMethod> solver, Size maxIterations,
Size maxStationaryStateIterations, Real rootEpsilon, Real functionEpsilon, Real gradientNormEpsilon)
: solver(solver), maxIterations(maxIterations), maxStationaryStateIterations(maxStationaryStateIterations),
rootEpsilon(rootEpsilon), functionEpsilon(functionEpsilon), gradientNormEpsilon(gradientNormEpsilon) { }
//
Array LeastSquares::Fit(const Array& independentValues, std::vector<boost::function<Real(const Array&)>> dependentFunctions,
const Array& initialParameters, Constraint parametersConstraint)
{
ObjectiveFunction objectiveFunction(dependentFunctions, independentValues);
EndCriteria endCriteria(maxIterations, maxStationaryStateIterations, rootEpsilon, functionEpsilon, gradientNormEpsilon);
Problem optimizationProblem(objectiveFunction, parametersConstraint, initialParameters);
//LevenbergMarquardt solver;
EndCriteria::Type solution = solver->minimize(optimizationProblem, endCriteria);
return optimizationProblem.currentValue();
}
LeastSquares::ObjectiveFunction::ObjectiveFunction(std::vector<boost::function<Real(const Array&)>> dependentFunctions,
const Array& independentValues) : dependentFunctions(dependentFunctions), independentValues(independentValues) { }
//
Real LeastSquares::ObjectiveFunction::value(const Array& x) const
{
// calculate sum of squared differences between
// observed and estimated values
Array differences = values(x);
Real sumOfSquaredDifferences = 0.0;
for (unsigned int i = 0; i < differences.size(); i++)
{
sumOfSquaredDifferences += differences[i] * differences[i];
}
return sumOfSquaredDifferences;
}
Disposable<Array> LeastSquares::ObjectiveFunction::values(const Array& x) const
{
// calculate differences between all observed and estimated values using
// function pointers to calculate estimated value using a set of given parameters
Array differences(dependentFunctions.size());
for (unsigned int i = 0; i < dependentFunctions.size(); i++)
{
differences[i] = dependentFunctions[i](x) - independentValues[i];
}
return differences;
}
//
//
//
// tester.cpp
#include "LeastSquares.h"
#include "CurvePoint.h"
#include <iostream>
//
int main()
{
// 2nd degree polynomial least squares fitting for a curve
// create observed market rates for a curve
Array independentValues(9);
independentValues[0] = 0.19; independentValues[1] = 0.27; independentValues[2] = 0.29;
independentValues[3] = 0.35; independentValues[4] = 0.51; independentValues[5] = 1.38;
independentValues[6] = 2.46; independentValues[7] = 3.17; independentValues[8] = 3.32;
//
// create corresponding curve points to be approximated
std::vector<boost::shared_ptr<CurvePoint>> curvePoints;
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(0.083)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(0.25)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(0.5)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(1.0)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(2.0)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(5.0)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(10.0)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(20.0)));
curvePoints.push_back(boost::shared_ptr<CurvePoint>(new CurvePoint(30.0)));
//
// create container for function pointers for calculating rate approximations
std::vector<boost::function<Real(const Array&)>> dependentFunctions;
//
// for each curve point object, bind function pointer to operator() and add it into container
for (unsigned int i = 0; i < curvePoints.size(); i++)
{
dependentFunctions.push_back(boost::bind(&CurvePoint::operator(), curvePoints[i], _1));
}
// perform least squares fitting and print optimized coefficients
LeastSquares leastSquares(boost::shared_ptr<OptimizationMethod>(new LevenbergMarquardt), 10000, 1000, 1E-09, 1E-09, 1E-09);
NoConstraint parametersConstraint;
Array initialParameters(3, 0.0);
Array coefficients = leastSquares.Fit(independentValues, dependentFunctions, initialParameters, parametersConstraint);
for (unsigned int i = 0; i < coefficients.size(); i++)
{
std::cout << coefficients[i] << std::endl;
}
return 0;
}
```

## Sunday, April 10, 2016

### C# : flexible design for processing market data

Third-party analytics software usually requires a full set of market data to be fed into the system before it performs any of those precious high-intensity calculations. Market data (curves, surfaces, fixing time series, etc.) has to be fed into the system following some specific file configurations. Moreover, source data might have to be collected from several different vendor sources. Needless to say, the process can easily turn into a semi-manageable mess involving Excel workbooks and a lot of manual processing, which is always a bad omen.

For this reason, I finally ended up creating one possible design for flexible processing of market data files. I went through some iterations, starting with Abstract Factory, before landing on the current one, which uses delegates to pair data and algorithms. With the current solution, I am starting to feel quite comfortable.

## UML

Each market data point (such as the mid USD swap rate for 2 years) is represented as a RiskFactor object. All individual RiskFactor objects are hosted in a list inside a generic MarketDataElements<T> object, which enables hosting any type of data. The MarketDataElements<T> object itself is hosted by the static BaseMarket class.

The actual algorithms needed for creating any type of vendor data are captured in the static ProcessorLibrary class. During the processing task, the MarketDataElements<T> object is handed to a specific library method implementation, which then requests values from the vendor source for all the RiskFactor objects involved. In the example program, ProcessorLibrary has a method for processing RiskFactor objects using the Bloomberg market data API. This method uses the DummyBBCOMMWrapper class for requesting the value of each RiskFactor object.

To pair specific data (MarketDataElements<T>) with a specific algorithm (ProcessorLibrary), the BaseMarket class hosts a list of ElementProcessor objects as well as a list of delegates bound to specific methods found in ProcessorLibrary. For the processing task, each ElementProcessor object is fed with the delegate for a specific ProcessorLibrary method and with information on the MarketDataElements<T> object.
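The pairing idea itself does not depend on C#. Below is a minimal sketch of it in C++, with std::function playing the role of the delegate. It is not a translation of the program in this post: the RiskFactor struct is trimmed down, processDummyVendor is a hypothetical placeholder algorithm, and the fixed value it writes is made up for the illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down risk factor: only the fields this sketch needs.
struct RiskFactor
{
    std::string provider;
    std::string systemTicker;
    std::string value;
};

// Plays the role of the Processor delegate: an algorithm applied to a batch of data.
using Processor = std::function<void(std::vector<RiskFactor>&)>;

// Pairs one algorithm with the data it should be applied to.
class ElementProcessor
{
public:
    ElementProcessor(Processor processor, std::vector<RiskFactor>& elements)
        : processor_(std::move(processor)), elements_(elements) { }

    void process() { processor_(elements_); }

private:
    Processor processor_;
    std::vector<RiskFactor>& elements_; // non-owning: the caller keeps the data alive
};

// Stand-in for a vendor-specific library method; writes a made-up placeholder value.
void processDummyVendor(std::vector<RiskFactor>& elements)
{
    for (RiskFactor& factor : elements)
    {
        factor.value = "0.0125";
    }
}
```

A caller would build one ElementProcessor per vendor, each with its own algorithm and its own subset of risk factors, and then call process() on every one of them.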

Finally (not shown in the UML), the program also uses the static TextFileHandler class for handling text files and the static Configurations class for hosting hardcoded configurations, initially read from the App.config file.

## Files

App.config

CSV for all RiskFactor object configurations
Fields (matching with RiskFactor object properties) :
• Data vendor identification string, matching with the one given in configuration file
• Identification code (ticker) for a market data element, found in the system for which the data will be created
• Vendor ticker for a market data element (Bloomberg ISIN code)
• Field name for a market data element (Bloomberg field PX_MID)
• Divider (Bloomberg presents a rate as a percentage, e.g. 1.234, but the target system may need the input as an absolute value, 0.01234, which corresponds to a divider of 100)
• Empty field for a value to be processed by specific processor implementation for specific data vendor.

Result CSV

## The program

Create a new console project and copy-paste the following program into the corresponding CS files. When testing the program in a real production environment, just add a reference to the Bloomberg API DLL file and replace the DummyBBCOMMWrapper class with this.

Adding a new data vendor processor XYZ involves the following four easy steps :
1. Update providers in App.config file : <add key="RiskFactorDataProviders" value="BLOOMBERG,XYZ" />
2. Create a new method implementation into ProcessorLibrary :  public static void ProcessXYZRiskFactors(dynamic marketDataElements) { // implement algorithm}
3. Add selection for a new processor into BaseMarket class method createElementProcessors: if(dataProviderString.ToUpper() == "XYZ") elementProcessors.Add(new ElementProcessor(ProcessorLibrary.ProcessXYZRiskFactors, elements));
4. Create new RiskFactor object configurations into source file for a new vendor XYZ

Finally, thanks for reading my blog.
-Mike

```
// MainProgram.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MarketDataProcess
{
class MainProgram
{
static void Main(string[] args)
{
try
{
// process base market risk factors to file
BaseMarket.Process();
BaseMarket.PrintToFile();
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}
}
}
//
//
//
//
// BaseMarket.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MarketDataProcess
{
// class for administrating risk factors and processors for base market
public static class BaseMarket
{
private static List<string> inputFileStreams = new List<string>();
private static MarketDataElements<RiskFactor> riskFactors = new MarketDataElements<RiskFactor>();
private static List<ElementProcessor> elementProcessors = new List<ElementProcessor>();
private static int nRiskFactors = 0;
//
public static void Process()
{
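// read all source data string streams from file into list
TextFileHandler.Read(Configurations.BaseMarketSourceDataFilePathName, inputFileStreams);
//
// extract string streams and create risk factor objects
foreach (string inputFileStream in inputFileStreams)
{
RiskFactor element = new RiskFactor();
element.Create(inputFileStream);
riskFactors.elements.Add(element);
}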
nRiskFactors = riskFactors.elements.Count;
//
// create and execute market data element processors
// finally run technical check on created risk factors
BaseMarket.createElementProcessors();
elementProcessors.ForEach(processor => processor.Process());
BaseMarket.Check();
}
public static void PrintToFile()
{
// write created base market risk factors into file
StringBuilder stream = new StringBuilder();
for (int i = 0; i < riskFactors.elements.Count; i++)
{
stream.AppendLine(String.Format("{0},{1}", riskFactors.elements[i].systemTicker, riskFactors.elements[i].value));
}
TextFileHandler.Write(Configurations.BaseMarketResultDataFilePathName, stream.ToString(), false);
}
private static void createElementProcessors()
{
// market data element processor types are defined in configuration file
List<string> dataProviderStrings = Configurations.RiskFactorDataProviders.Split(',').ToList<string>();
//
foreach (string dataProviderString in dataProviderStrings)
{
if (dataProviderString == String.Empty) throw new Exception("No element processor defined");
List<RiskFactor> elements = riskFactors.elements.Where(factor => factor.provider == dataProviderString).ToList<RiskFactor>();
if(dataProviderString.ToUpper() == "BLOOMBERG") elementProcessors.Add(new ElementProcessor(ProcessorLibrary.ProcessBloombergRiskFactors, elements));
}
}
public static MarketDataElements<RiskFactor> GetRiskFactors()
{
// create and return deep copy of all base market risk factors
return riskFactors.Clone();
}
private static void Check()
{
int nValidRiskFactors = 0;
//
// loop through all created risk factors for base market
for (int i = 0; i < riskFactors.elements.Count; i++)
{
// extract risk factor to be investigated for valid value
RiskFactor factor = riskFactors.elements[i];
//
// valid value inside risk factor should be double-typed converted to string, ex. "0.05328"
// check validity of this value with Double
// TryParse-method returning TRUE if string value can be converted to double
double value = 0;
bool isValid = Double.TryParse(factor.value, out value);
if (isValid) nValidRiskFactors++;
//
// if value is not convertable to double, get user input for this value
if (!isValid)
{
while (true)
{
Console.Write(String.Format("Provide input for {0} >", factor.systemTicker));
bool validUserInput = Double.TryParse(Console.ReadLine(), out value);
if (validUserInput)
{
factor.value = value.ToString();
nValidRiskFactors++;
break;
}
else
{
// client is forced to set (at least technically) valid value
Console.WriteLine("Invalid value, try again");
}
}
}
}
}
}
}
//
//
//
//
// Configurations.cs
using System;
using System.Configuration;

namespace MarketDataProcess
{
// static class for hosting configurations
public static class Configurations
{
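// readonly data for public sharing
public static readonly string RiskFactorDataProviders;
public static readonly string BaseMarketSourceDataFilePathName;
public static readonly string BaseMarketResultDataFilePathName;
//
// static constructor will be called just before any configuration is requested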
static Configurations()
{
// configuration strings are assigned to static class data members for easy access
RiskFactorDataProviders = ConfigurationManager.AppSettings["RiskFactorDataProviders"].ToString();
BaseMarketSourceDataFilePathName = ConfigurationManager.AppSettings["BaseMarketSourceDataFilePathName"].ToString();
BaseMarketResultDataFilePathName = ConfigurationManager.AppSettings["BaseMarketResultDataFilePathName"].ToString();
}
}
}
//
//
//
//
// MarketDataElement.cs
using System;
using System.Collections.Generic;
using System.Linq;

namespace MarketDataProcess
{
// generic template for all types of market data elements (risk factors, fixings)
public class MarketDataElements<T> where T : ICloneable, new()
{
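public List<T> elements = new List<T>();
//
// add a new element into list (the original method signature is missing, so this name is an assumption)
public void AddElement(T element)
{
elements.Add(element);
}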
public MarketDataElements<T> Clone()
{
// create a deep copy of market data elements object
MarketDataElements<T> clone = new MarketDataElements<T>();
//
// copy content for all elements into a list
List<T> elementList = new List<T>();
foreach (T element in elements)
{
elementList.Add((T)element.Clone());
}
clone.elements = elementList;
return clone;
}
}
// risk factor as market data element
public class RiskFactor : ICloneable
{
public string provider;
public string systemTicker;
public string vendorTicker;
public string vendorField;
public string divider;
public string value;
//
public RiskFactor() { }
public void Create(string stream)
{
List<string> fields = stream.Split(',').ToList<string>();
this.provider = fields[0];
this.systemTicker = fields[1];
this.vendorTicker = fields[2];
this.vendorField = fields[3];
this.divider = fields[4];
this.value = fields[5];
}
public object Clone()
{
RiskFactor clone = new RiskFactor();
clone.provider = this.provider;
clone.systemTicker = this.systemTicker;
clone.vendorTicker = this.vendorTicker;
clone.vendorField = this.vendorField;
clone.divider = this.divider;
clone.value = this.value;
return clone;
}
}
// fixing as market data element
public class Fixing : ICloneable
{
public string provider;
public string systemTicker;
public string vendorTicker;
public string vendorField;
public string frequency;
public string nYearsBack;
public string divider;
public string value;
public Dictionary<string, string> timeSeries = new Dictionary<string, string>();
//
public Fixing() { }
public void Create(string stream)
{
List<string> fields = stream.Split(',').ToList<string>();
this.provider = fields[0];
this.systemTicker = fields[1];
this.vendorTicker = fields[2];
this.vendorField = fields[3];
this.frequency = fields[4];
this.nYearsBack = fields[5];
this.divider = fields[6];
this.value = fields[7];
}
public object Clone()
{
Fixing clone = new Fixing();
clone.provider = this.provider;
clone.systemTicker = this.systemTicker;
clone.vendorTicker = this.vendorTicker;
clone.vendorField = this.vendorField;
clone.frequency = this.frequency;
clone.nYearsBack = this.nYearsBack;
clone.divider = this.divider;
clone.value = this.value;
//
// create deep copy of timeseries dictionary
Dictionary<string, string> timeSeriesClone = new Dictionary<string, string>();
foreach (KeyValuePair<string, string> kvp in this.timeSeries)
{
timeSeriesClone.Add(kvp.Key, kvp.Value);
}
clone.timeSeries = timeSeriesClone;
return clone;
}
}
}
//
//
//
//
// ElementProcessor.cs
using System;
using System.Collections.Generic;

namespace MarketDataProcess
{
// algorithm for creating market data element objects
public delegate void Processor(dynamic marketDataElements);
//
// class for hosting data and algorithm
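public class ElementProcessor
{
private Processor processor;
dynamic marketDataElements;
//
public ElementProcessor(Processor processor, dynamic marketDataElements)
{
this.processor = processor;
this.marketDataElements = marketDataElements;
}
public void Process()
{
// apply the paired algorithm to the paired data
processor.Invoke(marketDataElements);
}
}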
}
//
//
//
//
// ProcessorLibrary.cs
using System;
using System.Collections.Generic;
using System.Linq;

namespace MarketDataProcess
{
public static class ProcessorLibrary
{
public static void ProcessBloombergRiskFactors(dynamic marketDataElements)
{
List<RiskFactor> riskFactors = (List<RiskFactor>)marketDataElements;
BBCOMMWrapper.BBCOMMDataRequest request;
dynamic[, ,] result = null;
string SYSTEM_DOUBLE = "System.Double";
int counter = 0;
//
// group data list into N lists grouped by distinct bloomberg field names
var dataGroups = riskFactors.GroupBy(factor => factor.vendorField);
//
// process each group of distinct bloomberg field names
for (int i = 0; i < dataGroups.Count(); i++)
{
// extract group, create data structures for securities and fields
List<RiskFactor> dataGroup = dataGroups.ElementAt(i).ToList<RiskFactor>();
List<string> securities = new List<string>();
List<string> fields = new List<string>() { dataGroup[0].vendorField };
//
// import securities into data structure fed to bloomberg api
for (int j = 0; j < dataGroup.Count; j++)
{
securities.Add(dataGroup[j].vendorTicker);
}
//
// create and use request object to retrieve bloomberg data
request = new BBCOMMWrapper.ReferenceDataRequest(securities, fields);
result = request.ProcessData();
//
// write retrieved bloomberg data into risk factor group data
for (int k = 0; k < result.GetLength(0); k++)
{
string stringValue;
dynamic value = result[k, 0, 0];
//
if (value.GetType().ToString() == SYSTEM_DOUBLE)
{
// handle output value with divider only if retrieved value is double
// this means that data retrieval has been successful
double divider = Convert.ToDouble(dataGroup[k].divider);
stringValue = (value / divider).ToString();
dataGroup[k].value = stringValue;
counter++;
}
else
{
// write non-double value (bloomberg will retrieve #N/A) if retrieved value is not double
stringValue = value.ToString();
dataGroup[k].value = stringValue;
}
}
}
}
}
}
//
//
//
//
// DummyBBCOMMWrapper.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace BBCOMMWrapper
{
// abstract base class for data request
public abstract class BBCOMMDataRequest
{
// input data structures
protected List<string> securityNames = new List<string>();
protected List<string> fieldNames = new List<string>();
//
// output result data structure
protected dynamic[, ,] result;
//
public dynamic[, ,] ProcessData()
{
// instead of the actual Bloomberg market data, random numbers are going to be generated
Random randomGenerator = new Random(Math.Abs(Guid.NewGuid().GetHashCode()));
//
for (int i = 0; i < securityNames.Count; i++)
{
for (int j = 0; j < fieldNames.Count; j++)
{
result[i, 0, j] = randomGenerator.NextDouble();
}
}
return result;
}
}
//
// concrete class implementation for processing reference data request
public class ReferenceDataRequest : BBCOMMDataRequest
{
public ReferenceDataRequest(List<string> bloombergSecurityNames,
List<string> bloombergFieldNames)
{
securityNames = bloombergSecurityNames;
fieldNames = bloombergFieldNames;
result = new dynamic[securityNames.Count, 1, fieldNames.Count];
}
}
}
//
//
//
//
// TextFileHandler.cs
using System;
using System.Collections.Generic;
using System.IO;

namespace MarketDataProcess
{
public static class TextFileHandler
{
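public static void Read(string filePathName, List<string> output)
{
// read file content into list
StreamReader reader = new StreamReader(filePathName);
while (!reader.EndOfStream)
{
output.Add(reader.ReadLine());
}
reader.Close();
}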
public static void Write(string filePathName, List<string> input, bool append)
{
// write text stream list to file
StreamWriter writer = new StreamWriter(filePathName, append);
input.ForEach(it => writer.WriteLine(it));
writer.Close();
}
public static void Write(string filePathName, string input, bool append)
{
// write bulk text stream to file
StreamWriter writer = new StreamWriter(filePathName, append);
writer.Write(input);
writer.Close();
}
}
}
```