Exploring F# in Excel
source link: https://www.primacy.co/post/exploring-fsharp-in-excel
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Exploring F# in Excel
ExcelDotnet is an experimental tool which allows UDFs to be created in F# with almost the same experience as developing in VBA. Learn about getting all the power of .NET with F# in Excel.
Visual Basic for Applications (VBA), the name is enough to strike fear into the hearts of developers the world over. It's slow, error prone, and lacks many of the features developers have taken for granted for over a decade. Yet, it remains the GoTo language for countless end users the world over. Why? Seamless integration with Microsoft's Office Suite and one-click access to a scripting environment which offers complete control over your documents as well as the ability to create User Defined Functions (UDFs) to enhance your Excel spreadsheets.
VBA has stood the test of time against alternative extension technologies created and then abandoned by Microsoft.
A better way?
Microsoft themselves have offered a way to write succinct, robust, and performant code since 2005 but haven't taken the initiative to offer a true replacement to VBA in this time.
Enter ExcelDotnet (working title) an experimental tool which I developed to allow UDFs to be created in F# with almost the same experience as developing in VBA. It allows you to write F# scripts that are dynamically compiled and execute as standard UDFs from Excel. All native Excel data types are supported without excessive marshalling ceremony from your script.
#r "nuget: FSharp.Data, 4.2.7" [<UDF>] let CsvFile (path:string) = let csv = FSharp.Data.CsvFile.Load(path, hasHeaders = false).Cache() let rows = csv.Rows |> Seq.length let cols = csv.NumberOfColumns Array2D.init rows cols (fun r c -> let row = Seq.item r csv.Rows let v = row.[c] XLValue12.Parse v )
One of Excel's oldest extension technologies is their C API, documented in the Excel Software Development Kit, still maintained to this day and the only suitable technology for high performance UDFs. This was my target for integration with F#.
Native code in F#
But F# is a .NET language! You may protest, it can't be used against a raw C API. .NET has seen several significant enhancements to improve the story in this regard, but there are key challenges to be overcome:
Interoperable data types
Excel's key data type over the C API is the xloper12 (since 2007) as defined in the XLCALL.H file.
typedef struct xloper12 { union { double num; /* xltypeNum */ XCHAR *str; /* xltypeStr */ BOOL xbool; /* xltypeBool */ int err; /* xltypeErr */ int w; struct { WORD count; /* always = 1 */ XLREF12 ref; } sref; /* xltypeSRef */ struct { XLMREF12 *lpmref; IDSHEET idSheet; } mref; /* xltypeRef */ struct { struct xloper12 *lparray; RW rows; COL columns; } array; /* xltypeMulti */ struct { union { int level; /* xlflowRestart */ int tbctrl; /* xlflowPause */ IDSHEET idSheet; /* xlflowGoto */ } valflow; RW rw; /* xlflowGoto */ COL col; /* xlflowGoto */ BYTE xlflow; } flow; /* xltypeFlow */ struct { union { BYTE *lpbData; /* data passed to XL */ HANDLE hdata; /* data returned from XL */ } h; long cbData; } bigdata; /* xltypeBigData */ } val; DWORD xltype; } XLOPER12, *LPXLOPER12;
This definition makes extensive use of the `union` keyword which should not be confused with the Discriminated Unions of F#. Instead, `union` means that multiple names and multiple data types can occupy the same memory address.
This can be achieved in F# with extensive use of the `[<StructLayout(LayoutKind.Explicit)>]` and `[<FieldOffset(0)>]` attributes to define all the structs we need and be on our merry way. However now we are left with very unsafe data types at the heart of our lovely F# code.
[<StructLayout(LayoutKind.Explicit)>] type xlOper12Val = struct [<FieldOffset(0)>] val mutable num:float /// wchar_t* string. First char is length. [<FieldOffset(0)>] val mutable str:nativeint [<FieldOffset(0)>] val mutable xbool:int [<FieldOffset(0)>] val mutable err:xlErrType [<FieldOffset(0)>] val mutable w:int [<FieldOffset(0)>] val mutable sref:SRef [<FieldOffset(0)>] val mutable mref:MRef [<FieldOffset(0)>] val mutable arr:xlOperArray [<FieldOffset(0)>] val mutable flow:Flow [<FieldOffset(0)>] val mutable bigData:BigData end [<StructLayout(LayoutKind.Sequential)>] type xlOper12 = struct val mutable value:xlOper12Val val mutable xlType:xlOper12Type end
F# of course offers the solution in the form of Active Patterns. With Active Patterns we can hide the implementation of the unsafe types behind the familiar construct of Pattern Matching
let (|XlNum|_|) (oper:xlOper12) = match oper.xlType &&& xlOper12Type.xltypeNum with | xlOper12Type.xltypeNum -> Some oper.value.num | _ -> None let (|XlBool|_|) (oper:xlOper12) = match oper.xlType &&& xlOper12Type.xltypeBool with | xlOper12Type.xltypeBool -> Some (oper.value.xbool > 0) | _ -> None let (|XlInt|_|) (oper:xlOper12) = match oper.xlType &&& (xlOper12Type.xltypeBigData) with | xlOper12Type.xltypeBigData -> None | _ -> match oper.xlType &&& xlOper12Type.xltypeInt with | xlOper12Type.xltypeInt -> Some oper.value.w | _ -> None let useOper (x:xlOper12) = match x with | XlNum n -> box n | XlBool b -> box b | XlInt i -> box i
From the end-user perspective they are no different to any other F# Discriminated Union.
Handling strings
XLCALL.H defines the type `*XCHAR` to be used for strings. This type uses a 16 bit wide Unicode `wchar_t` with the first character encoding the length of the string. This maps nicely to a .NET `char`. We can use `ReadOnlySpan<char>` to provide a type and memory safe way to access strings returned from Excel.
module internal XCHAR = let ofPtr (p:nativeptr<char>) = let length = int (NativePtr.read p) if length = 0 then new ReadOnlySpan<char>([||]) else let vPtr = NativePtr.add p 1 |> NativePtr.toVoidPtr new ReadOnlySpan<char>(vPtr, length)
Communicating with Excel
The Excel SDK defines a handful of key interface functions that must be exported by your dll as natively callable functions.
Since .NET 5 introduced the UnmanagedCallersOnlyAttribute, Aaron Robinson has provided the wonderful project DNNE over on GitHub. This allows the definition of natively callable functions within F# or any .NET language using only attribute annotations.
[<C99DeclCode("#include <xlcall.h>")>] [<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoOpen")>] let xlAutoOpen():int = // Initialisation code 1 [<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoFree12")>] let xlAutoFree12([<C99Type("LPXLOPER12")>] pxFree:nativeint):unit = // Free .NET memory used by Excel () [<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoClose")>] let xlAutoClose():int = // Cleanup code 1
Excel also defines a single callback to allow sending data back to the process. A sample implementation of obtaining this function is provided in the file XLCALL.CPP and this can be readily adapted for use from .NET using the UnmanagedFunctionPointerAttribute.
[<DllImport("kernel32.dll")>] extern nativeint private GetModuleHandle(string moduleName); [<DllImport("kernel32.dll")>] extern nativeint private GetProcAddress(nativeint hModule, string procedureName); [<UnmanagedFunctionPointer(CallingConvention.StdCall)>] type Excel12vDelegate = delegate of int * int * nativeptr<nativeptr<xlOper12>> * nativeptr<xlOper12> -> xlRetCode module ExcelInterop = let excel12vCallback = "MdCallBack12" let tryGetExcelDelegate() = let h = GetModuleHandle(null) let ptrExcel12v = GetProcAddress(h, excel12vCallback); if ptrExcel12v <> IntPtr.Zero then Some (Marshal.GetDelegateForFunctionPointer<Excel12vDelegate>(ptrExcel12v)) else None
With these key points covered we now have the skeleton to enable native interoperability with Excel's C API.
Defining UDFs
In VBA, UDFs are created for any publicly available `Function` in the code. This won't do in F#. We want to write lots of functions and only make some of them available to Excel. To achieve this I added a `UDFAttribute` to let our add-in know that only these functions should be exposed to excel and provide some extra details to let Excel know how to run the function and provide additional help to the caller.
[<Sealed; AttributeUsage(validOn = AttributeTargets.Method); AllowNullLiteral>] type UDFAttribute() = inherit Attribute() member val FunctionName = "" with get, set member val ArgumentText = "" with get, set member val MacroType = WorksheetFunction with get, set member val Category = "" with get, set member val CommandShortcut:char option = None with get, set member val HelpTopic:string option = None with get, set member val FunctionHelp:string option = None with get, set member val ThreadSafe = false with get, set [<Sealed; AttributeUsage(validOn = AttributeTargets.Parameter); AllowNullLiteral>] type ArgumentHelpAttribute(helpText:string) = inherit Attribute() member val HelpText = helpText
With the help of these attributes and some reflection code we can expose almost any let-bound function to Excel.
[<UDF>] let AddTwo ([<ArgumentHelp("The first number to add")>] a:float) ([<ArgumentHelp("And the second")>] b) = a + b
A motivating example
Spreadsheet based calculations are ubiquitous across engineering disciplines. Effectively forming a simplified UI over a reference calculation. Spreadsheet formulae are notoriously difficult to read and identify numerical and logical errors. The new `LET()` function allows you to assign names to values which somewhat improves the situation but not much compared to reading code in an editor.
To this end, I reworked an old control valve sizing spreadsheet to perform its calculations from F#. Here's the result:
The code for the `calcCv()` function is kept in a single F# script file which is embedded with the workbook. This particular problem is iterative and Goal Seek is used to find the solution in the same manner as any native Excel formula.
#I @"C:\Path\To\ExcelDotnet" #r "ExcelDotnet.UDFAttributes.dll" #r "ExcelDotnet.dll" open ExcelDotnet // Calculation based on formulas provided in Masoneilan Control Valve Sizing Handbook (2004) // Page 5 let n2 = 0.00214 // Page 16 let n5 = 0.00241 // Page 15 let n6 = 2.73 /// Liquid critical pressure factor. Page 5 let liquidCritPresFactor liqVapPres liqCritPres = 0.96 - 0.28 * sqrt (liqVapPres / liqCritPres) /// Combined Liquid Pressure Recovery Factor. Page 6 let FLP FL usPressureChanges valveDia Cv = FL / (sqrt ((usPressureChanges * FL * FL * Cv * Cv) / (n2 * valveDia ** 4.) + 1.)) /// Reducer corrected Choked Flow (Gas and Vapor). Page 16 let xTP xT Fp usPressureChanges Cv valveDia = xT / (Fp * Fp) / ((xT * usPressureChanges * Cv * Cv) / (n5 * valveDia ** 4.) + 1.) /// Inlet Loss Coefficient. Page 11 let inletReducerK1 valveDia usPipeDia = 0.5 * ((1. - (valveDia / usPipeDia) ** 2.) ** 2.) /// Outlet Loss Coefficient. Page 11 let outletReducerK2 valveDia dsPipeDia = (1. - (valveDia / dsPipeDia) ** 2.) ** 2. /// Inlet Bernoulli Coefficient. Page 11 let bernoulliInletReducer valveDia usPipeDia = 1. - (valveDia / usPipeDia) ** 4.
/// Outlet Bernoulli Coefficient. Page 11 let bernoulliOutletReducer valveDia dsPipeDia = 1. - (valveDia / dsPipeDia) ** 4. /// Piping Geometry Factor. Page 11 let Fp Cv valveDia sumK = ((Cv * Cv * sumK) / (n2 * valveDia ** 4.) + 1.) ** -0.5 /// Gas expansion factor. Page 14 let gasExpFactor pDropRatio Fk xTP = 1. - pDropRatio / (3. * Fk * xTP) /// Liquid maximum pressure drop. Page 15 let deltaPfMax (upstreamPres:float) liqVapPres FF FLP = FLP * FLP * (upstreamPres - FF * liqVapPres) /// Vapour maximum pressure drop. Page 15 let deltaPgMax (upstreamPres:float) Fk xTP = upstreamPres * Fk * xTP /// Two-Phase Flow Cv Factor. Page 15 let calcCv2 massFlow Fp liqMassFrac dPf dPg gasExpFactor upstreamLiqDensity upstreamVapDensity = let vapMassFrac = 1. - liqMassFrac let y = gasExpFactor massFlow / (n6 * Fp) * sqrt (liqMassFrac / (dPf * upstreamLiqDensity) + vapMassFrac / (dPg * upstreamVapDensity * y * y))
[<UDF()>] let calcCv ([<ArgumentHelp("Mass Flow (kg/h)")>] massFlow) ([<ArgumentHelp("Upstream Pressure (kPaa)")>] upstreamPres) ([<ArgumentHelp("Liquid Vapour Pressure (kPaa)")>] liqVapPres) ([<ArgumentHelp("Pressure Drop (kPa)")>] pressureDrop) ([<ArgumentHelp("Liquid Mass Fraction (kg/kg)")>] liqMassFrac) ([<ArgumentHelp("Upstream Liquid Density (kg/m³)")>] upstreamLiqDensity) ([<ArgumentHelp("Upstream Vapour Density (kg/m³)")>] upstreamVapDensity) ([<ArgumentHelp("Liquid Critical Pressure (kPaa)")>] liqCritPres) ([<ArgumentHelp("Specific Heat Ratio, Cp/Cv")>] vapCpCv) ([<ArgumentHelp("Valve Diameter (mm)")>] valveDia) ([<ArgumentHelp("Upstream Pipe Diameter (mm)")>] usPipeDia) ([<ArgumentHelp("Downstream Pipe Diameter (mm)")>] dsPipeDia) ([<ArgumentHelp("Valve Liquid Pressure Recovery Factor, FL")>] FL) ([<ArgumentHelp("Valve Vapour Critical Flow Factor, xT")>] xT) ([<ArgumentHelp("Valve Flow Coefficient, Cv")>] cvGuess) = let k1 = inletReducerK1 valveDia usPipeDia let k2 = outletReducerK2 valveDia dsPipeDia let kB1 = bernoulliInletReducer valveDia usPipeDia let kB2 = bernoulliOutletReducer valveDia dsPipeDia // Effect of Pipe Reducers. Page 11 let sumK = k1 + k2 + kB1 - kB2 // Page 6 let usPressureChanges = k1 + kB1 //Ratio of specific heats factor. Page 14 let Fk = vapCpCv / 1.4 let Fp = Fp cvGuess valveDia sumK let xTP = xTP xT Fp usPressureChanges cvGuess valveDia let FLP = FLP FL usPressureChanges valveDia cvGuess let FF = liquidCritPresFactor liqVapPres liqCritPres let dPfMax = deltaPfMax upstreamPres liqVapPres FF FLP let dPf = min pressureDrop dPfMax let dPgMax = deltaPgMax upstreamPres Fk xTP let dPg = min pressureDrop dPgMax // Pressure drop ratio. Page 13 let pDropRatio = dPg / upstreamPres let gasExpFactor = gasExpFactor pDropRatio Fk xTP calcCv2 massFlow Fp liqMassFrac dPf dPg gasExpFactor upstreamLiqDensity upstreamVapDensity
The result is well-referenced and easy to read and verify for even a novice developer. Numerical checks can also be made using the REPL / `dotnet fsi`.
Dynamic compilation of F# scripts
This part of the project turned out to be extremely straight forward. Simply add a reference to FSharp.Compiler.Service and with the following code you can emit dynamic assemblies ready for consumption.
module Checker = let private checker = FSharpChecker.Create() // https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/compiler-options let private compilerArgs (scriptPath:string) = [| "-a" scriptPath "--targetprofile:netcore" "--target:module" |] let tryCompile scriptPath = async { let compilerArgs = compilerArgs scriptPath let! errors, retCode, maybeAssembly = checker.CompileToDynamicAssembly(compilerArgs, None) return match maybeAssembly with | Some a -> Ok (a, errors) | None -> Error (retCode, errors) }
That's it! Full support for all of F#. NuGet packages work immediately using `#r "nuget:"` directives and now you have all the power of .NET in Excel.
Since starting this project I have now joined Primacy full time working with F# and other interesting technologies. If this kind of work appeals to you then you may also be interested in joining the team over on our jobs page.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK