1

Exploring F# in Excel

 2 years ago
source link: https://www.primacy.co/post/exploring-fsharp-in-excel
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Exploring F# in Excel

ExcelDotnet is an experimental tool which allows UDFs to be created in F# with almost the same experience as developing in VBA. Learn about getting all the power of .NET with F# in Excel.

Visual Basic for Applications (VBA), the name is enough to strike fear into the hearts of developers the world over. It's slow, error prone, and lacks many of the features developers have taken for granted for over a decade. Yet, it remains the GoTo language for countless end users the world over. Why? Seamless integration with Microsoft's Office Suite and one-click access to a scripting environment which offers complete control over your documents as well as the ability to create User Defined Functions (UDFs) to enhance your Excel spreadsheets.

VBA has stood the test of time against alternative extension technologies created and then abandoned by Microsoft.

A better way?

Microsoft themselves have offered a way to write succinct, robust, and performant code since 2005 but haven't taken the initiative to offer a true replacement to VBA in this time.

Enter ExcelDotnet (working title) an experimental tool which I developed to allow UDFs to be created in F# with almost the same experience as developing in VBA. It allows you to write F# scripts that are dynamically compiled and execute as standard UDFs from Excel. All native Excel data types are supported without excessive marshalling ceremony from your script.

CSV spreadsheet
#r "nuget: FSharp.Data, 4.2.7"

[<UDF>]
let CsvFile (path:string) =
    let csv = FSharp.Data.CsvFile.Load(path, hasHeaders = false).Cache()
    let rows = csv.Rows |> Seq.length
    let cols = csv.NumberOfColumns

    Array2D.init rows cols (fun r c ->
        let row = Seq.item r csv.Rows
        let v = row.[c]
        XLValue12.Parse v
    )

One of Excel's oldest extension technologies is their C API, documented in the Excel Software Development Kit, still maintained to this day and the only suitable technology for high performance UDFs. This was my target for integration with F#.

Native code in F#

But F# is a .NET language! You may protest, it can't be used against a raw C API. .NET has seen several significant enhancements to improve the story in this regard, but there are key challenges to be overcome:

Interoperable data types

Excel's key data type over the C API is the xloper12 (since 2007) as defined in the XLCALL.H file.

typedef struct xloper12
{
    union
    {
        double num;                           /* xltypeNum */
        XCHAR *str;                           /* xltypeStr */
        BOOL xbool;                           /* xltypeBool */
        int err;                           /* xltypeErr */
        int w;
        struct
        {
            WORD count;                       /* always = 1 */
            XLREF12 ref;
        } sref;                        /* xltypeSRef */
        struct
        {
            XLMREF12 *lpmref;
            IDSHEET idSheet;
        } mref;                        /* xltypeRef */
        struct
        {
            struct xloper12 *lparray;
            RW rows;
            COL columns;
        } array;                    /* xltypeMulti */
        struct
        {
            union
            {
                int level;            /* xlflowRestart */
                int tbctrl;            /* xlflowPause */
                IDSHEET idSheet;        /* xlflowGoto */
            } valflow;
            RW rw;                           /* xlflowGoto */
            COL col;                       /* xlflowGoto */
            BYTE xlflow;
        } flow;                        /* xltypeFlow */
        struct
        {
            union
            {
                BYTE *lpbData;            /* data passed to XL */
                HANDLE hdata;            /* data returned from XL */
            } h;
            long cbData;
        } bigdata;                    /* xltypeBigData */
    } val;
    DWORD xltype;
} XLOPER12, *LPXLOPER12;

This definition makes extensive use of the `union` keyword which should not be confused with the Discriminated Unions of F#. Instead, `union` means that multiple names and multiple data types can occupy the same memory address.

This can be achieved in F# with extensive use of the `[<StructLayout(LayoutKind.Explicit)>]` and `[<FieldOffset(0)>]` attributes to define all the structs we need and be on our merry way. However now we are left with very unsafe data types at the heart of our lovely F# code.

[<StructLayout(LayoutKind.Explicit)>]
type xlOper12Val =
    struct
        [<FieldOffset(0)>] val mutable num:float
        /// wchar_t* string. First char is length.
        [<FieldOffset(0)>] val mutable str:nativeint
        [<FieldOffset(0)>] val mutable xbool:int
        [<FieldOffset(0)>] val mutable err:xlErrType
        [<FieldOffset(0)>] val mutable w:int
        [<FieldOffset(0)>] val mutable sref:SRef
        [<FieldOffset(0)>] val mutable mref:MRef
        [<FieldOffset(0)>] val mutable arr:xlOperArray
        [<FieldOffset(0)>] val mutable flow:Flow
        [<FieldOffset(0)>] val mutable bigData:BigData
    end

[<StructLayout(LayoutKind.Sequential)>]
type xlOper12 =
    struct
        val mutable value:xlOper12Val
        val mutable xlType:xlOper12Type
    end

F# of course offers the solution in the form of Active Patterns. With Active Patterns we can hide the implementation of the unsafe types behind the familiar construct of Pattern Matching

let (|XlNum|_|) (oper:xlOper12) =
    match oper.xlType &&& xlOper12Type.xltypeNum with
    | xlOper12Type.xltypeNum -> Some oper.value.num
    | _ -> None

let (|XlBool|_|) (oper:xlOper12) =
    match oper.xlType &&& xlOper12Type.xltypeBool with
    | xlOper12Type.xltypeBool -> Some (oper.value.xbool > 0)
    | _ -> None

let (|XlInt|_|) (oper:xlOper12) =
    match oper.xlType &&& (xlOper12Type.xltypeBigData) with
    | xlOper12Type.xltypeBigData -> None
    | _ ->
        match oper.xlType &&& xlOper12Type.xltypeInt with
        | xlOper12Type.xltypeInt -> Some oper.value.w
        | _ -> None

let useOper (x:xlOper12) =
    match x with
    | XlNum n -> box n
    | XlBool b -> box b
    | XlInt i -> box i

From the end-user perspective they are no different to any other F# Discriminated Union.

Handling strings

XLCALL.H defines the type `*XCHAR` to be used for strings. This type uses a 16 bit wide Unicode `wchar_t` with the first character encoding the length of the string. This maps nicely to a .NET `char`. We can use `ReadOnlySpan<char>` to provide a type and memory safe way to access strings returned from Excel.

module internal XCHAR =
    let ofPtr (p:nativeptr<char>) =
        let length = int (NativePtr.read p)
        if length = 0 then
            new ReadOnlySpan<char>([||])
        else
            let vPtr =
                NativePtr.add p 1
                |> NativePtr.toVoidPtr
            new ReadOnlySpan<char>(vPtr, length)

Communicating with Excel

The Excel SDK defines a handful of key interface functions that must be exported by your dll as natively callable functions.

Since .NET 5 introduced the UnmanagedCallersOnlyAttribute, Aaron Robinson has provided the wonderful project DNNE over on GitHub. This allows the definition of natively callable functions within F# or any .NET language using only attribute annotations.

[<C99DeclCode("#include <xlcall.h>")>]
[<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoOpen")>]
let xlAutoOpen():int =
    // Initialisation code
    1

[<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoFree12")>]
let xlAutoFree12([<C99Type("LPXLOPER12")>] pxFree:nativeint):unit =
    // Free .NET memory used by Excel
    ()

[<UnmanagedCallersOnly(CallConvs = [| typeof<CallConvStdcall> |], EntryPoint = "xlAutoClose")>]
let xlAutoClose():int =
    // Cleanup code
    1

Excel also defines a single callback to allow sending data back to the process. A sample implementation of obtaining this function is provided in the file XLCALL.CPP and this can be readily adapted for use from .NET using the UnmanagedFunctionPointerAttribute.

[<DllImport("kernel32.dll")>]
extern nativeint private GetModuleHandle(string moduleName);

[<DllImport("kernel32.dll")>]
extern nativeint private GetProcAddress(nativeint hModule, string procedureName);

[<UnmanagedFunctionPointer(CallingConvention.StdCall)>]
type Excel12vDelegate = delegate of int * int * nativeptr<nativeptr<xlOper12>> * nativeptr<xlOper12> -> xlRetCode

module ExcelInterop =
    let excel12vCallback = "MdCallBack12"

    let tryGetExcelDelegate() =
        let h = GetModuleHandle(null)
        let ptrExcel12v = GetProcAddress(h, excel12vCallback);

        if ptrExcel12v <> IntPtr.Zero then
            Some (Marshal.GetDelegateForFunctionPointer<Excel12vDelegate>(ptrExcel12v))
        else
            None

With these key points covered we now have the skeleton to enable native interoperability with Excel's C API.

Defining UDFs

In VBA, UDFs are created for any publicly available `Function` in the code. This won't do in F#. We want to write lots of functions and only make some of them available to Excel. To achieve this I added a `UDFAttribute` to let our add-in know that only these functions should be exposed to excel and provide some extra details to let Excel know how to run the function and provide additional help to the caller.

[<Sealed; AttributeUsage(validOn = AttributeTargets.Method); AllowNullLiteral>]
type UDFAttribute() =
    inherit Attribute()
    member val FunctionName = "" with get, set
    member val ArgumentText = "" with get, set
    member val MacroType = WorksheetFunction with get, set
    member val Category = "" with get, set
    member val CommandShortcut:char option = None with get, set
    member val HelpTopic:string option = None with get, set
    member val FunctionHelp:string option = None with get, set
    member val ThreadSafe = false with get, set

[<Sealed; AttributeUsage(validOn = AttributeTargets.Parameter); AllowNullLiteral>]
type ArgumentHelpAttribute(helpText:string) =
    inherit Attribute()
    member val HelpText = helpText

With the help of these attributes and some reflection code we can expose almost any let-bound function to Excel.

[<UDF>]
let AddTwo ([<ArgumentHelp("The first number to add")>] a:float) ([<ArgumentHelp("And the second")>] b) = a + b

A motivating example

Spreadsheet based calculations are ubiquitous across engineering disciplines. Effectively forming a simplified UI over a reference calculation. Spreadsheet formulae are notoriously difficult to read and identify numerical and logical errors. The new `LET()` function allows you to assign names to values which somewhat improves the situation but not much compared to reading code in an editor.

To this end, I reworked an old control valve sizing spreadsheet to perform its calculations from F#. Here's the result:

The code for the `calcCv()` function is kept in a single F# script file which is embedded with the workbook. This particular problem is iterative and Goal Seek is used to find the solution in the same manner as any native Excel formula.

#I @"C:\Path\To\ExcelDotnet"
#r "ExcelDotnet.UDFAttributes.dll"
#r "ExcelDotnet.dll"

open ExcelDotnet

// Calculation based on formulas provided in Masoneilan Control Valve Sizing Handbook (2004)

// Page 5
let n2 = 0.00214
// Page 16
let n5 = 0.00241
// Page 15
let n6 = 2.73

/// Liquid critical pressure factor. Page 5
let liquidCritPresFactor liqVapPres liqCritPres =
    0.96 - 0.28 * sqrt (liqVapPres / liqCritPres)

/// Combined Liquid Pressure Recovery Factor. Page 6
let FLP FL usPressureChanges valveDia Cv =
    FL / (sqrt ((usPressureChanges * FL * FL * Cv * Cv) / (n2 * valveDia ** 4.) + 1.))

/// Reducer corrected Choked Flow (Gas and Vapor). Page 16
let xTP xT Fp usPressureChanges Cv valveDia =
    xT / (Fp * Fp) / ((xT * usPressureChanges * Cv * Cv) / (n5 * valveDia ** 4.) + 1.)

/// Inlet Loss Coefficient. Page 11
let inletReducerK1 valveDia usPipeDia =
    0.5 * ((1. - (valveDia / usPipeDia) ** 2.) ** 2.)

/// Outlet Loss Coefficient. Page 11
let outletReducerK2 valveDia dsPipeDia =
    (1. - (valveDia / dsPipeDia) ** 2.) ** 2.

/// Inlet Bernoulli Coefficient. Page 11
let bernoulliInletReducer valveDia usPipeDia =
    1. - (valveDia / usPipeDia) ** 4.
/// Outlet Bernoulli Coefficient. Page 11
let bernoulliOutletReducer valveDia dsPipeDia =
    1. - (valveDia / dsPipeDia) ** 4.

/// Piping Geometry Factor. Page 11
let Fp Cv valveDia sumK =
    ((Cv * Cv * sumK) / (n2 * valveDia ** 4.) + 1.) ** -0.5

/// Gas expansion factor. Page 14
let gasExpFactor pDropRatio Fk xTP =
    1. - pDropRatio / (3. * Fk * xTP)

/// Liquid maximum pressure drop. Page 15
let deltaPfMax (upstreamPres:float) liqVapPres FF FLP =
    FLP * FLP * (upstreamPres - FF * liqVapPres)

/// Vapour maximum pressure drop. Page 15
let deltaPgMax (upstreamPres:float) Fk xTP =
    upstreamPres * Fk * xTP

/// Two-Phase Flow Cv Factor. Page 15
let calcCv2 massFlow Fp liqMassFrac dPf dPg gasExpFactor upstreamLiqDensity upstreamVapDensity =
    let vapMassFrac = 1. - liqMassFrac
    let y = gasExpFactor
    massFlow / (n6 * Fp) * sqrt (liqMassFrac / (dPf * upstreamLiqDensity) + vapMassFrac / (dPg * upstreamVapDensity * y * y))
[<UDF()>]
let calcCv
    ([<ArgumentHelp("Mass Flow (kg/h)")>] massFlow)
    ([<ArgumentHelp("Upstream Pressure (kPaa)")>] upstreamPres)
    ([<ArgumentHelp("Liquid Vapour Pressure (kPaa)")>] liqVapPres)
    ([<ArgumentHelp("Pressure Drop (kPa)")>] pressureDrop)
    ([<ArgumentHelp("Liquid Mass Fraction (kg/kg)")>] liqMassFrac)
    ([<ArgumentHelp("Upstream Liquid Density (kg/m³)")>] upstreamLiqDensity)
    ([<ArgumentHelp("Upstream Vapour Density (kg/m³)")>] upstreamVapDensity)
    ([<ArgumentHelp("Liquid Critical Pressure (kPaa)")>] liqCritPres)
    ([<ArgumentHelp("Specific Heat Ratio, Cp/Cv")>] vapCpCv)
    ([<ArgumentHelp("Valve Diameter (mm)")>] valveDia)
    ([<ArgumentHelp("Upstream Pipe Diameter (mm)")>] usPipeDia)
    ([<ArgumentHelp("Downstream Pipe Diameter (mm)")>] dsPipeDia)
    ([<ArgumentHelp("Valve Liquid Pressure Recovery Factor, FL")>] FL)
    ([<ArgumentHelp("Valve Vapour Critical Flow Factor, xT")>] xT)
    ([<ArgumentHelp("Valve Flow Coefficient, Cv")>] cvGuess) =

    let k1 = inletReducerK1 valveDia usPipeDia
    let k2 = outletReducerK2 valveDia dsPipeDia
    let kB1 = bernoulliInletReducer valveDia usPipeDia
    let kB2 = bernoulliOutletReducer valveDia dsPipeDia

    // Effect of Pipe Reducers. Page 11
    let sumK = k1 + k2 + kB1 - kB2

    // Page 6
    let usPressureChanges = k1 + kB1

    //Ratio of specific heats factor. Page 14
    let Fk = vapCpCv / 1.4
    let Fp = Fp cvGuess valveDia sumK
    let xTP = xTP xT Fp usPressureChanges cvGuess valveDia

    let FLP = FLP FL usPressureChanges valveDia cvGuess
    let FF = liquidCritPresFactor liqVapPres liqCritPres

    let dPfMax = deltaPfMax upstreamPres liqVapPres FF FLP
    let dPf = min pressureDrop dPfMax

    let dPgMax = deltaPgMax upstreamPres Fk xTP
    let dPg = min pressureDrop dPgMax

    // Pressure drop ratio. Page 13
    let pDropRatio = dPg / upstreamPres
    let gasExpFactor = gasExpFactor pDropRatio Fk xTP

    calcCv2 massFlow Fp liqMassFrac dPf dPg gasExpFactor upstreamLiqDensity upstreamVapDensity

The result is well-referenced and easy to read and verify for even a novice developer. Numerical checks can also be made using the REPL / `dotnet fsi`.

Dynamic compilation of F# scripts

This part of the project turned out to be extremely straight forward. Simply add a reference to FSharp.Compiler.Service and with the following code you can emit dynamic assemblies ready for consumption.

module Checker =

    let private checker = FSharpChecker.Create()

    // https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/compiler-options
    let private compilerArgs (scriptPath:string) =
        [|
            "-a"
            scriptPath
            "--targetprofile:netcore"
            "--target:module"
        |]

    let tryCompile scriptPath =
        async {
            let compilerArgs = compilerArgs scriptPath
            let! errors, retCode, maybeAssembly =
                checker.CompileToDynamicAssembly(compilerArgs, None)

            return
                match maybeAssembly with
                | Some a ->
                    Ok (a, errors)
                | None ->
                    Error (retCode, errors)
        }

That's it! Full support for all of F#. NuGet packages work immediately using `#r "nuget:"` directives and now you have all the power of .NET in Excel.

Since starting this project I have now joined Primacy full time working with F# and other interesting technologies. If this kind of work appeals to you then you may also be interested in joining the team over on our jobs page.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK