How Steampipe enables KPIs as code

CMD Solutions built tools for continuous controls assurance by using open-source Steampipe to define information security performance metrics as SQL statements.

InfoWorld | Dec 21, 2022 3:00 am PST

PublicDomainPictures (CC0)

Ciaran Finnegan is the cybersecurity practice lead at CMD Solutions Australia and Phil Massyn is a senior security consultant there. About a year ago they began using Steampipe and its CrowdStrike plugin to scan their customers’ AWS environments.

Now Finnegan and Massyn are building an internal system for what they call “continuous controls assurance.” Another way to say it might be “KPIs as code.” Here’s an example of a KPI (key performance indicator):

Critical or high severity vulnerabilities are remediated within the organization’s policy timeframe.

How do you translate that objective into code? With Steampipe, you do it by writing SQL queries that can join across the diverse APIs that your software stack exposes. In this case that means querying an endpoint management system, CrowdStrike, then joining with information from a workforce management system, Salesforce—with the understanding that either or both of these may change—to produce query results that map from a vulnerability to a device to a person.

[ Also on InfoWorld: What is an API? Application programming interfaces explained ]

Here’s the query.


SELECT
    ZTA.system_serial_number || ' (' || salesforce_krow__project_resources__c.name || ')' as resource,
    CASE
        WHEN ZTA.assessment ->> 'os' = '100' THEN 'ok'
        ELSE 'alarm'
    END AS status,
    ZTA.system_serial_number || ' (' || salesforce_krow__project_resources__c.name || ' has a score of ' || (ZTA.assessment ->> 'os') as reason,
    jsonb_path_query_array(ZTA.assessment_items['os_signals'], '$[*] ? (@.meets_criteria != "yes").criteria') #>> '{}' as detail
FROM   
    crowdstrike_zta_assessment ZTA
-- Link the serial number to the Salesforce data, so we can find the owner
-- LEFT JOIN is important, in case there isn't a link, we still want to see the data
LEFT JOIN salesforce_fixed_asset__c
    ON ZTA.system_serial_number = serial_number__c
-- Here an INNER JOIN is necessary.  If the serial number exists in Krow, but no owner, that could indicate a
-- a data inconsistency in Krow, which will break the query.  We want an INNER JOIN, because both entries must exist
INNER JOIN salesforce_krow__project_resources__c
    ON salesforce_fixed_asset__c.project_resource__c = salesforce_krow__project_resources__c.id

The tables in play are provided by the CrowdStrike and Salesforce plugins. None of the predefined Salesforce tables would have met the need, but that didn’t matter because CMD Solutions were using their own custom Salesforce objects, and because the Salesforce plugin can dynamically acquire custom objects.

You can run the query in any of the ways Steampipe queries run: with the Steampipe CLI, with psql (or any Postgres CLI), with Metabase (or any Postgres-compatible BI tool), with Python (or any programming language). Or, as CMD Solutions have done, you can wrap a query in a Steampipe control that forms part of a benchmark that runs on the command line with steampipe check, or as a dashboard with steampipe dashboard.

From queries to controls and benchmarks

Here’s the control that packages the query. It’s just a thin wrapper that names and defines a KPI.

[ Attend Virtual Summit on November 8 - CIO's Future of Cloud Summit: Mastering Complexity & Digital Innovation – Register Today! ]

 
control "SEC_002" {
    title = "SEC-002 - % of in-scope personnel compute devices with a Crowdstrike Agent Zero Trust Score for OS of 100"
    sql = <<EOT
    -- SQL as above
    EOT
    }

The control rolls up into a benchmark.

 
benchmark "sec" {
    title = "Security"
    children = [
        ...
        control.SEC_002
        ...
    ]
}

So you can run SEC_002 individually: steampipe check control.SEC_002. Or you can run all the controls in the benchmark: steampipe check benchmark.sec. Results can flow out in a variety of formats for downstream analysis.

But first, where and how to run steampipe check in a scheduled manner? From their documentation:

steampipe-scheduled-job-runner
Run scheduled Steampipe benchmark checks securely and inexpensively on AWS using ECS Fargate. We use AWS Copilot to define Step Functions and AWS ECS Fargate scheduled jobs to run Steampipe checks in Docker. Steampipe benchmarks and controls are retrieved at run-time from a git respository to support a GitOps workflow

The job runs every night, pulls down queries from a repo, executes those against targets, and exports the outputs to Amazon S3—as Markdown, and as JSON that’s condensed by a custom template.

Checking DMARC configuration

Here's another KPI:

All organizational email domains are configured for DMARC

And here’s the corresponding query, again wrapped in a control.

 
control "INF_001" {
    title = "INF-001 - Organisational email domains without DMARC configured"
    description = "Protect against spoofing & phishing, and help prevent messages from being marked as spam. See https://support.google.com/a/answer/2466563?hl=en for more details."
    sql = <<EOT
        WITH ASSET_LIST as (
            SELECT
                D.domain,
                concat('_dmarc.',D.domain) as dmarc,
                COUNT(N.*) as MXCount
            FROM
                csv.domains D
            LEFT JOIN net_dns_record N on  N.domain = D.domain and N.type = 'MX'
            GROUP BY
                D.domain,
                concat('_dmarc.',D.domain)
        )
        SELECT
            A.domain as resource,
            CASE
                WHEN A.MXCount = 0 then 'skip'
                WHEN N.value LIKE '%p=reject;%' THEN 'ok'
                WHEN N.value LIKE '%p=quarantine;%' THEN 'ok'
                ELSE 'alarm'
            END as status,
            CASE
                WHEN A.MXCount = 0 then 'No MX record for domain ' || A.domain
                WHEN N.value LIKE '%p=reject;%' THEN 'Domain ' || A.domain || ' has a reject policy.'
                WHEN N.value LIKE '%p=quarantine;%' THEN 'Domain ' || A.domain || ' has a quarantine policy.  Consider making it reject.'
                WHEN N.value IS NULL THEN 'Domain ' || A.domain || ' has no DMARC policy defined.'
                WHEN N.value LIKE '%p=none;%' THEN 'Domain ' || A.domain || ' has a dmarc policy of none.'
                ELSE 'Domain ' || A.domain || ' has no DMARC policy'
            END as reason,
            A.domain as domain
        FROM
            ASSET_LIST A
        LEFT JOIN net_dns_record N on N.domain = A.dmarc and N.type = 'TXT' and N.value like 'v=DMARC1%'
    EOT
}

The tables here come from the CSV and Net plugins. Like Salesforce, the CSV plugin acquires tables dynamically. In this case the list of domains to check lives in a file called domains.csv retrieved from a domain name system management API. The domain names drive a join with the net_dns_record table to figure out, from MX records, which names are configured for DMARC.

Like all Steampipe controls, these report the required columns resource, status, and reason. It’s purely a convention, as you can write all kinds of queries against plugin-provided tables, but when you follow this convention your queries play in Steampipe’s benchmark and dashboard ecosystem.

Checking for inactive user accounts

It’s true that joining across APIs—with SQL as the common way to reason over them—is Steampipe’s ultimate superpower. But you don’t have to join across APIs. Many useful controls query one or several tables provided by a single plugin.

Here’s one more KPI:

Inactive Okta accounts are reviewed within the organization’s policy time frames

Here’s the corresponding control.

 
control "IAM_001" {
    title = "IAM-001 - Dormant Okta accounts are accounts that have not logged on in the last 30 days"
    sql = <<EOT
SELECT
    U.email as resource,
    CASE
        WHEN U.status <> 'ACTIVE' THEN 'skip'
        WHEN date_part('day', CURRENT_TIMESTAMP - U.activated) < 30 OR date_part('day', CURRENT_TIMESTAMP - U.last_login) < 30 THEN 'ok'
        ELSE 'alarm'
    END as status,
    CASE
        WHEN U.status <> 'ACTIVE' THEN 'User ' || u.email || ' is no longer active'
        WHEN U.last_login is null THEN 'User ' || u.email || ' has never logged on'
        WHEN date_part('day', CURRENT_TIMESTAMP - U.activated) < 30 OR date_part('day', CURRENT_TIMESTAMP - U.last_login) < 30 THEN 'Last logon was on ' || U.last_login
        ELSE 'User ' || u.email || ' last logon on ' || U.last_login
    END as reason,
    U.email,
    U.last_login
FROM
    okta_user U
EOT
}

Controls like this express business logic in a clear and readable way, and require only modest SQL skill.

Next steps

As daily snapshots accumulate, Finnegan and Massyn are exploring ways to visualize them and identify trends and key risk indicators (KRIs). A Python script reads the customized steampipe check output and builds JSON and Markdown outputs that flow to S3. They’ve built a prototype Steampipe dashboard to visualize queries, and considering how a visualization tool might help complete the picture.

Why do all this? “There are products on the market we could buy,” Finnegan says, “but they don’t integrate with all our services, and don’t give us the granular mapping from business objectives to SQL statements. That’s the magic of Steampipe for us.”

For more details, see the repos for their Fargate runner and their continuous controls assurance module. If you have a similar story to tell, please get in touch. We’re always eager to know how people are using Steampipe.

How Steampipe enables KPIs as code

How Steampipe enables KPIs as code

CMD Solutions built tools for continuous controls assurance by using open-source Steampipe to define information security performance metrics as SQL statements.

[ Also on InfoWorld: What is an API? Application programming interfaces explained ]

From queries to controls and benchmarks

Checking DMARC configuration

Checking for inactive user accounts

Next steps

Recommend

中国要在太空建“4S店”！坐火箭就像坐飞机高铁

从源码层面深度剖析Spring循环依赖

扭亏为盈、股价飞升，洪恩打赢“翻身仗”？

Nature's Secrets Revealed

KuCoin Labs Launches Incubation Program to Accelerate Builders

FCC proposes $300 million fine for massive auto warranty robocaller scam

Arcane season 2: everything we know so far

Howard Wilcox Haggard

Are We Doctors or Data Workers?

RTX 40笔记本显卡全线泄露：第一次xx90、功耗达175W

About Joyk