
Making a JSON api from a CSV file using fly

 1 month ago
source link: https://willschenk.com/howto/2024/making_a_json_api_from_a_csv_file_using_fly/

I wanted to serve up a CSV file on fly and have the actual processing of that data happen on the server, without any sort of fancy database setup. Here's a way to do it using persistent volumes.

Bookkeeping

Debian Bookworm ships Ruby 3.1, so let's use that:

  asdf local ruby 3.1.4

And let's create a quick Sinatra app:

    bundle init
    bundle add sqlite3 sinatra puma rerun rackup sinatra-activerecord

config.ru

  require File.expand_path('app', File.dirname(__FILE__))

  run App

app.rb:

  require 'sinatra/base'
  require 'sqlite3'
  require "sinatra/activerecord"

  class App < Sinatra::Base
    register Sinatra::ActiveRecordExtension
    
    set :database, {adapter: "sqlite3", database: "stations.db"}
    
    get '/' do
      "Hello world"
    end
  end

And then you can run with

  rerun bundle exec rackup

And test

 curl http://localhost:9292
Hello world

Package the app

Dockerfile:

  FROM debian:bookworm-slim

  # Update and install in one layer so a cached package index never goes stale
  RUN apt-get update && apt-get install -y ruby ruby-dev \
      build-essential curl sqlite-utils \
      python3-click-default-group

  WORKDIR /app

  RUN gem install bundler:2.3.26

  COPY Gemfile* ./
  RUN bundle install

  COPY * ./

  EXPOSE 8080
  CMD ["bundle", "exec", "rackup", "--host", "0.0.0.0", "--port", "8080"]

Build and run it:

  docker build . -t test && docker run -it --rm -p 8080:8080 test

And test:

 curl http://localhost:8080
Hello world

Looks good!

Deploy the app

  fly launch --no-deploy --name=chargermap

Inside the generated fly.toml, let's add a section for a persistent volume:

[mounts]
  source="myapp_data"
  destination="/data"

Then we can deploy with

  fly deploy

And then test:

curl https://chargermap.fly.dev
Hello world

Write the logic

OK, so now that we have something served up, let's actually write the code.

loader.rb:

  require 'csv'
  require 'sqlite3'

  class Loader
    def initialize
      @dir = ENV['DB_DIR'] || '.'
    end

    def db; "#{@dir}/db"; end
    def csv; "#{@dir}/csv"; end
    
    # File.exists? is deprecated and removed in Ruby 3.2; File.exist? works everywhere
    def db_exists?; File.exist? db; end
    def csv_exists?; File.exist? csv; end

    # Build the database only if it is missing, downloading the CSV first if needed
    def ensure!
      return if db_exists?

      download_csv unless csv_exists?
      create_db
    end
    
    def download_csv
      puts "Downloading csv"
      system( "curl https://willschenk.com/alt_fuel_stations.csv -o #{csv}" )
    end

    def create_db
      puts "Creating database"

      system( "sqlite-utils insert #{db} data #{csv} --csv --detect-types" )
    end
  end

  if __FILE__ == $0
    puts "Hello there"

    l = Loader.new
    puts "DB Exists? #{l.db_exists?}"
    puts "CSV Exists? #{l.csv_exists?}"

    l.ensure!

    puts "DB Exists? #{l.db_exists?}"
    puts "CSV Exists? #{l.csv_exists?}"
  end

And run it:

  ruby loader.rb
Hello there
DB Exists? false
CSV Exists? false
DB Exists? false
CSV Exists? true
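
The loader shells out with interpolated strings, which is fine while the paths are simple. If DB_DIR ever contained spaces or shell metacharacters, passing the arguments as an array would avoid the shell entirely. A minimal sketch, assuming hypothetical helpers named `curl_command`, `sqlite_utils_command` and `run!` (none of them are in the loader above); the sqlite-utils flags are the ones already used, while curl's `-f` and `-L` are my additions:

```ruby
# Hypothetical helpers: build argv arrays instead of shell strings.
CSV_URL = "https://willschenk.com/alt_fuel_stations.csv"

def curl_command(url, dest)
  # -f: fail on HTTP errors, -L: follow redirects (additions, not in the original)
  ["curl", "-fL", url, "-o", dest]
end

def sqlite_utils_command(db, csv, table: "data")
  ["sqlite-utils", "insert", db, table, csv, "--csv", "--detect-types"]
end

# Runs the command without a shell; raises if it is missing or exits non-zero.
def run!(argv)
  system(*argv, exception: true)
end

# A path with a space stays a single argument:
p sqlite_utils_command("/my data/db", "/my data/csv")
```

With something like this, `download_csv` could call `run!(curl_command(CSV_URL, csv))`, and a failed download would raise instead of passing silently.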

app.rb:

  require 'sinatra/base'
  require 'sqlite3'
  require "sinatra/activerecord"
  require_relative './loader'

  class Data < ActiveRecord::Base
  end

  class App < Sinatra::Base
    register Sinatra::ActiveRecordExtension
    l = Loader.new
    
    set :database, {adapter: "sqlite3", database: l.db}
    
    get '/' do
      l = Loader.new
      content_type :json

      { db: l.db, csv: l.csv, csv_exists: l.csv_exists?, db_exists: l.db_exists? }.to_json
    end

    get '/stats' do
      content_type :json
      {
        count: Data.count,
        ct: Data.where( "State = ?", "CT" ).count,
        ny: Data.where( "State = ?", "NY" ).count
      }.to_json
    end

    post '/' do
      l = Loader.new

      l.ensure!

      redirect '/'
    end
  end

And test:

  curl http://localhost:9292 | jq .
{
  "db": "./db",
  "csv": "./csv",
  "csv_exists": true,
  "db_exists": true
}
  curl -X POST http://localhost:9292 | jq .
  curl http://localhost:9292 | jq .
{
  "csv_exists": true,
  "db_exists": true
}
  curl http://localhost:9292/stats | jq .
{
  "count": 73454,
  "ct": 822,
  "ny": 3793
}
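
To sanity-check numbers like these without going through SQLite, the same counts can be computed straight from the CSV with Ruby's standard library. A minimal sketch, assuming the file has a `State` column (as the queries above imply); the `state_counts` helper and the sample rows are made up for illustration:

```ruby
require 'csv'

# Hypothetical helper: count all rows, plus rows per requested state.
def state_counts(csv_text, states)
  rows = CSV.parse(csv_text, headers: true)
  counts = { count: rows.size }
  states.each do |s|
    counts[s.downcase.to_sym] = rows.count { |r| r["State"] == s }
  end
  counts
end

# Made-up sample data, not the real station file.
SAMPLE = <<~CSV
  Station Name,State
  Example A,CT
  Example B,NY
  Example C,NY
  Example D,CA
CSV

p state_counts(SAMPLE, %w[CT NY])
```

On the sample this gives a total of 4, with 1 row for CT and 2 for NY. For the real file you would pass `File.read(loader.csv)` instead of `SAMPLE`.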

Deploy new code

Inside the fly.toml let's set DB_DIR to point to our mounted volume, and then deploy this sucker!

[env]
  DB_DIR="/data"

Then:

  fly deploy

And test:

curl https://chargermap.fly.dev | jq .
{
  "db": "/data/db",
  "csv": "/data/csv",
  "csv_exists": false,
  "db_exists": false
}
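
The paths change from ./db to /data/db purely because of the environment variable. A minimal sketch of that resolution, mirroring the logic in Loader; `data_paths` is a hypothetical helper that takes the environment as an argument so it can be tried without touching the real `ENV`:

```ruby
# Hypothetical helper mirroring Loader#db and Loader#csv.
def data_paths(env = ENV)
  dir = env['DB_DIR'] || '.'   # fall back to the current directory, as Loader does
  { db: File.join(dir, 'db'), csv: File.join(dir, 'csv') }
end

p data_paths({})                        # local run, DB_DIR unset
p data_paths({ 'DB_DIR' => '/data' })   # on fly, with the [env] section above
```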

Now do a POST to set things up:

  curl -X POST https://chargermap.fly.dev

And then

  curl https://chargermap.fly.dev/stats | jq .
{
  "count": 73454,
  "ct": 822,
  "ny": 3793
}

We can go to the console and stop the machine, and it will automatically start itself up again the next time a request hits it, with the data still sitting on the volume:

  curl https://chargermap.fly.dev/stats | jq .
{
  "count": 73454,
  "ct": 822,
  "ny": 3793
}

Next steps

When do you want to reload the file? Is it every couple of days? What further transformations do you want to have on the data?

All things to keep playing with.
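
For the reload question, one option is to treat the CSV as stale once it is older than some threshold. A minimal sketch, assuming a hypothetical `stale?` helper and a two-day cutoff picked only for illustration:

```ruby
require 'tmpdir'

# Hypothetical helper: true when the file is missing or older than max_age seconds.
def stale?(path, max_age:, now: Time.now)
  return true unless File.exist?(path)

  (now - File.mtime(path)) > max_age
end

TWO_DAYS = 2 * 24 * 60 * 60

Dir.mktmpdir do |dir|
  csv = File.join(dir, 'csv')
  puts stale?(csv, max_age: TWO_DAYS)   # file does not exist yet
  File.write(csv, "State\nCT\n")
  puts stale?(csv, max_age: TWO_DAYS)   # just written
  old = Time.now - 3 * 24 * 60 * 60
  File.utime(old, old, csv)             # pretend it was downloaded three days ago
  puts stale?(csv, max_age: TWO_DAYS)
end
```

If a check like this came back true, the loader could delete both files and call `ensure!` again; that part is left out here.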

