Making a JSON api from a CSV file using fly
source link: https://willschenk.com/howto/2024/making_a_json_api_from_a_csv_file_using_fly/
I wanted to be able to serve up and process a CSV file on fly, with the actual processing of that data happening on the server and without any sort of fancy database setup. Here's a way to do it using persistent volumes.
Bookkeeping
Debian Bookworm ships Ruby 3.1, so let's use that:
asdf local ruby 3.1.4
And let's create a quick Sinatra app:
bundle init
bundle add sqlite3 sinatra puma rerun rackup sinatra-activerecord
config.ru
require File.expand_path('app', File.dirname(__FILE__))
run App
app.rb:
require 'sinatra/base'
require 'sqlite3'
require "sinatra/activerecord"

class App < Sinatra::Base
  register Sinatra::ActiveRecordExtension
  set :database, {adapter: "sqlite3", database: "stations.db"}

  get '/' do
    "Hello world"
  end
end
And then you can run with
rerun bundle exec rackup
And test
curl http://localhost:9292
Hello world
Package the app
Dockerfile
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ruby ruby-dev \
    build-essential curl sqlite-utils \
    python3-click-default-group
WORKDIR /app
RUN gem install bundler:2.3.26
COPY Gemfile* ./
RUN bundle install
COPY * ./
EXPOSE 8080
CMD ["bundle", "exec", "rackup", "--host", "0.0.0.0", "--port", "8080"]
docker build . -t test && docker run -it --rm -p 8080:8080 test
curl http://localhost:8080
Hello world
Looks good!
Deploy the app
fly launch --no-deploy --name=chargermap
Inside the generated fly.toml, let's add a section for a persistent volume:
[mounts]
source="myapp_data"
destination="/data"
Then we can deploy with
fly deploy
And then test:
curl https://chargermap.fly.dev
Hello world
Write the logic
Ok, so now that we have something served up, let's actually write the code.
loader.rb:
require 'csv'
require 'sqlite3'

class Loader
  def initialize
    # DB_DIR points at the persistent volume when deployed; default to the current directory
    @dir = ENV['DB_DIR'] || '.'
  end

  def db; "#{@dir}/db"; end
  def csv; "#{@dir}/csv"; end
  def db_exists?; File.exist? db; end
  def csv_exists?; File.exist? csv; end

  # Build the database if it is missing, downloading the CSV first when needed
  def ensure!
    return if db_exists?
    download_csv unless csv_exists?
    create_db
  end

  def download_csv
    puts "Downloading csv"
    system( "curl https://willschenk.com/alt_fuel_stations.csv -o #{csv}" )
  end

  # Import the CSV into a table named "data" in the sqlite file
  def create_db
    puts "Creating database"
    system( "sqlite-utils insert #{db} data #{csv} --csv --detect-types" )
  end
end

if __FILE__ == $0
  puts "Hello there"
  l = Loader.new
  puts "DB Exists? #{l.db_exists?}"
  puts "CSV Exists? #{l.csv_exists?}"
  l.ensure!
  puts "DB Exists? #{l.db_exists?}"
  puts "CSV Exists? #{l.csv_exists?}"
end
ruby loader.rb
Hello there
DB Exists? false
CSV Exists? false
DB Exists? false
CSV Exists? true
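The point of ensure! is that it is safe to call repeatedly: work only happens for whatever is missing. Here is a minimal sketch of that control flow which runs offline. SketchLoader is a hypothetical stand-in, not the Loader above: the curl download and the sqlite-utils import are replaced with plain file writes, and the calls list exists only to show which steps ran.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for Loader: the download and import steps are
# replaced by plain file writes so the control flow can run offline.
class SketchLoader
  attr_reader :calls

  def initialize(dir)
    @dir = dir
    @calls = []
  end

  def db;  File.join(@dir, 'db');  end
  def csv; File.join(@dir, 'csv'); end

  def ensure!
    return if File.exist?(db)          # database already built: nothing to do
    fetch_csv unless File.exist?(csv)  # reuse a CSV left by an earlier run
    build_db
  end

  private

  def fetch_csv
    @calls << :fetch_csv
    File.write(csv, "State\nCT\n")     # stands in for the curl download
  end

  def build_db
    @calls << :build_db
    FileUtils.cp(csv, db)              # stands in for the sqlite-utils import
  end
end

def demo_ensure
  Dir.mktmpdir do |dir|
    loader = SketchLoader.new(dir)
    loader.ensure!
    loader.ensure!   # second call finds the db file and returns early
    loader.calls
  end
end

p demo_ensure   # prints [:fetch_csv, :build_db]
```

Each step shows up once even though ensure! was called twice, which is what makes it reasonable to trigger from a POST route.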
app.rb:
require 'sinatra/base'
require 'sqlite3'
require "sinatra/activerecord"
require_relative './loader'

# Maps to the "data" table that sqlite-utils creates
class Data < ActiveRecord::Base
end

class App < Sinatra::Base
  register Sinatra::ActiveRecordExtension

  l = Loader.new
  set :database, {adapter: "sqlite3", database: l.db}

  # Report where the files live and whether they exist yet
  get '/' do
    l = Loader.new
    content_type :json
    { db: l.db, csv: l.csv, csv_exists: l.csv_exists?, db_exists: l.db_exists? }.to_json
  end

  get '/stats' do
    content_type :json
    {
      count: Data.count,
      ct: Data.where( "State = ?", "CT" ).count,
      ny: Data.where( "State = ?", "NY" ).count
    }.to_json
  end

  # Download the CSV and build the database if they are missing
  post '/' do
    l = Loader.new
    l.ensure!
    redirect '/'
  end
end
curl http://localhost:9292 | jq .
{ "db": "./db", "csv": "./csv", "csv_exists": true, "db_exists": true }
curl -X POST http://localhost:9292 | jq .
curl http://localhost:9292 | jq .
{ "csv_exists": true, "db_exists": true }
curl http://localhost:9292/stats | jq .
{ "count": 73454, "ct": 822, "ny": 3793 }
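For intuition about what /stats is counting, here is a rough sketch of the same three numbers computed straight from CSV rows with Ruby's csv library, no database involved. The sample rows and the state_counts helper are made up for illustration; only the State column name comes from the queries above.

```ruby
require 'csv'

# Made-up sample rows; only the State column name comes from the /stats queries.
SAMPLE = <<~CSV
  Station Name,State
  Alpha Garage,CT
  Beta Plaza,NY
  Gamma Lot,CT
CSV

# Hypothetical helper returning the same keys as the /stats route.
def state_counts(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  # Tally rows per state; states that never appear count as 0
  by_state = rows.each_with_object(Hash.new(0)) { |row, acc| acc[row['State']] += 1 }
  { count: rows.size, ct: by_state['CT'], ny: by_state['NY'] }
end

p state_counts(SAMPLE)   # 3 rows in total, 2 in CT, 1 in NY
```

Doing this on every request would mean re-parsing the whole file each time, which is a good reason to load it into sqlite once and let the database do the counting.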
Deploy new code
Inside fly.toml, let's set DB_DIR to point to /data, where the volume is mounted, and then deploy this sucker!
[env]
DB_DIR="/data"
fly deploy
curl https://chargermap.fly.dev | jq .
{ "db": "/data/db", "csv": "/data/csv", "csv_exists": false, "db_exists": false }
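The paths in that response come straight from the DB_DIR lookup in Loader#initialize. A small sketch of that resolution, with a hypothetical data_paths helper that takes the environment as an argument so both cases can be tried without touching the real ENV:

```ruby
# Hypothetical helper mirroring Loader#initialize: fall back to the current
# directory when DB_DIR is unset.
def data_paths(env = ENV)
  dir = env['DB_DIR'] || '.'
  { db: "#{dir}/db", csv: "#{dir}/csv" }
end

p data_paths({})                       # local run: ./db and ./csv
p data_paths({ 'DB_DIR' => '/data' })  # deployed: /data/db and /data/csv
```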
Now do a POST to set things up:
curl -X POST https://chargermap.fly.dev
And then
curl https://chargermap.fly.dev/stats | jq .
{ "count": 73454, "ct": 822, "ny": 3793 }
We can go to the console and stop the machine, and then it will automatically start itself up again when you hit it!
curl https://chargermap.fly.dev/stats | jq .
{ "count": 73454, "ct": 822, "ny": 3793 }
Next steps
When do you want to reload the file? Is it every couple of days? What further transformations do you want to have on the data?
All things to keep playing with.
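One possible starting point for the reload question is to look at the age of the downloaded file. This is only a sketch: the stale? helper and the two-day interval are assumptions picked for illustration, not something the app above does.

```ruby
require 'tmpdir'

TWO_DAYS = 2 * 24 * 60 * 60 # seconds; an arbitrary example interval

# Hypothetical helper: true when the file is missing or was last written
# more than max_age seconds before `now`.
def stale?(path, max_age, now: Time.now)
  return true unless File.exist?(path)
  (now - File.mtime(path)) > max_age
end

Dir.mktmpdir do |dir|
  csv = File.join(dir, 'csv')
  puts stale?(csv, TWO_DAYS)                                 # true: file is missing
  File.write(csv, "State\nCT\n")
  puts stale?(csv, TWO_DAYS)                                 # false: just written
  puts stale?(csv, TWO_DAYS, now: Time.now + TWO_DAYS + 60)  # true: past the interval
end
```

A loader could check something like stale?(csv, TWO_DAYS) before its existence checks and remove both files when it returns true, so the next ensure! call downloads and imports fresh data.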