More or less all the big APIs are REST­ful these days. Yeah, you can quib­ble about what “REST” means (and I will, a bit) but the as­ser­tion is broad­ly true. Is it go­ing to stay that way forever? Seems un­like­ly. So, what’s nex­t?

What we talk about when we talk about “REST” · Th­ese days, it’s used col­lo­qui­al­ly to mean any API that is HTTP-based. In fac­t, the vast ma­jor­i­ty of them of­fer CRUD op­er­a­tions on things that have URIs, em­bed some of those URIs in their pay­load­s, and thus are ar­guably REST­ful in the orig­i­nal sense; al­though these days I’m hear­ing the oc­ca­sion­al “CRUDL” where L is for List.

At AWS where I work, we almost always distinguish, for a service or an app, between its “control plane” and its “data plane”. For example, consider our database-as-a-service RDS; the control plane APIs are where you create, configure, back up, start, stop, and delete databases. The data plane is SQL, with connection pools and all that RDBMS baggage.

It’s in­ter­est­ing to note that the con­trol plane is pret­ty REST­ful, but the da­ta plane isn’t at al­l. (This isn’t nec­es­sar­i­ly a database thing: DynamoDB’s da­ta plane is pret­ty REST­ful.)

I think there’s a pat­tern there: The con­trol plane for al­most any­thing on­line has a good chance of be­ing REST­ful be­cause, well, that’s where you’re go­ing to be cre­at­ing and delet­ing stuff. The da­ta plane might be a dif­fer­ent sto­ry; my first pre­dic­tion here is that what­ev­er starts to dis­place REST will start do­ing it on the da­ta plane side, if on­ly be­cause con­trol planes and REST are such a nat­u­ral fit.

REST­ful im­per­fec­tions · What are some rea­sons we might want to move be­yond REST? Let me list a few:

La­ten­cy · Set­ting up and tear­ing down an HTTP con­nec­tion for ev­ery lit­tle op­er­a­tion you want to do is not free. A cou­ple of decades of ef­fort have re­duced the cost, but stil­l.

For example, consider two messaging systems that are built by people who sit close to me: Amazon SQS and MQ. SQS has been running for a dozen years and can handle millions of messages per second and, assuming your senders and receivers are reasonably well balanced, can be really freaking fast. In fact, I’ve heard stories of messages actually being received before they were sent: the long-polling receiver grabbed the message before the sender side got around to tearing down the SendMessage HTTP connection. The MQ data plane, on the other hand, doesn’t use HTTP; it uses nailed-up TCP/IP connections with its own framing protocols. So you can get astonishingly low latencies for transmit and receive operations. The flip side is that your throughput is limited by the number of messages the “message broker” terminating those connections can handle. A lot of people who use MQ are pretty convinced that one of the reasons they’re doing this is they don’t want a RESTful interface.
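To make “its own framing protocols” concrete: on a long-lived TCP connection the bytes arrive as an undifferentiated stream, so both ends have to agree on where one message stops and the next begins. Here is a minimal sketch of one common approach, a length prefix. It is not the wire format of Amazon MQ or of any particular broker; the 4-byte big-endian header and the names `encode_frame` and `FrameDecoder` are assumptions made up for illustration.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # assumed header: 4-byte unsigned big-endian length


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Accumulates bytes from a long-lived stream and returns complete frames."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        self._buffer.extend(data)
        frames = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # frame is incomplete; wait for more bytes
            frames.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return frames
```

Feeding the decoder whatever chunks the socket happens to deliver yields whole messages regardless of how TCP split them, which is the job HTTP otherwise does for you on every request.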

Coupling · In the wild, most REST requests (like most things labeled as APIs) operate synchronously; that is to say, you call them (GET, POST, PUT, whatever) and you stall until you get your result back. Now (speaking HTTP lingo) your request might return 202 Accepted, in which case you’d expect either to have sent a URI along to be called back as a webhook, or to get one in the response that you can poll. But in all these cases, the coupling is still pretty tight; you (the caller) have to maintain some sort of state about the request until the callee has done with it, whether that’s now or later.
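The 202-and-poll variant can be sketched without any network at all. The following is an in-memory stand-in, not a real HTTP server; the `JobService` class, the `/jobs/N` URIs, and the convention of answering 202 again while the work is pending are all assumptions for illustration. The point to notice is the `pending` set: that is the state the caller is stuck holding.

```python
import itertools
from typing import Dict, Optional, Tuple


class JobService:
    """In-memory stand-in for a server that answers 202 Accepted and is polled later."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._results: Dict[str, Optional[str]] = {}

    def submit(self, body: str) -> Tuple[int, str]:
        # Accept the work and hand back a status URI instead of the result.
        uri = f"/jobs/{next(self._ids)}"
        self._results[uri] = None
        return 202, uri

    def complete(self, uri: str, result: str) -> None:
        # Called by whatever finishes the work, possibly much later.
        self._results[uri] = result

    def poll(self, uri: str) -> Tuple[int, Optional[str]]:
        result = self._results[uri]
        # 202 again while pending, 200 with the result once it is done.
        return (200, result) if result is not None else (202, None)


service = JobService()
status, status_uri = service.submit("resize image 42")
pending = {status_uri}  # state the caller is forced to keep around
first_poll = service.poll(status_uri)
service.complete(status_uri, "done")
second_poll = service.poll(status_uri)
if second_poll[0] == 200:
    pending.discard(status_uri)  # only now can the caller forget the request
```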

Which sort of suck­s. In par­tic­u­lar when it’s one mi­croser­vice call­ing an­oth­er and the client ser­vice is send­ing re­quests at a high­er rate than the server-side one can han­dle; a sit­u­a­tion that can lead to acute pain very quick­ly.

Short life · Han­dling some re­quests takes mil­lisec­ond­s. Han­dling oth­ers  —  a cit­i­zen­ship ap­pli­ca­tion, for ex­am­ple  —  can take weeks and in­volve or­ches­trat­ing lots of ser­vices, and oc­ca­sion­al­ly hu­man in­ter­ac­tion­s. The no­tion of hav­ing a thread hang­ing wait­ing for some­thing to hap­pen is ridicu­lous.

A word on GraphQL · It exists, basically, to handle the situation where a client has to assemble several flavors of information to do its job; for example, a mobile app building an information-rich display. Since RESTful interfaces tend to do a good job of telling you about a single resource, this can lead to a wasteful flurry of requests. So GraphQL lets you cherry-pick an arbitrary selection of fields from multiple resources in a single request. Presumably, the server-side implementation issues that request flurry inside the data center where those calls are cheaper, then assembles your GraphQL output, but anyhow that’s no longer your problem.
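The cherry-picking idea can be shown with a toy, with the loud caveat that this is neither GraphQL syntax nor a GraphQL library, just plain dictionaries. The `RESOURCES` data and the `select` function are hypothetical; they only illustrate “a few fields from several resources in one call” as opposed to one full fetch per resource.

```python
from typing import Any, Dict, List

# Hypothetical data that a RESTful API would expose one resource at a time.
RESOURCES: Dict[str, Dict[str, Any]] = {
    "user": {"id": 7, "name": "Ada", "email": "ada@example.com", "joined": "2015-03-01"},
    "cart": {"items": 3, "total": 59.90, "currency": "EUR"},
    "inbox": {"unread": 12, "latest_subject": "Welcome"},
}


def select(selection: Dict[str, List[str]]) -> Dict[str, Dict[str, Any]]:
    """Return only the requested fields of each requested resource, in one call."""
    return {
        resource: {field: RESOURCES[resource][field] for field in fields}
        for resource, fields in selection.items()
    }


# One round trip for a whole screen instead of three full-resource fetches.
screen_data = select({"user": ["name"], "cart": ["items", "total"], "inbox": ["unread"]})
```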

I ob­serve that lots of client de­vel­op­ers like GraphQL, and it seems like the world has a place for it, but I don’t see it as that big a game-changer. To start with, it’s not as though client de­vel­op­ers can com­pose ar­bi­trary queries, lim­it­ed on­ly by the se­man­tics of GraphQL, and ex­pect to get uni­form­ly de­cent per­for­mance. (To be fair, the same is true of SQL.) Any­how, I see GraphQL as a con­ve­nience fea­ture de­signed to make syn­chronous APIs run more ef­fi­cient­ly.

A word on RPC · By which, these days, I guess I must mean gRPC. I dun­no, I’m old enough that I saw gen­er­a­tion af­ter gen­er­a­tion of RPC frame­works fail mis­er­ably; brit­tle, re­quir­ing lots of con­fig­u­ra­tion, and fail­ing to de­liv­er the an­tic­i­pat­ed per­for­mance win­s. Smells like mak­ing REST­ful APIs more tight­ly cou­pled, to me, and it’s hard to see that as a win. But I could be wrong.

Post-REST: Mes­sag­ing and Event­ing · This ap­proach is all over, and I mean all over, the cloud in­fras­truc­ture that I work on. The idea is you get a re­quest, you val­i­date it, maybe you do some com­pu­ta­tion on it, then you drop it on a queue (or bus, or stream, or what­ev­er you want to call it) and for­get about it, it’s not your prob­lem any more.

The next stage of request handling is implemented by services that read the queue and either route an answer back to the original requester or pass it on to another service stage. Now for this to work, the queues in question have to be fast (which, these days, they are), scalable (which they are), and very, very durable (which they are).
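Here is a minimal in-process sketch of that shape, using Python’s standard `queue` module as a stand-in for a real durable queue (which it very much is not: nothing here survives a crash). The names `front_end`, `worker`, and the order fields are assumptions invented for the example.

```python
import queue
import threading

request_queue = queue.Queue()  # stands in for a durable queue, bus, or stream
result_queue = queue.Queue()
STOP = object()                # sentinel telling the worker to shut down


def front_end(order: dict) -> bool:
    """Validate, enqueue, and return immediately; no waiting for the work itself."""
    if "sku" not in order or order.get("quantity", 0) <= 0:
        return False
    request_queue.put(order)
    return True


def worker() -> None:
    """Next stage: read the queue and pass each result further downstream."""
    while True:
        order = request_queue.get()
        if order is STOP:
            break
        result_queue.put({"sku": order["sku"], "reserved": order["quantity"]})


accepted = [
    front_end({"sku": "A1", "quantity": 2}),
    front_end({"sku": "B2", "quantity": 0}),  # fails validation, never enqueued
]
stage = threading.Thread(target=worker)
stage.start()
request_queue.put(STOP)
stage.join()
processed = result_queue.get_nowait()
```

Note that `front_end` never learns how long the worker takes, or even whether one is running yet; that ignorance is the decoupling.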

There are a lot of wins here: To start with, tran­sient query surges are no longer a prob­lem. Al­so, once you’ve got a mes­sage stream you can do fan-out and fil­ter­ing and as­sem­bly and sub­set­ting and all sorts of oth­er use­ful stuff, with­out dis­turb­ing the op­er­a­tions of the up­stream mes­sage source.
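Fan-out and filtering are easy to picture with a toy publish/subscribe topic. This is a sketch under assumptions, not any particular messaging product: the `Topic` class and its `subscribe`/`publish` methods are hypothetical, and delivery here is a synchronous function call rather than anything durable.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, int]
Handler = Callable[[Message], None]
Filter = Callable[[Message], bool]


class Topic:
    """Toy fan-out: every subscriber sees every message that passes its filter."""

    def __init__(self) -> None:
        self._subscribers: List[Tuple[Filter, Handler]] = []

    def subscribe(self, handler: Handler, accepts: Filter = lambda m: True) -> None:
        self._subscribers.append((accepts, handler))

    def publish(self, message: Message) -> None:
        # The publisher does not know or care who is listening.
        for accepts, handler in self._subscribers:
            if accepts(message):
                handler(message)


audit_log: List[Message] = []
big_orders: List[Message] = []

orders = Topic()
orders.subscribe(audit_log.append)                                       # sees everything
orders.subscribe(big_orders.append, accepts=lambda m: m["total"] > 100)  # filtered view

for total in (20, 250, 99):
    orders.publish({"total": total})
```

Adding the second subscriber required no change to the code that publishes, which is the “without disturbing the upstream message source” part.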

Post-REST: Orchestration · This gets into workflow territory, something I’ve been working on a lot recently. Where by “workflow” I mean a service tracking the state of computations that have multiple steps, any one of which can take an arbitrarily long time, can fail, can need to be retried, and whose behavior and output affect the choice of subsequent steps and their behavior.

An in­creas­ing num­ber of (for ex­am­ple) Lamb­da func­tions are, rather than serv­ing re­quests and re­turn­ing re­spons­es, ex­e­cut­ing in the con­text of a work­flow that pro­vides their in­put, waits for them to com­plete, and routes their out­put fur­ther down­stream.
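A stripped-down sketch of what such a workflow driver does: run a step, retry it if it fails, and let its output pick the next step. Everything here (`run_workflow`, the step names, the retry limit) is an assumption for illustration. A real workflow service would also persist the state between steps so that a step can take weeks without anything holding a thread; this toy keeps it all in memory.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A step takes the workflow's data and returns (updated data, next step name or None).
Step = Callable[[dict], Tuple[dict, Optional[str]]]


def run_workflow(steps: Dict[str, Step], start: str, data: dict,
                 max_attempts: int = 3) -> Tuple[dict, List[str]]:
    """Drive steps one at a time, retrying failures and recording a history."""
    history: List[str] = []
    current: Optional[str] = start
    while current is not None:
        for attempt in range(1, max_attempts + 1):
            try:
                data, next_step = steps[current](data)
                history.append(f"{current}:ok")
                break
            except Exception:
                history.append(f"{current}:failed#{attempt}")
                if attempt == max_attempts:
                    raise  # out of retries: the whole workflow fails
        current = next_step
    return data, history


calls = {"check": 0}


def check(data: dict) -> Tuple[dict, Optional[str]]:
    calls["check"] += 1
    if calls["check"] == 1:
        raise RuntimeError("transient failure")  # first attempt fails, retry succeeds
    # The output of this step chooses which step runs next.
    return data, "approve" if data["score"] >= 50 else "reject"


def approve(data: dict) -> Tuple[dict, Optional[str]]:
    return {**data, "status": "approved"}, None


def reject(data: dict) -> Tuple[dict, Optional[str]]:
    return {**data, "status": "rejected"}, None


final, history = run_workflow(
    {"check": check, "approve": approve, "reject": reject}, "check", {"score": 72}
)
```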

Post-REST: Per­sis­tent con­nec­tions · Back a few para­graphs I talked about how MQ mes­sage bro­kers work, main­tain­ing a bunch of nailed-up net­work con­nec­tion­s, and pump­ing bytes back and forth across them. It’s not hard to be­lieve that there are lots of sce­nar­ios where this is a good fit for the way da­ta and ex­e­cu­tion want to flow.

Now, we’re already partway there. For example, SQS clients routinely use “long polling” (typically around 30 seconds) to receive messages. That means they ask for messages and if there aren’t any, the server doesn’t say “no dice”; it holds the connection open for a while and if some messages come in, shoots them back to the caller. If you have a bunch of threads (potentially on multiple hosts) long-polling an SQS queue, you can get massive throughput and low latency and really reduce the cost of using HTTP.
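The long-polling behavior is easy to imitate in-process. This sketch uses Python’s standard `queue` module rather than the SQS API, so the function name `long_poll` and the timings are assumptions; what carries over is the contract: wait up to a limit, return early if something arrives, return empty only when the wait expires.

```python
import queue
import threading
import time


def long_poll(q: "queue.Queue[str]", wait_seconds: float) -> list:
    """Block up to wait_seconds for a message instead of answering "empty" at once."""
    try:
        return [q.get(timeout=wait_seconds)]
    except queue.Empty:
        return []  # the wait expired with nothing to deliver


inbox: "queue.Queue[str]" = queue.Queue()

# A sender that shows up a little after the receiver has started waiting.
threading.Timer(0.1, inbox.put, args=("hello",)).start()

started = time.monotonic()
received = long_poll(inbox, wait_seconds=2.0)  # returns as soon as "hello" lands
elapsed = time.monotonic() - started

empty = long_poll(inbox, wait_seconds=0.05)    # nothing left: times out, returns []
```

The receiver gets the message roughly when it is sent, not at the end of the wait window, which is why one held-open request can beat a tight loop of short ones.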

The next two steps for­ward are pret­ty easy to see, too. The first is HTTP/2, al­ready wide­ly de­ployed, which lets you mul­ti­plex mul­ti­ple HTTP re­quests across a sin­gle net­work con­nec­tion. Used in­tel­li­gent­ly, it can buy you quite a few of the ben­e­fits of a per­ma­nent con­nec­tion. But it’s still firm­ly tied to TCP, which has some un­for­tu­nate side-effects that I’m not go­ing to deep-dive on here, part­ly be­cause it’s not a thing I un­der­stand that deeply. But I ex­pect to see lots of apps and ser­vices get good val­ue out of HTTP/2 go­ing for­ward; in some part be­cause as far as clients can tel­l, they’re still mak­ing, and re­spond­ing to, the same old HTTP re­quests they were be­fore.

The next step after that is QUIC (Quick UDP Internet Connections) which abandons TCP in favor of UDP, while retaining HTTP semantics. This is already in production on a lot of Google properties. I personally think it’s a really big deal; one of the reasons that HTTP was so successful is that its connections are short-lived and thus much less likely to suffer breakage while they’re at work. This is really good because designing an application-level protocol which can deal with broken connections is super-hard. In the world of HTTP, the most you have to deal with at one time is a failed request, and a broken connection is just one of the reasons that can happen. UDP makes the connection-breakage problem go away by not really having connections.

Of course, there’s no free lunch. If you’re using UDP, you’re not getting the TC in TCP, Transmission Control I mean, which takes care of packetizing and reassembly and checksumming and throttling and loads of other super-useful stuff. But judging by the evidence I see, QUIC does enough of that well enough to support HTTP semantics cleanly, so once again, apps that want to go on using the same old XMLHttpRequest calls like it was 2005 can remain happily oblivious.
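For a feel of what “not really having connections” means at the socket level, here is a small loopback sketch using Python’s standard `socket` module. It is plain UDP, not QUIC, and the payloads and the timeout are arbitrary choices for the example. Each datagram is sent on its own with no handshake and no teardown; in exchange, UDP itself promises neither delivery nor ordering, which is the part QUIC has to rebuild on top.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
receiver.settimeout(2.0)         # never hang if a datagram goes missing
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No connect(), no handshake, no teardown: each datagram stands alone.
sender.sendto(b"first", address)
sender.sendto(b"second", address)

# Collect whatever arrives; UDP makes no promise about order.
datagrams = [receiver.recvfrom(1024)[0] for _ in range(2)]

sender.close()
receiver.close()
```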

Brave New World! · It seems in­evitable to me that, par­tic­u­lar­ly in the world of high-throughput high-elasticity cloud-native app­s, we’re go­ing to see a steady in­crease in re­liance on per­sis­tent con­nec­tion­s, or­ches­tra­tion, and message/event-based log­ic. If you’re not us­ing that stuff al­ready, now would be a good time to start learn­ing.

But I bet that for the fore­see­able fu­ture, a high pro­por­tion of all re­quests to ser­vices are go­ing to have (ap­prox­i­mate­ly) HTTP se­man­tic­s, and that for most con­trol planes and quite a few da­ta planes, REST still pro­vides a good clean way to de­com­pose com­pli­cat­ed prob­lem­s, and its ex­treme sim­plic­i­ty and re­silience will mean that if you want to de­sign net­worked app­s, you’re still go­ing to have to learn that way of think­ing about things.

