

Leaky abstraction by omission
source link: https://blog.ploeh.dk/2021/04/26/leaky-abstraction-by-omission/

Sometimes, an abstraction can be leaky because it leaves something out.
Consider the following interface definition. What's wrong with it?
    public interface IReservationsRepository
    {
        Task Create(int restaurantId, Reservation reservation);

        Task<IReadOnlyCollection<Reservation>> ReadReservations(
            int restaurantId, DateTime min, DateTime max);

        Task<Reservation?> ReadReservation(Guid id);

        Task Update(Reservation reservation);

        Task Delete(Guid id);
    }
Perhaps you think that the name is incorrect; that this really isn't an example of the Repository design pattern, as it's described in Patterns of Enterprise Application Architecture. Ironically, of all patterns, it may be the one most affected by semantic diffusion.
That's not what I have in mind, though. There's something else with that interface.
It's not its CRUD design, either. You could consider that a leaky abstraction, since it strongly implies a sort of persistent data store. That's a worthwhile discussion, but not what I have in mind today. There's something else wrong with the interface.
Consistency #
Look closer at the parameters for the various methods. The Create and ReadReservations methods take a restaurantId parameter, but the other three don't.
Why does the ReadReservations method take a restaurantId while ReadReservation doesn't? Why is that parameter required for Create, but not for Update? That doesn't seem consistent.
The reason is that each reservation has an ID (a GUID). Once the reservation exists, you can uniquely identify it to read, update, or delete it.
As the restaurantId parameter suggests, however, this interface is part of a multi-tenant code base. This code base implements an online restaurant reservation system as a REST API. It's an online service where each restaurant is a separate tenant.
While each reservation has a unique ID, the system still needs to associate it with a restaurant. Thus, the Create method must take a restaurantId parameter in order to associate the reservation with a restaurant.
Once the reservation is stored, however, it's possible to uniquely identify it with the ID. The ReadReservation, Update, and Delete methods need only the id to work.
On the other hand, when you're not querying on reservation ID, you'll need to identify the restaurant, as with the ReadReservations method. If you didn't identify the restaurant in that method, you'd get all reservations in the requested range, from all tenants. That's not what you want. Therefore, you must supply the restaurantId to limit the query.
The interface is inconsistent, but also allows the underlying implementation to leak through.
Implied implementation detail #
If I told you that the implementation of IReservationsRepository is based on a relational database, can you imagine the design? You may want to stop reading and see if you can predict what the database looks like.
The interface strongly implies a design like this:
    CREATE TABLE [dbo].[Reservations] (
        [Id]           INT IDENTITY (1, 1) NOT NULL,
        [At]           DATETIME2 (7)       NOT NULL,
        [Name]         NVARCHAR (50)       NOT NULL,
        [Email]        NVARCHAR (50)       NOT NULL,
        [Quantity]     INT                 NOT NULL,
        [PublicId]     UNIQUEIDENTIFIER    NOT NULL,
        [RestaurantId] INT                 NOT NULL,
        PRIMARY KEY CLUSTERED ([Id] ASC),
        CONSTRAINT [AK_PublicId] UNIQUE NONCLUSTERED ([PublicId] ASC)
    );
What I wrote above is even clearer now. You can't create a row in that table without supplying a RestaurantId, since that column has a NOT NULL constraint.
The PublicId column has a UNIQUE constraint, which means that you can uniquely read and manipulate a single row when you have an ID.
Since all reservations are in a single table, any query not based on PublicId should also filter on RestaurantId. If it doesn't, the result set could include reservations from all restaurants.
Other interpretations #
Is the above relational database design the only possible implementation? Perhaps not. You could implement the interface based on a document database as well. It'd be natural to store each reservation as a separate document with a unique ID. Again, once you have the ID, you can directly retrieve and manipulate the document.
Other implementations become harder, though. Imagine, for example, that you want to shard the database design: Each restaurant gets a separate database. Or perhaps, more realistically, you distribute tenants over a handful of databases, perhaps partitioned on physical location, or some other criterion.
With such a design, the ReadReservation, Update, and Delete methods become less efficient. While you should be able to identify the correct shard if you have a restaurant ID, you don't have that information. Instead, you'll have to attempt the operation on all databases, thereby eliminating most sharding benefits.
In other words, the absence of the restaurantId parameter from some of the methods suggests certain implementation details.
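The sharding cost can be made concrete with a small sketch. Everything here is hypothetical and not from the article's code base: the simplified Reservation record, the in-memory Shard class, and the modulo routing rule are all assumptions chosen only to contrast a lookup that knows the tenant with one that doesn't.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the article's Reservation class.
public sealed record Reservation(Guid Id, string Name);

// One shard: an in-memory stand-in for a separate database.
public sealed class Shard
{
    private readonly Dictionary<Guid, Reservation> rows = new();

    // Counts how many times this shard has been queried.
    public int Lookups { get; private set; }

    public void Add(Reservation r) => rows[r.Id] = r;

    public Reservation? Find(Guid id)
    {
        Lookups++;
        return rows.TryGetValue(id, out var r) ? r : null;
    }
}

public sealed class ShardedReservations
{
    private readonly Shard[] shards;

    public ShardedReservations(int shardCount)
    {
        shards = Enumerable.Range(0, shardCount)
            .Select(_ => new Shard())
            .ToArray();
    }

    public int TotalLookups => shards.Sum(s => s.Lookups);

    // Assumed routing rule: non-negative restaurant ID modulo shard count.
    private Shard ShardFor(int restaurantId) =>
        shards[restaurantId % shards.Length];

    public void Create(int restaurantId, Reservation r) =>
        ShardFor(restaurantId).Add(r);

    // Without a restaurant ID, every shard is a candidate,
    // so the shards are tried one after another.
    public Reservation? ReadReservation(Guid id)
    {
        foreach (var shard in shards)
        {
            var r = shard.Find(id);
            if (r is not null)
                return r;
        }
        return null;
    }

    // With a restaurant ID, exactly one shard is queried.
    public Reservation? ReadReservation(int restaurantId, Guid id) =>
        ShardFor(restaurantId).Find(id);
}
```

With four shards and a reservation stored for restaurant 3, the tenant-aware overload queries a single shard, while the ID-only overload has to work its way through the shards until it finds the row.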
Leak by omission #
I admit that I rarely run into this sort of problem. Usually, a leaky abstraction manifests by a language construct that contains too much information. This is typically an interface or base class that exposes implementation details by either requiring too specific inputs, or by returning data that reveals implementation details.
For a data access abstraction like the above 'repository', this most frequently happens when people design such an interface around an object-relational mapper (ORM). A class like Reservation would then typically carry ORM details around. Perhaps it inherits from an ORM base class, or perhaps (this is very common) it has a parameterless constructor or getters and setters that model the relationships of the database (these are often called navigation properties).
Another common example of a leaky abstraction might be the presence of Connect and Disconnect methods. The Connect method may even take a connectionString parameter, clearly leaking that some sort of database is involved.
Yet another example is CQS-violating designs where a Create method returns a database ID.
All such leaky abstractions are leaky because they expose or require too much information.
The example in this article, on the contrary, is leaky because of a lack of detail.
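For contrast, the more common kind of leak, where the interface says too much, might look like the following sketch. The interface name and the empty Reservation placeholder are hypothetical; the members only illustrate the Connect, Disconnect, and ID-returning Create examples mentioned above.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical placeholder for the article's Reservation class.
public sealed class Reservation { }

// An illustrative interface that leaks by exposing too much.
public interface ILeakyReservationsRepository
{
    // Connection management reveals that a database sits behind the interface.
    void Connect(string connectionString);
    void Disconnect();

    // A command that also returns a database-generated ID violates CQS
    // and exposes how the store assigns identity.
    Task<int> Create(int restaurantId, Reservation reservation);
}
```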
Dependency Inversion Principle #
Ironically, I originally arrived at the above design because I followed the Dependency Inversion Principle (DIP). The clients of IReservationsRepository are ASP.NET Controller actions, like this Delete method:
    [HttpDelete("restaurants/{restaurantId}/reservations/{id}")]
    public async Task Delete(int restaurantId, string id)
    {
        if (Guid.TryParse(id, out var rid))
        {
            var r = await Repository.ReadReservation(rid)
                .ConfigureAwait(false);
            await Repository.Delete(rid).ConfigureAwait(false);
            if (r is { })
                await PostOffice.EmailReservationDeleted(restaurantId, r)
                    .ConfigureAwait(false);
        }
    }
As Robert C. Martin explains about the Dependency Inversion Principle:
"clients [...] own the abstract interfaces"
Robert C. Martin, APPP, chapter 11
From that principle, it follows that the Delete method decides what IReservationsRepository.Delete looks like. It seems that the Controller action doesn't need to tell the Repository about the restaurantId when calling its Delete method. Supplying the reservation ID (rid) is enough.
There are, however, various problems with the above code. If the DIP suggests that the restaurantId is redundant when calling Repository.Delete, then why is it required when calling PostOffice.EmailReservationDeleted? This seems inconsistent.
Indeed it is.
As I often do, I arrived at the above Delete method via outside-in TDD, but as I observed a decade ago, TDD alone doesn't guarantee good design. Even when following the red-green-refactor checklist, I often fail to spot problems right away.
That's okay. TDD doesn't guarantee perfection, but done well it should set you up so that you can easily make changes.
Possible remedies #
I can think of two ways to address the problem. The simplest solution is to make the interface consistent by adding a restaurantId parameter to all methods:
    public interface IReservationsRepository
    {
        Task Create(int restaurantId, Reservation reservation);

        Task<IReadOnlyCollection<Reservation>> ReadReservations(
            int restaurantId, DateTime min, DateTime max);

        Task<Reservation?> ReadReservation(int restaurantId, Guid id);

        Task Update(int restaurantId, Reservation reservation);

        Task Delete(int restaurantId, Guid id);
    }
This is the simplest solution, and the one that I prefer. In a future article, I'll show how it enabled me to significantly simplify the code base.
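As a rough illustration of what the consistent signatures make possible, here is a minimal in-memory sketch. It is an assumption, not the article's implementation: the FakeReservationsRepository class, the simplified Reservation record, and the choice to key rows on both restaurant ID and reservation ID are all hypothetical, and only a subset of the interface's methods is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the article's Reservation class.
public sealed record Reservation(Guid Id, DateTime At, string Name);

// In-memory fake covering three of the five methods.
// Rows are keyed on (restaurantId, reservation ID), so every
// operation is scoped to a single tenant.
public sealed class FakeReservationsRepository
{
    private readonly Dictionary<(int, Guid), Reservation> rows = new();

    public Task Create(int restaurantId, Reservation reservation)
    {
        rows[(restaurantId, reservation.Id)] = reservation;
        return Task.CompletedTask;
    }

    public Task<Reservation?> ReadReservation(int restaurantId, Guid id)
    {
        // r is null when no row exists for this tenant and ID.
        rows.TryGetValue((restaurantId, id), out var r);
        return Task.FromResult<Reservation?>(r);
    }

    public Task Delete(int restaurantId, Guid id)
    {
        rows.Remove((restaurantId, id));
        return Task.CompletedTask;
    }
}
```

In this sketch, a reservation created for one restaurant can't be read or deleted through another restaurant's ID, and an implementation is free to use the restaurantId for routing or to ignore it.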
For good measure, though, I should also mention the opposite solution. Completely drain the interface of restaurantId parameters:
    public interface IReservationsRepository
    {
        Task Create(Reservation reservation);

        Task<IReadOnlyCollection<Reservation>> ReadReservations(
            DateTime min, DateTime max);

        Task<Reservation?> ReadReservation(Guid id);

        Task Update(Reservation reservation);

        Task Delete(Guid id);
    }
How can that work in practice? After all, an implementation must have a restaurant ID in order to create a new row in the database.
It's possible to solve that problem by making the restaurantId an implementation detail. You could make it a constructor parameter for the concrete class, but this gives you another problem. Your Composition Root doesn't know the restaurant ID - after all, it's a run-time argument.
In a method like the above Delete Controller action, you'd have to translate the restaurantId run-time argument to an IReservationsRepository instance. There are various ways around that kind of problem, but they typically involve some kind of factory. That'd be yet another interface:
    public interface IReservationsRepositoryFactory
    {
        IReservationsRepository Create(int restaurantId);
    }
That just makes the API more complicated. Factories give Dependency Injection a bad reputation. For that reason, I don't like this second alternative.
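To see the extra moving parts, consider this sketch of how a client might use such a factory. It rests on several assumptions: the repository interface is trimmed to three methods, Reservation is a simplified record, the in-memory classes are hypothetical, and the controller is a plain class without the ASP.NET attributes or the email step.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the article's Reservation class.
public sealed record Reservation(Guid Id, string Name);

// Trimmed version of the parameter-free interface (subset of methods).
public interface IReservationsRepository
{
    Task Create(Reservation reservation);
    Task<Reservation?> ReadReservation(Guid id);
    Task Delete(Guid id);
}

public interface IReservationsRepositoryFactory
{
    IReservationsRepository Create(int restaurantId);
}

// Hypothetical implementation: the restaurant ID is a constructor argument.
public sealed class InMemoryReservationsRepository : IReservationsRepository
{
    private readonly int restaurantId;
    private readonly Dictionary<(int, Guid), Reservation> store;

    public InMemoryReservationsRepository(
        int restaurantId, Dictionary<(int, Guid), Reservation> store)
    {
        this.restaurantId = restaurantId;
        this.store = store;
    }

    public Task Create(Reservation reservation)
    {
        store[(restaurantId, reservation.Id)] = reservation;
        return Task.CompletedTask;
    }

    public Task<Reservation?> ReadReservation(Guid id)
    {
        store.TryGetValue((restaurantId, id), out var r);
        return Task.FromResult<Reservation?>(r);
    }

    public Task Delete(Guid id)
    {
        store.Remove((restaurantId, id));
        return Task.CompletedTask;
    }
}

public sealed class InMemoryReservationsRepositoryFactory
    : IReservationsRepositoryFactory
{
    // Shared by all repositories that this factory creates.
    private readonly Dictionary<(int, Guid), Reservation> store = new();

    public IReservationsRepository Create(int restaurantId) =>
        new InMemoryReservationsRepository(restaurantId, store);
}

// Simplified stand-in for the Controller; it depends on the factory
// because the restaurant ID is only known at run time.
public sealed class ReservationsController
{
    private readonly IReservationsRepositoryFactory factory;

    public ReservationsController(IReservationsRepositoryFactory factory) =>
        this.factory = factory;

    public async Task Delete(int restaurantId, string id)
    {
        if (Guid.TryParse(id, out var rid))
        {
            // Extra step: turn the run-time argument into a repository.
            var repository = factory.Create(restaurantId);
            await repository.Delete(rid).ConfigureAwait(false);
        }
    }
}
```

Every action that touches reservations would need the same factory.Create call before it can do any work, which is the added complexity described above.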
Conclusion #
Leaky abstractions usually express themselves as APIs that expose too many details; the implementation details leak through.
In this example, however, a leaky abstraction manifested as a lack of consistency. Some methods require a restaurantId argument, while others don't - because one particular implementation doesn't need that information.
It turned out, though, that when I was trying to simplify the overall code, this API design held me back. Consistently adding restaurantId parameters to all repository methods solved the problem. A future article tells that tale.