2

Declarative and Immutable Pipeline of Transformations

 1 year ago
source link: https://www.yegor256.com/2022/08/10/xsline-immutable-pipeline.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Declarative and Immutable Pipeline of Transformations



A few months ago I made a small Java library, which is worth explaining since the design of its classes and interfaces is pretty unusual. It’s very much object-oriented for a pretty imperative task: building a pipeline of document transformations. The goal was to do this in a declarative and immutable way, and in Java. Well, as much as it’s possible.

Barfuss (2005) by Til Schweiger

Barfuss (2005) by Til Schweiger

Let’s say you have a document, and you have a collection of transformations, each of which will do something with the document. Each transformation, for example, is a small piece of Java code. You want to build a list of transformations and then pass a document through this list.

First, I made an interface Shift (instead of the frequently used and boring “transformation”):

interface Shift {
  Document apply(Document doc);
}

Then I made an interface Train (this is the name I made up for the collection of transformations) and its default implementation:

interface Train {
  Train with(Shift shift);
  Iterator<Shift> iterator();
}
class TrDefault implements Train {
  private final Iterable<Shift> list;
  @Override
  Train with(Shift shift) {
    final Collection<Shift> items = new LinkedList<>();
    for (final Shift item : this.list) {
        items.add(item);
    }
    items.add(shift);
    return new TrDefault(items);
  }
  @Override
  public Iterator<Shift> iterator() {
      return this.list.iterator();
  }
}

Ah, I forgot to tell you. I’m a big fan of immutable objects. That’s why the Train doesn’t have a method add, but instead has with. The difference is that add modifies the object, while with makes a new one.

Now, I can build a train of shifts with TrDefault, a simple default implementation of Train, assuming ShiftA and ShiftB are already implemented:

Train train = new TrDefault()
  .with(new ShiftA())
  .with(new ShiftB());

Then I created an Xsline class (it’s “XSL” + “pipeline”, since in my case I’m managing XML documents and transform them using XSL stylesheets). An instance of this class encapsulates an instance of Train and then passes a document through all its transformations:

Document input = ...;
Document output = new Xsline(train).pass(input);

So far so good.

Now, I want all my transformations to log themselves. I created StLogged, a decorator of Shift, which encapsulates the original Shift, decorates its method apply, and prints a message to the console when the transformation is completed:

class StLogged implements Shift {
  private final Shift origin;
  @Override
  Document apply(Document before) {
    Document after = origin.apply(before);
    System.out.println("Transformation completed!");
    return after;
  }
}

Now, I have to do this:

Train train = new TrDefault()
  .with(new StLogged(new ShiftA()))
  .with(new StLogged(new ShiftB()));

Looks like a duplication of new StLogged(, especially with a collection of a few dozen shifts. To get rid of this duplication I created a decorator for Train, which on the fly decorates shifts that it encapsulates, using StLogged:

Train train = new TrLogged(new TrDefault())
  .with(new ShiftA()))
  .with(new ShiftB());

In my case, all shifts are doing XSL transformations, taking XSL stylesheets from files available in classpath. That’s why the code looks like this:

Train train = new TrLogged(new TrDefault())
  .with(new StXSL("stylesheet-a.xsl")))
  .with(new StXSL("stylesheet-b.xsl")));

There is an obvious duplication of new StXSL(...), but I can’t simply get rid of it, since the method with expects an instance of Shift, not a String. To solve this, I made the Train generic and created TrClasspath decorator:

Train<String> train = new TrClasspath<>(new TrDefault<>())
  .with("stylesheet-a.xsl"))
  .with("stylesheet-b.xsl"));

TrClasspath.with() accepts String, turns it into StXSL and passes to TrDefault.with().

Pay attention to the snippet above: the train is now of type Train<String>, not Train<Shift>, as would be required by Xsline. The question now is: how do we get back to Train<Shift>?

Ah, I forgot to mention. I wanted to design this library with one important principle in mind, suggested in 2014: all objects may only implement methods from their interfaces. That’s why, I couldn’t just add a method getEncapsulatedTrain() to TrClasspath.

I introduced a new interface Train.Temporary<T> with a single method back() returning Train<T>. The class TrClasspath implements it and I can do this:

Train<Shift> train = new TrClasspath<>(new TrDefault<>())
  .with("stylesheet-a.xsl"))
  .with("stylesheet-b.xsl"))
  .back();

Next I decided to get rid of the duplication of .with() calls. Obviously, it would be easier to have the ability to provide a list of file names as an array of String and build the train from it. I created a new class TrBulk, which does exactly that:

Iterable<String> names = Arrays.asList(
  "stylesheet-a.xsl",
  "stylesheet-b.xsl"
);
Train<Shift> train = new TrBulk<>(
  new TrClasspath<>(
    new TrDefault<>()
  )
).with(names).back();

With this design I can construct the train in almost any possible way.

See, for example, how we use it here and here.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK