Eclipse Vert.x client for Apache Pinot

 1 year ago
source link: https://vertx.io/blog/soc-vertx-pinot-client/

This blog post is informational: it is a teaser of upcoming changes to pinot-client, which is under active development, so the final released version may differ. The client is not currently available for download on Maven Central.

Introduction

Eclipse Vert.x is a toolkit for building reactive applications on the Java virtual machine. It provides asynchronous, non-blocking clients for different types of databases. Apache Pinot is a real-time distributed datastore for analytics workloads. The Apache Pinot Java Client uses AsyncHttpClient as its default transport. It is impractical to use in Eclipse Vert.x applications because every blocking call needs to be wrapped with executeBlocking.

The pinot-client Reactiverse project is a lightweight wrapper around the existing Pinot Java client. It also provides an alternative transport using vertx-web-client. This makes Apache Pinot more accessible to Vert.x users.

Advantages

  1. vertx-pinot-client offers an async and non-blocking API that is in line with the general philosophy of the Vert.x ecosystem.
  2. For little effort, we also get RxJava and Mutiny bindings for the client, thanks to Vert.x codegen.
  3. vertx-pinot-client should offer better performance because it uses the Vert.x Web Client as the underlying transport. This allows the application to stay on the same event loop when interacting with Pinot.

Usage

Add the pinot-client dependency to your project.

<dependency>
    <groupId>io.reactiverse</groupId>
    <artifactId>pinot-client</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>

You can create an instance of the client to interact with the Pinot cluster as follows.

String brokerUrl = "localhost:8000";
VertxPinotClientTransport transport = new VertxPinotClientTransport(vertx);
VertxConnection connection = VertxConnectionFactory.fromHostList(vertx, List.of(brokerUrl), transport);

Now we can use the connection to query the cluster. Here is an example that queries the top 10 home run scorers from the sample quickstart dataset.

String query = "select playerName, sum(homeRuns) AS totalHomeRuns from baseballStats where homeRuns > 0 group by playerID, playerName ORDER BY totalHomeRuns DESC limit 10";
connection
    .execute(query)
    .onSuccess(resultSetGroup -> {
        ResultSet results = resultSetGroup.getResultSet(0);
        System.out.println("Player Name\tTotal Home Runs");
        for (int i = 0; i < results.getRowCount(); i++) {
            System.out.println(results.getString(i, 0) + "\t" + results.getString(i, 1));
        }
    })
    .onFailure(Throwable::printStackTrace);

A more realistic demo is available at pizza-shop-demo. It models a pizza delivery service specializing in pizzas with Indian toppings. The service has a dashboard built on Apache Pinot to monitor its orders in real time. The dashboard deals with three types of data: products, users, and orders, and it uses the Vert.x Pinot Client to query the Pinot cluster.

[Figure: Order Dashboard Statistics]
[Figure: Order Dashboard Latest and Popular Items]

Challenges

I worked on this project as part of the Google Summer of Code 2023 program. During the implementation, I encountered several challenges.

Package-private org.apache.pinot.client.BrokerResponse class

The org.apache.pinot.client.BrokerResponse class is package-private. However, several executeQuery/executeQueryAsync methods in the PinotClientTransport interface return this type. This makes it difficult to implement the interface in an external package.

To work around this issue, we created an org.apache.pinot.client package inside our wrapper project. The VertxPinotClientTransport implementation currently lives inside this package for the same reason. This workaround only works for projects running in classpath mode, not for projects using the module path, because Java modules disallow split packages. To fix the issue at the source, I opened a PR in the apache/pinot repository. The PR is now merged. Once a new release of Pinot is available, I will update the client wrapper as well.

java.util.concurrent.Future return type

Some methods of the PinotClientTransport interface return a java.util.concurrent.Future. However, Future.get is a blocking call, which is undesirable. To work around the issue, we wrap every request and query call in an executeBlocking block. To avoid blocking calls altogether, we discussed replacing Future with CompletableFuture. That change has already been merged and will be available in the next release. Once it is available, we can update the wrapper client to adapt CompletableFutures to Vert.x Futures without much hassle and without having to offload tasks to the Vert.x worker pool.
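
To make the difference concrete, here is a minimal JDK-only sketch of the two return styles. The class and method names (FutureStyles, queryWithPlainFuture, queryWithCompletableFuture, awaitBlocking) are hypothetical stand-ins and not part of the Pinot or Vert.x APIs; the executor simply simulates a transport doing work on another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class FutureStyles {

    // Hypothetical stand-in for a transport method that returns a plain Future.
    public static Future<String> queryWithPlainFuture(ExecutorService pool, String query) {
        return pool.submit(() -> "result of: " + query);
    }

    // Hypothetical stand-in for a transport method that returns a CompletableFuture.
    public static CompletableFuture<String> queryWithCompletableFuture(ExecutorService pool, String query) {
        return CompletableFuture.supplyAsync(() -> "result of: " + query, pool);
    }

    // A plain Future only exposes its value through get(), which parks the calling thread.
    public static String awaitBlocking(Future<String> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Blocking style: the caller waits here until the result is ready.
            System.out.println(awaitBlocking(queryWithPlainFuture(pool, "select 1")));

            // Callback style: the caller registers a continuation and moves on.
            CompletableFuture<Void> done = queryWithCompletableFuture(pool, "select 1")
                .thenAccept(System.out::println);

            // Waiting here only keeps the demo alive until the callback has run.
            done.get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because a CompletableFuture is a CompletionStage, Vert.x 4 can likely bridge it with Future.fromCompletionStage, which is one way the wrapper could avoid executeBlocking once the new Pinot release is out.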

Missing ConnectionFactory method

The ConnectionFactory class is used to create Connection instances to interact with the Pinot cluster. There are multiple ways to connect to the cluster: using ZooKeeper, a controller, or a hard-coded broker list. Most of the methods feature an overload that accepts a custom PinotClientTransport implementation, but this was notably missing for the controller variant. Further, the constructors of the Connection class are package-private, preventing us from directly adding the missing method in our wrapper.

Again, to work around this issue, I created a bridge class in the org.apache.pinot.client package. At the same time, I opened a PR in the apache/pinot repository. That PR is also merged now, and the Vert.x client will be updated once a new release is available.

Bonus Fixes

Missing double-checked locking in ConnectionFactory

While working on the tasks outlined earlier, I (or, more accurately, my IDE's static checking) discovered that the double-checked locking mechanism in the ConnectionFactory class in pinot-client was broken. I opened a PR to fix it.
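
For readers unfamiliar with the idiom, the following is a generic sketch of correct double-checked locking in Java. It is not the Pinot code, and the post does not say what exactly was wrong there; a common way the idiom breaks is a shared field that is not declared volatile. The class name LazyHolder and the CREATED counter are hypothetical and exist only for the demonstration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyHolder {

    // Counts constructions so the demo can show that only one instance is built.
    public static final AtomicInteger CREATED = new AtomicInteger();

    // volatile ensures other threads never observe a partially constructed instance.
    private static volatile Object instance;

    public static Object getInstance() {
        Object local = instance;              // first check, without the lock
        if (local == null) {
            synchronized (LazyHolder.class) {
                local = instance;             // second check, under the lock
                if (local == null) {
                    CREATED.incrementAndGet();
                    local = new Object();
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) throws InterruptedException {
        // Several threads race to initialize; only one construction should happen.
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(LazyHolder::getInstance);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("instances created: " + CREATED.get());
    }
}
```

The local variable keeps the number of volatile reads low on the fast path; the correctness of the idiom comes from the volatile field combined with the second check inside the synchronized block.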

Broken Docker health checks in pizza-shop-demo

The Docker health checks in the original pizza-shop-demo were broken. They had been written using the nc utility, which is not included in the apache/pinot Docker image. It took me a lot of head scratching to discover that, but it is now fixed. The PR fixing it upstream is available here.

