Bigtable Pagination in Java

 2 years ago
source link: http://www.java-allandsundry.com/2022/12/bigtable-pagination-in-java.html

Consider a set of rows stored in a Bigtable table called “people”:

[Image: rows stored in the “people” table]

My objective is to page through these rows a few records at a time, say with each page containing 4 records:

Page 1:

[Image: records on page 1]

Page 2:

[Image: records on page 2]

Page 3:

[Image: records on page 3]

High-Level Approach

A high-level approach to doing this is to introduce two parameters:
  • Offset: the point from which to retrieve the records.
  • Limit: the number of records to retrieve per page.
Limit is 4 in all cases in my example. Offset indicates where to retrieve the next set of records from. Bigtable orders rows lexicographically by row key, so one way to express the offset is to use the key of the last record on the previous page. Given this, and using an empty string as the marker offset for the first page, the offset and limit for each page look like this:
Page 1 — offset: “”, limit: 4
Page 2 — offset: “person#id-004”, limit: 4
Page 3 — offset: “person#id-008”, limit: 4
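To make the offset and limit idea concrete, here is a minimal in-memory sketch in Kotlin. It does not touch Bigtable: a TreeMap stands in for the table, and the function name pageOf, the row count of 10, and the name-N values are assumptions made purely for illustration. For ASCII keys like these, the map's string ordering matches a byte-wise lexicographic ordering.

```kotlin
import java.util.TreeMap

// Hypothetical in-memory stand-in for the "people" table: a map sorted by row key.
fun pageOf(table: TreeMap<String, String>, offset: String, limit: Int): List<String> {
    // An empty offset marks the first page; otherwise start strictly after the offset key.
    val tail = if (offset.isEmpty()) table else table.tailMap(offset, false)
    return tail.keys.take(limit)
}

fun main() {
    val table = TreeMap<String, String>()
    // Assumed sample data: ten rows keyed person#id-001 .. person#id-010.
    for (i in 1..10) table["person#id-%03d".format(i)] = "name-$i"

    var offset = ""
    while (true) {
        val keys = pageOf(table, offset, 4)
        if (keys.isEmpty()) break
        println("offset=\"$offset\" -> $keys")
        // The last key on this page becomes the offset for the next page.
        offset = keys.last()
    }
}
```

The loop feeds the last key of each page back in as the next offset, which is the same hand-off the pages above describe.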
The challenge now is in figuring out how to retrieve a set of records given a prefix, an offset, and a limit.

Retrieving records given a prefix, offset, limit

The Bigtable Java client provides a “readRows” API that takes a Query and returns a stream of rows, which can be collected into a list. The snippets here are written in Kotlin:
import com.google.cloud.bigtable.data.v2.BigtableDataClient
import com.google.cloud.bigtable.data.v2.models.Query
import com.google.cloud.bigtable.data.v2.models.Row
val rows: List<Row> = bigtableDataClient.readRows(query).toList()
Now, Query has a variant that takes in a prefix and returns rows matching the prefix:
import com.google.cloud.bigtable.data.v2.BigtableDataClient
import com.google.cloud.bigtable.data.v2.models.Query
import com.google.cloud.bigtable.data.v2.models.Row
val query: Query = Query.create("people").limit(limit).prefix(keyPrefix)
val rows: List<Row> = bigtableDataClient.readRows(query).toList()       
This works for the first page. For subsequent pages, however, the offset needs to be accounted for.
One way to do this is to use a Query that takes a range:
import com.google.cloud.bigtable.data.v2.BigtableDataClient
import com.google.cloud.bigtable.data.v2.models.Query
import com.google.cloud.bigtable.data.v2.models.Row
import com.google.cloud.bigtable.data.v2.models.Range
val range: Range.ByteStringRange =
    Range.ByteStringRange
        .unbounded()
        .startOpen(offset)
        .endOpen(end)
val query: Query = Query.create("people")
    .limit(limit)
    .range(range)
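The effect of an open start and an open end can be checked without Bigtable. The sketch below uses a TreeSet as an assumed stand-in for the sorted row keys; openRange is a hypothetical helper name, not a client API.

```kotlin
import java.util.TreeSet

// Hypothetical helper: keys strictly between the two bounds, capped at `limit`.
fun openRange(
    keys: TreeSet<String>,
    startExclusive: String,
    endExclusive: String,
    limit: Int
): List<String> =
    keys.subSet(startExclusive, false, endExclusive, false).take(limit)

fun main() {
    val keys = TreeSet(listOf("a", "b", "c", "d", "e"))
    // Open on both ends: neither "b" nor "e" is part of the result.
    println(openRange(keys, "b", "e", 4)) // [c, d]
}
```

Because the start is open, passing the last key of the previous page as the start never returns that key a second time.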
The difficulty is figuring out what the end of the range should be. This is where a neat utility in the Bigtable Java library comes in: given a prefix of “abc”, it calculates the end of the range to be “abd”, so the range covers every key that starts with “abc”.
import com.google.cloud.bigtable.data.v2.models.Range
val range = Range.ByteStringRange.prefix("abc")
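The idea behind turning “abc” into “abd” can be sketched in a few lines. This is an illustration of the technique, not the library's own implementation: prefixEnd is a hypothetical helper, and its handling of trailing 0xFF bytes is an assumption about how such a bound is usually computed.

```kotlin
// Hypothetical helper, not part of the Bigtable client: returns the smallest key
// that is greater than every key starting with `prefix`, or null if there is none.
fun prefixEnd(prefix: ByteArray): ByteArray? {
    // Trailing 0xFF bytes cannot be incremented, so drop them first.
    var end = prefix.size
    while (end > 0 && prefix[end - 1] == 0xFF.toByte()) end--
    // Empty prefix or all 0xFF: the range has no upper bound.
    if (end == 0) return null
    val result = prefix.copyOf(end)
    // Increment the last remaining byte: 'c' (0x63) becomes 'd' (0x64).
    result[end - 1] = (result[end - 1] + 1).toByte()
    return result
}

fun main() {
    println(prefixEnd("abc".toByteArray())?.toString(Charsets.UTF_8)) // abd
}
```

Every key that starts with “abc” sorts at or after “abc” and strictly before “abd”, which is why this pair of bounds selects exactly the prefixed rows.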
Putting this all together, a query that fetches paginated rows at an offset looks like this:
val query: Query =
    Query.create("people")
        .limit(limit)
        .range(Range.ByteStringRange
            .prefix(keyPrefix)
            .startOpen(offset))
val rows: List<Row> = bigtableDataClient.readRows(query).toList()
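The combined behaviour (prefix, open start at the offset, limit) can be modelled in memory to check the expected pages. In this sketch a TreeSet stands in for the table, and scan and the sample keys are hypothetical. It also makes one assumption explicit: an empty offset is treated as "start at the prefix". The query above does not spell out how the first page is handled, so a real implementation may need to skip the open start when the offset is empty.

```kotlin
import java.util.TreeSet

// Hypothetical in-memory model of the range scan; keys are kept sorted,
// as Bigtable does with row keys.
fun scan(keys: TreeSet<String>, keyPrefix: String, offset: String, limit: Int): List<String> {
    // First page: start at the prefix itself. Later pages: start strictly after the offset.
    val from = if (offset.isEmpty()) keys.tailSet(keyPrefix, true) else keys.tailSet(offset, false)
    // Stop at the first key without the prefix, or once `limit` keys are collected.
    return from.asSequence().takeWhile { it.startsWith(keyPrefix) }.take(limit).toList()
}

fun main() {
    val keys = TreeSet<String>()
    // Assumed sample data: ten person rows plus rows under other prefixes.
    for (i in 1..10) keys.add("person#id-%03d".format(i))
    keys.add("order#id-001")
    keys.add("place#id-001")

    println(scan(keys, "person#", "", 4))
    println(scan(keys, "person#", "person#id-008", 4))
}
```

Rows under other prefixes sort before or after the person# rows, so the scan never returns them even when the last page is short.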
When returning the result, the final key needs to be returned as well so that it can be used as the offset for the next page. This can be done in Kotlin with the following type:
data class Page<T>(val data: List<T>, val nextOffset: String)
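A caller can then follow nextOffset from page to page. The sketch below repeats the Page type so that it runs on its own. toPage and readAll are hypothetical helpers, and reusing the current offset when a page comes back empty is an assumption, since the type above does not say what nextOffset holds on the last page.

```kotlin
data class Page<T>(val data: List<T>, val nextOffset: String)

// Hypothetical helper: the next offset is the last key on the page, or the
// current offset again when the page is empty.
fun toPage(keys: List<String>, currentOffset: String): Page<String> =
    Page(keys, keys.lastOrNull() ?: currentOffset)

// Follows nextOffset from page to page until an empty page comes back.
fun readAll(fetch: (String) -> Page<String>): List<String> {
    val all = mutableListOf<String>()
    var offset = ""
    while (true) {
        val page = fetch(offset)
        if (page.data.isEmpty()) return all
        all += page.data
        offset = page.nextOffset
    }
}

fun main() {
    val keys = (1..10).map { "person#id-%03d".format(it) }
    // Stand-in for the Bigtable query: keys strictly after the offset, 4 at a time.
    val fetch = { offset: String -> toPage(keys.filter { it > offset }.take(4), offset) }
    println(readAll(fetch).size) // 10
}
```

Stopping on an empty page costs one extra fetch when the row count is an exact multiple of the limit; that is a trade-off of this sketch, not something the original prescribes.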

Conclusion

I have a full example available here — this pulls in the right library dependencies and has all the mechanics of pagination wrapped into a working sample.
