Java - Hashids

 6 years ago
source link: http://hashids.org/java/

Java version

Features

  1. Create short unique ids from numbers (positive numbers & zero).
  2. Allow custom alphabet as well as salt — so ids are unique only to you.
  3. Incremental input is mangled to stay unguessable.
  4. Code is tiny (~350 lines), fast and does not depend on external libraries.

How does it work?

Hashids works similarly to the way integers are converted to hex, but with a few exceptions:

  1. The alphabet is not base16, but base62 by default.
  2. The alphabet is also shuffled based on salt.
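
The second point can be pictured as a deterministic shuffle keyed by the salt: the same alphabet and salt always give the same ordering, and a different salt generally gives a different one. The sketch below is illustrative only. The class name `SaltShuffle` and the exact swap rule are assumptions, since this text does not spell out the steps Hashids itself uses.

```java
final class SaltShuffle {

    /** Returns a permutation of the alphabet that depends only on the alphabet and the salt. */
    static String shuffle(String alphabet, String salt) {
        if (salt.isEmpty()) {
            return alphabet; // nothing to key the shuffle on, so leave the order unchanged
        }
        char[] chars = alphabet.toCharArray();
        // Walk from the last position down, swapping each position with an
        // earlier index derived from the salt characters (assumed rule).
        for (int i = chars.length - 1, v = 0, p = 0; i > 0; i--, v++) {
            v %= salt.length();        // cycle through the salt
            int code = salt.charAt(v); // numeric value of the current salt character
            p += code;                 // running sum of the salt characters seen so far
            int j = (code + v + p) % i; // target index, always in [0, i - 1]
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }
}
```

Because every step is a swap, the result always contains exactly the same characters as the input alphabet, only in a different order.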

This JavaScript function shows regular conversion of integers to hex. It's part of Hashids (although this is a modified example):

function toHex(input) {

  var hash = "",
    alphabet = "0123456789abcdef",
    alphabetLength = alphabet.length;

  // Peel off the lowest digit on each pass and prepend its character.
  do {
    hash = alphabet[input % alphabetLength] + hash;
    input = Math.floor(input / alphabetLength);
  } while (input);

  return hash;

}
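
Since this page covers the Java version, here is the same loop sketched in Java and generalized to any alphabet, so that hex and base62 are just two choices of the `alphabet` argument. The class name `BaseN`, the `decode` helper, and the particular ordering of the `BASE62` string are assumptions made for illustration. This is plain base conversion, without the salt or shuffling that Hashids adds.

```java
final class BaseN {

    // Assumed ordering for illustration; any string of 62 distinct characters works.
    static final String BASE62 =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Converts a non-negative number to a string using the given alphabet as digits. */
    static String encode(long input, String alphabet) {
        if (input < 0) {
            throw new IllegalArgumentException("input must be zero or positive");
        }
        int base = alphabet.length();
        StringBuilder out = new StringBuilder();
        do {
            out.append(alphabet.charAt((int) (input % base))); // lowest digit first
            input /= base;
        } while (input > 0);
        return out.reverse().toString(); // digits were collected in reverse order
    }

    /** Reverses encode(): reads the string as a number written in the given alphabet. */
    static long decode(String text, String alphabet) {
        int base = alphabet.length();
        long value = 0;
        for (int i = 0; i < text.length(); i++) {
            int digit = alphabet.indexOf(text.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("character not in alphabet: " + text.charAt(i));
            }
            value = value * base + digit;
        }
        return value;
    }
}
```

With the hex alphabet, `BaseN.encode(255, "0123456789abcdef")` gives `"ff"`, matching what `toHex(255)` returns.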

Are there any alternatives?

Yes, there are a few and you should pick the most appropriate for the job. Hashids is not perfect for everything.

  1. Base64 encode. This is the most straightforward approach, and most languages have these functions built in. If you don't need the extras Hashids has to offer, this method will work just fine. It'll probably be faster too.
  2. Generate ids based on timestamp. If you can afford a certain degree of collisions, you could compose an id that's built on the fly. Use a counter (if you have one) + timestamp (even better if in milliseconds) + some system value (either an IP address or some machine id) + a random integer. Many big companies implement this approach because it works well in distributed systems. These ids are generated independently of each other and the risk of collisions is so tiny it's negligible.
  3. If you need your ids to consist of only numbers, check out Optimus. It's based on Knuth's integer hash method and produces obfuscated integer ids (and does it faster too). There are PHP and Go implementations.
  4. Nano ID. Check out Nano ID, it's available for several languages as well.
  5. Check out how others do it:

    - How does Instagram generate ids?
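
Two of the alternatives above are small enough to sketch in Java. The first half Base64-encodes the raw bytes of a number using the standard `java.util.Base64` class. The second half is a multiplicative scheme in the spirit of Knuth's integer hash: multiply by an odd constant modulo 2^31, then XOR with a fixed value. The class name `IdAlternatives`, the method names, and the two constants are assumptions chosen for illustration, and this is not Optimus's actual code.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Base64;

final class IdAlternatives {

    // --- Base64 of the raw 8-byte value: plain encoding, no obfuscation ---

    static String toBase64(long id) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(id).array();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    static long fromBase64(String text) {
        return ByteBuffer.wrap(Base64.getUrlDecoder().decode(text)).getLong();
    }

    // --- Multiplicative obfuscation of integers in [0, 2^31 - 1] ---

    static final long MASK = 0x7FFFFFFFL;       // keeps the low 31 bits
    static final long MULTIPLIER = 1580030173L; // assumed constant; any odd value is invertible mod 2^31
    static final long XOR = 1163945558L;        // assumed constant below 2^31
    // Modular inverse of MULTIPLIER modulo 2^31, computed once at class load.
    static final long INVERSE = BigInteger.valueOf(MULTIPLIER)
            .modInverse(BigInteger.valueOf(MASK + 1))
            .longValue();

    static long obfuscate(long id) {
        if (id < 0 || id > MASK) {
            throw new IllegalArgumentException("id must be in [0, 2^31 - 1]");
        }
        return ((id * MULTIPLIER) & MASK) ^ XOR;
    }

    static long reveal(long obfuscated) {
        return ((obfuscated ^ XOR) * INVERSE) & MASK;
    }
}
```

Both halves are reversible by anyone who knows the scheme (and, for the second, the constants), so neither is a substitute for encryption.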

What not to do

  1. Do not try to encode negative numbers. It won't work. The library currently supports only positive numbers and zero. If you're trying to use numbers as flags for something, simply designate the first N digits as internal flags.
  2. Do not encode strings. We've had several requests to add this feature because "it seems so easy to add". We will not add it, for security reasons: doing so would encourage people to encode sensitive data, like passwords. This is the wrong tool for that.
  3. Do not encode sensitive data. This includes sensitive integers, like numeric passwords or PINs. This is not a true encryption algorithm. There are people who dedicate their lives to cryptography and there are plenty of more appropriate algorithms: bcrypt, md5, aes, sha1, blowfish.
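
The flag suggestion in the first point can be sketched as follows. Here a single leading digit records the sign, so a negative value becomes a non-negative number that is safe to encode. The class `FlagPacking`, the method names, and the choice of `1` for positive and `2` for negative are all hypothetical; the original text only says to reserve the first N digits.

```java
final class FlagPacking {

    /** Prefixes the magnitude with a flag digit: 1 for zero or positive, 2 for negative. */
    static long pack(long value) {
        if (value == Long.MIN_VALUE) {
            throw new IllegalArgumentException("magnitude does not fit in a long");
        }
        int flag = value < 0 ? 2 : 1;
        long magnitude = Math.abs(value);
        // Throws NumberFormatException if the combined digits overflow a long.
        return Long.parseLong(flag + Long.toString(magnitude));
    }

    /** Splits the flag digit off again and restores the sign. */
    static long unpack(long packed) {
        String digits = Long.toString(packed);
        if (packed < 0 || digits.length() < 2) {
            throw new IllegalArgumentException("not a packed value: " + packed);
        }
        long magnitude = Long.parseLong(digits.substring(1));
        switch (digits.charAt(0)) {
            case '1': return magnitude;
            case '2': return -magnitude;
            default: throw new IllegalArgumentException("unknown flag in: " + packed);
        }
    }
}
```

For example, `pack(-42)` yields `242`, which is a valid input for an encoder that accepts only zero and positive numbers, and `unpack(242)` returns `-42`.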

Collisions

There are no collisions because the method is based on integer-to-hex conversion, which maps every number to exactly one string and back. As long as you don't change constructor arguments midway, the generated output will stay unique to your salt.

Additionally, we encourage you to pre-generate a large number of ids for testing purposes — to analyze their uniqueness, whether they look random enough for your project and what your upper integer limit is (which usually depends on your system/language).

Please note that generated ids are case-sensitive by default ("Aaa" is not the same id as "aaa").
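
The pre-generation test suggested above can be as small as a loop and a set. The sketch below counts how many generated ids repeat an earlier one. The class `UniquenessCheck` and its `countCollisions` method are hypothetical names, and the encoder is passed in as a function so that you can plug in whichever encoder you actually use.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongFunction;

final class UniquenessCheck {

    /**
     * Encodes the numbers 0 .. count - 1 and returns how many results
     * duplicated an id that had already been produced.
     */
    static long countCollisions(LongFunction<String> encoder, long count) {
        Set<String> seen = new HashSet<>();
        long collisions = 0;
        for (long n = 0; n < count; n++) {
            // Set.add returns false when the id is already present.
            if (!seen.add(encoder.apply(n))) {
                collisions++;
            }
        }
        return collisions;
    }
}
```

A `HashSet<String>` compares ids case-sensitively, which matches the default behaviour described above. If you store ids somewhere that compares strings case-insensitively, normalize them the same way before adding them to the set so the test reflects what your storage will see.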

Why "hashids"?

Originally the project referred to generated ids as hashes, and obviously the name hashids has the word hash in it. Technically, these generated ids cannot be called hashes, since a cryptographic hash is a one-way mapping (it cannot be reversed).

However, when people search for a solution, like a "youtube hash" or "bitly short id", they usually don't really care about the technical details. So hashids stuck as a term for an algorithm to obfuscate numbers.

Decoding without salt

Do you have a question or comment that involves "security" and "hashids" in the same sentence? Don't use Hashids. Here are some ways to decode:

  1. Use a brute-force attack. These ids are short, so with a big dictionary this shouldn't be too hard.
  2. Use an even faster method with this blog post.

How to contribute

If you've found a bug, please open a GitHub issue in the appropriate repository. Bonus points if you submit a pull request with it.

If an implementation in your favorite language is missing, feel free to port it over from one of the existing versions. There are still plenty of languages to contribute in: Nu, Groovy, Racket, Scheme, Tcl, Dylan, Lolcode, Factor?

We try to keep the library versions compatible. If you see an outdated version in an existing implementation, please open a GitHub issue in that repository, or add your +1 to the issue if one already exists.

Are you using Hashids?

Whether it's an open-source project or a commercial product, tell us at https://github.com/hashids/hashids.github.io/wiki/Who's-Using-Hashids !

Hashids elsewhere

If you have a question about Hashids, a lot has already been answered. Try checking one of these:

License

All hashids libraries are under MIT license. You can use them in open source projects and commercial products. Don't break the Internet. Kthxbye.
