Elasticsearch loves JAX-RS 2 and JSON-P
Elasticsearch offers several APIs for Java developers. The two main ones are:
- Native
- HTTP
The first one forces your application to join the cluster, which is rarely a good idea or even what you want, so that leaves the HTTP solution.
Before Elasticsearch 5.0 you were pretty much on your own, but since 5.0 Elasticsearch provides a Java REST client. It is built on top of Apache HttpClient and provides asynchronism through the async version of HttpClient.
So now you probably wonder why I mentioned JAX-RS. Simply because this client, still in beta as I write this post, has the same pitfall as the historical HTTP client "Jest": it relies on String for the requests.
Why is it an issue? Simply because it forces you to build strings, and you can make a lot of errors that way, typos included. You will probably answer "fine, but why not just use JSON-P with it?". That is true, but if you step back and check what the REST client really does (https://github.com/elastic/elasticsearch/tree/master/client/rest/src/main/java/org/elasticsearch/client), you will realize it can be overkill, and if you are in an EE server you may simply want to rely on what is already there: JAX-RS.
The nice things with JAX-RS for the HTTP part of the client are:
- it is there so no need to mess up dependencies
- if you need advanced async features you can still do it (typically CXF can also use http async client as transport)
- the API is stable, known and not a custom one
- it will be as expressive as the REST client since the API is still bound to HTTP at the moment
What do you lose with this solution compared to the REST client? Mainly the blacklisting of hosts... well, this is not really true, since CXF provides failover and circuit breaker features you can integrate with its JAX-RS client (see http://cxf.apache.org/docs/jax-rs-failover.html).
So we have justified going with JAX-RS, but why JSON-P? Here the answer is more trivial: because it is a built-in, native solution to build JSON that limits the sources of structural errors, and this is what we'll use to communicate with Elasticsearch.
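To make the String pitfall concrete, here is a minimal JDK-only sketch. The class and method names are hypothetical and only serve the illustration; they are not part of any Elasticsearch client. It shows how pasting a value into a JSON string breaks the document as soon as the value contains a quote, which is the kind of structural error a builder API such as JSON-P avoids by escaping values for you.

```java
// Hypothetical illustration, JDK only: not part of any Elasticsearch client.
final class StringQueryPitfall {
    // Naive approach: the value is pasted into the JSON text without any escaping.
    static String naiveTermQuery(final String field, final String value) {
        return "{\"query\":{\"term\":{\"" + field + "\":\"" + value + "\"}}}";
    }

    // Counts the double quotes that are not preceded by a backslash escape.
    // A well-formed JSON document always has an even number of them.
    static int unescapedQuotes(final String json) {
        int count = 0;
        boolean escaped = false;
        for (final char c : json.toCharArray()) {
            if (escaped) {
                escaped = false; // this character was escaped, skip it
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '"') {
                count++;
            }
        }
        return count;
    }
}
```

Calling naiveTermQuery("name", "blog") gives a valid document, while a value such as my "blog leaves an odd number of unescaped quotes, so Elasticsearch would reject the request at runtime.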
Simple Elasticsearch request with JAX-RS and JSON-P
So here is what our search API will look like (we won't handle other endpoints in this post since they work more or less the same way, and this one is the most common):
public JsonObject search(String index, String type,
JsonObject request,
long from, long size);
So we need to know what we query (index/type), what the query is (request) and the pagination to use (from/size).
How to implement it? Simply get a client and send the payload to the right endpoint!
import static java.util.logging.Level.WARNING;
import static javax.ws.rs.client.Entity.entity;
import static javax.ws.rs.core.MediaType.APPLICATION_JSON_TYPE;

import java.util.logging.Logger;
import javax.json.JsonObject;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.HttpHeaders;
import javax.ws.rs.core.Response;

public class ElasticsearchHttpClient implements AutoCloseable {
    private final Client client;

    public ElasticsearchHttpClient() {
        client = ClientBuilder.newClient();
    }

    public JsonObject search(final String base, // http://localhost:9200 for instance
                             final String index, final String type,
                             final JsonObject request,
                             final long from, final long size) {
        final WebTarget target = client.target(base)
                .path("{index}/{type}/_search")
                .resolveTemplate("index", index)
                .resolveTemplate("type", type);
        try {
            // if you need security: add .header("Authorization", token) after request(...)
            return target.queryParam("from", from)
                    .queryParam("size", size)
                    .request(APPLICATION_JSON_TYPE)
                    // the query itself travels in the body as JSON
                    .post(entity(request, APPLICATION_JSON_TYPE), JsonObject.class);
        } catch (final WebApplicationException wae) {
            // non 2xx answer: log the Elasticsearch error and propagate it with its status
            Object entity;
            try {
                entity = wae.getResponse().readEntity(JsonObject.class);
                Logger.getLogger(getClass().getName())
                        .log(WARNING, "Elasticsearch error: " + entity, wae);
            } catch (final RuntimeException e) { // unlikely, the error payload was not JSON
                entity = wae.getResponse().getEntity();
            }
            throw new WebApplicationException(Response
                    .status(wae.getResponse().getStatus())
                    .entity(entity)
                    .header(HttpHeaders.CONTENT_TYPE, APPLICATION_JSON_TYPE)
                    .build());
        }
    }

    @Override
    public void close() {
        client.close();
    }
}
So what is important here is:
- we build a WebTarget on the search endpoint
- we pass our request to the post method and ask for the response to be deserialized as a JsonObject
- don't forget to close the client: depending on the transport, it can be very important in order not to leak threads, etc.
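To visualize the first point, here is a rough JDK-only sketch of the URL the WebTarget ends up calling once the templates and query parameters are resolved. SearchUrlSketch and its methods are hypothetical names used for illustration; in the real client JAX-RS does this work (including the encoding) for you.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, only here to show the shape of the resolved endpoint.
final class SearchUrlSketch {
    // Builds {base}/{index}/{type}/_search?from=..&size=..
    static URI searchUri(final String base, final String index, final String type,
                         final long from, final long size) {
        final String path = "/" + encode(index) + "/" + encode(type) + "/_search";
        return URI.create(base + path + "?from=" + from + "&size=" + size);
    }

    // URLEncoder targets form encoding, so '+' is turned back into %20 for path segments.
    private static String encode(final String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }
}
```

For instance, searchUri("http://localhost:9200", "blog", "post", 0, 10) resolves to http://localhost:9200/blog/post/_search?from=0&size=10.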
TIP: if you implement the bulk endpoint, don't forget to parse the response even in the HTTP 200 case, since it can still contain errors. This can look like:
final JsonObject result = ....;
if (result.containsKey("errors") && result.getBoolean("errors")) {
throw new MyElasticsearchBulkException(result);
}
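If you want to know which items failed rather than just failing the whole batch, the bulk response lists one entry per action in its "items" array, each keyed by the action name and carrying a "status" plus an "error" object on failure. Here is a minimal sketch, assuming the response has already been bound to plain java.util collections by your JSON mapper; BulkErrorsSketch and failedItems are hypothetical names, and with JSON-P you would walk the JsonObject the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper working on an already parsed bulk response.
final class BulkErrorsSketch {
    // Returns one "<action> status=<code>" entry per failed item, in response order.
    @SuppressWarnings("unchecked")
    static List<String> failedItems(final Map<String, Object> bulkResponse) {
        final List<String> failures = new ArrayList<>();
        if (!Boolean.TRUE.equals(bulkResponse.get("errors"))) {
            return failures; // fast path: the global flag says nothing failed
        }
        final List<Map<String, Object>> items =
                (List<Map<String, Object>>) bulkResponse.get("items");
        for (final Map<String, Object> item : items) {
            // each item has a single key named after the action (index, create, update, delete)
            for (final Map.Entry<String, Object> action : item.entrySet()) {
                final Map<String, Object> result = (Map<String, Object>) action.getValue();
                if (result.containsKey("error")) {
                    failures.add(action.getKey() + " status=" + result.get("status"));
                }
            }
        }
        return failures;
    }
}
```

The resulting list can then be attached to the exception thrown in the tip above, or used to retry only the failed documents.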
Finally, the interesting part is the usage: this API lets you use whatever Elasticsearch query format you want in a smooth way:
// can be a field in the client, avoid creating a new one for each request
final JsonBuilderFactory factory = Json.createBuilderFactory(config);
// base is the Elasticsearch URL expected by the implementation above
final JsonObject searched = client.search(base, index, type,
        // the builder API can express whatever request format you need
        factory.createObjectBuilder()
                .add("query", factory.createObjectBuilder()
                        .add("term", factory.createObjectBuilder()
                                .add("name", "blog")))
                .build(), 0, 10);
// and parsing the response is as easy as building it
final int total = searched.getJsonObject("hits").getInt("total");
// here you can use streams too ;)
final JsonObject firstResult = searched.getJsonObject("hits").getJsonArray("hits")
.getJsonObject(0).getJsonObject("_source");
// business field access
final String displayName = firstResult.getString("displayName");
Of course, we may miss some binding (as with JPA, for instance) for the _source value, but it is easy to add with any JSON mapper if needed (Johnzon has the advantage of being built on top of JSON-P, but a JSON-B module is also an option). Anyway, the central point of this proposal is that the query is really flexible. If you already use Elasticsearch, you know that trying to strongly type the query is very limiting and does not stay usable for long, so this solution is quite a good compromise in my opinion.