Version: 2.2.6

Searching with Data Types

Although Riak Data Types function differently from other Riak objects in some respects, when you're using Search you can think of them as normal Riak objects with special metadata attached (metadata that you don't need to worry about as a user). Riak's counters, sets, and maps can be indexed and have their contents searched just like other Riak objects.

Data Type MIME Types

Like all objects stored in Riak, Riak Data Types are assigned content types. Unlike other Riak objects, this happens automatically. When you store, say, a counter in Riak, it will automatically be assigned the type application/riak_counter. The table below provides the full list of content types:

Data Type | Content Type
Counters | application/riak_counter
Sets | application/riak_set
Maps | application/riak_map

When using Search, you won't need to worry about this, as Riak Data Types are automatically indexed on the basis of these content types.

Data Type Schemas

There are two types of schemas related to Riak Data Types:

  • Top-level schemas relate to Data Types that are stored at the key level (counters and sets)
  • Embedded schemas relate to Data Types nested inside of maps (flags, counters, registers, and sets)

As the default Search schema shows, each Data Type except maps has its own default schema. The _yz_default schema therefore indexes Data Types automatically on the basis of their assigned content type, so no extra work is involved in indexing them. You can simply store them and begin querying, provided that their bucket type is associated with a Search index, which is covered in the examples section below.

As mentioned above, there are no default schemas available for maps. This is because maps are essentially carriers for the other Data Types. Even when maps are embedded within other maps, all of the data that you might wish to index and search is contained in counters, sets, registers, and flags.

The sections immediately below provide the default schemas for each Riak Data Type. Because you will not need to manipulate these default schemas to search Data Types, they are provided only for reference.

Top-level Schemas

The default schema for counters indexes each counter as an integer.

<field name="counter" type="int" indexed="true" stored="true" multiValued="false" />

Constructing queries for counters involves prefacing the query with counter. Below are some examples:

Query | Syntax
Counters with a value of 10 or more | counter:[10 TO *]
Counters with a value of 10 or less, or of 50 or more | counter:[* TO 10] OR counter:[50 TO *]
Counters with a value of 15 | counter:15
All counters within the index | counter:*
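
The counter queries in the table above are plain strings, so they can be assembled with a small helper. The sketch below is only an illustration: the class CounterQueries and its methods are hypothetical names, not part of the Riak Java client. It assumes non-negative counter values and relies on square brackets denoting an inclusive range.

```java
// Hypothetical helper for building counter queries; not part of any Riak client.
final class CounterQueries {
    private CounterQueries() {}

    /** Builds an inclusive range query; a null bound becomes the open-ended "*". */
    static String range(Integer min, Integer max) {
        String lower = (min == null) ? "*" : min.toString();
        String upper = (max == null) ? "*" : max.toString();
        return "counter:[" + lower + " TO " + upper + "]";
    }

    /** Matches counters whose value equals the given number (assumed non-negative). */
    static String exactly(int value) {
        return "counter:" + value;
    }

    /** Combines two sub-queries so that a counter matching either one is returned. */
    static String either(String left, String right) {
        return left + " OR " + right;
    }
}
```

For example, CounterQueries.range(10, null) yields counter:[10 TO *]. Two ranges that do not overlap have to be combined with either, because no single counter can satisfy both at once.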

The schema for sets indexes each element of a set as a string and indexes the set itself as multi-valued.

<field name="set" type="string" indexed="true" stored="false" multiValued="true" />

To query sets, preface the query with set. The table below shows some examples:

Query | Syntax
Sets that contain the value apple | set:apple
Sets that contain an item beginning with level | set:level*
Sets that contain both apple and orange | set:apple AND set:orange
All sets within the index | set:*
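
As a rough sketch, the set queries above could be generated like this. SetQueries is a hypothetical helper, not a client API, and it assumes that elements are single words without whitespace or query-syntax characters, since no escaping is attempted.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper for building set queries; not part of any Riak client.
final class SetQueries {
    private SetQueries() {}

    /** Matches sets that contain the given element exactly. */
    static String contains(String element) {
        return "set:" + element;
    }

    /** Matches sets with at least one element beginning with the prefix. */
    static String startsWith(String prefix) {
        return "set:" + prefix + "*";
    }

    /** Matches sets that contain every one of the given elements. */
    static String containsAll(String... elements) {
        return Arrays.stream(elements)
                .map(SetQueries::contains)
                .collect(Collectors.joining(" AND "));
    }
}
```

Calling SetQueries.containsAll("apple", "orange") produces set:apple AND set:orange, matching the third row of the table.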

Embedded Schemas

For searching within maps, there are four schemas for embedded, aka dynamic, fields. Flags are indexed as booleans:

<dynamicField name="*_flag" type="boolean" indexed="true" stored="true" multiValued="false" />

Counters, like their top-level counterparts, are indexed as integers:

<dynamicField name="*_counter" type="int" indexed="true" stored="true" multiValued="false" />

Registers are indexed as strings, but unlike sets they are not multi-valued.

<dynamicField name="*_register" type="string" indexed="true" stored="true" multiValued="false" />

Finally, sets at the embedded level are indexed as multi-valued strings.

<dynamicField name="*_set" type="string" indexed="true" stored="true" multiValued="true" />

To query embedded fields, you must provide the name of the field. The table below provides some examples:

Query | Syntax
Maps containing a set called hobbies | hobbies_set:*
Maps containing a score counter over 50 | score_counter:[50 TO *]
Maps containing disabled advanced flags | advanced_flag:false
Maps containing enabled advanced flags and score counters under 10 | advanced_flag:true AND score_counter:[* TO 10]
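
The indexed name of an embedded field is the name used in the map followed by an underscore and a type suffix, mirroring the four dynamicField patterns shown earlier. A minimal sketch of that naming rule follows; the enum EmbeddedType is a made-up name used for illustration only.

```java
// Hypothetical enum mirroring the *_flag, *_counter, *_register and *_set patterns.
enum EmbeddedType {
    FLAG("flag"),
    COUNTER("counter"),
    REGISTER("register"),
    SET("set");

    private final String suffix;

    EmbeddedType(String suffix) {
        this.suffix = suffix;
    }

    /** Returns the Search field name for a field stored under mapFieldName. */
    String fieldFor(String mapFieldName) {
        return mapFieldName + "_" + suffix;
    }
}
```

With this, EmbeddedType.COUNTER.fieldFor("score") + ":[50 TO *]" gives the second query in the table.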

You can also query maps within maps, which is covered in the Searching Maps Within Maps section below.

Data Types and Search Examples

In this section, we'll start with two simple examples, one involving counters and the other involving sets. Later on, we'll introduce a slightly more complex map example.

Counters Example

Let's say that we're storing scores in a multiplayer online game in Riak. The game is called Boulderdash and it involves smashing digital boulders armed with nothing but witty retorts and arcane trivia knowledge. We'll create and activate a bucket type for storing counters simply called counters, like so:

riak-admin bucket-type create counters '{"props":{"datatype":"counter"}}'
riak-admin bucket-type activate counters

Now, we'll create a search index called scores that uses the default schema (as in some of the examples above):

YokozunaIndex scoresIndex = new YokozunaIndex("scores", "_yz_default");
StoreIndex storeIndex = new StoreIndex.Builder(scoresIndex)
.build();
client.execute(storeIndex);

Now, we can modify our counters bucket type to associate that bucket type with our scores index:

riak-admin bucket-type update counters '{"props":{"search_index":"scores"}}'

At this point, all of the counters that we store in any bucket with the bucket type counters will be indexed in our scores index. So let's start playing with some counters. All counters will be stored in the bucket people, while the key for each counter will be the username of each player:

Namespace peopleBucket = new Namespace("counters", "people");

// Distinct variable names let both updates live in the same scope
Location christopherHitchensCounter = new Location(peopleBucket, "christ_hitchens");
CounterUpdate cu1 = new CounterUpdate(10);
UpdateCounter update1 = new UpdateCounter.Builder(christopherHitchensCounter, cu1)
        .build();
client.execute(update1);

Location joanRiversCounter = new Location(peopleBucket, "joan_rivers");
CounterUpdate cu2 = new CounterUpdate(25);
UpdateCounter update2 = new UpdateCounter.Builder(joanRiversCounter, cu2)
        .build();
client.execute(update2);

So now we have two counters, one with a value of 10 and the other with a value of 25. Let's query to see how many counters have a value greater than 20, just to be sure:

String index = "scores";
String query = "counter:[20 TO *]";
SearchOperation searchOp = new SearchOperation.Builder(BinaryValue.create(index), query)
.build();
cluster.execute(searchOp);
SearchOperation.Response results = searchOp.get();

And there we are: only one of our two stored counters has a value over 20. To find out which counter that is, we can dig into our results:

// Using the "results" object from above:
int numberFound = results.numResults();
Map<String, List<String>> foundObject = results.getAllResults().get(0);
String key = foundObject.get("_yz_rk").get(0); // "joan_rivers"
String bucket = foundObject.get("_yz_rb").get(0); // "people"
String bucketType = foundObject.get("_yz_rt").get(0); // "counters"
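
Each result document is a map from field name to a list of values, so the three location fields can be pulled out in one place. The sketch below uses only the standard library; FoundLocation is a hypothetical class, and the only names taken from the example above are the _yz_rt, _yz_rb and _yz_rk fields.

```java
import java.util.List;
import java.util.Map;

// Hypothetical value class holding where a matching object lives.
final class FoundLocation {
    final String bucketType;
    final String bucket;
    final String key;

    private FoundLocation(String bucketType, String bucket, String key) {
        this.bucketType = bucketType;
        this.bucket = bucket;
        this.key = key;
    }

    /** Reads the bucket type, bucket and key fields from one result document. */
    static FoundLocation fromResult(Map<String, List<String>> doc) {
        return new FoundLocation(
                first(doc, "_yz_rt"), first(doc, "_yz_rb"), first(doc, "_yz_rk"));
    }

    /** Returns the first value of a field, failing loudly if it is absent. */
    private static String first(Map<String, List<String>> doc, String field) {
        List<String> values = doc.get(field);
        if (values == null || values.isEmpty()) {
            throw new IllegalArgumentException("missing field: " + field);
        }
        return values.get(0);
    }

    @Override
    public String toString() {
        return bucketType + "/" + bucket + "/" + key;
    }
}
```

Given the foundObject map from the snippet above, FoundLocation.fromResult(foundObject) would carry the same three strings as the key, bucket and bucketType variables.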

Alternatively, we can see how many counters have values below 15:

String index = "scores";
String query = "counter:[* TO 15]";
// Reuse the variables declared above instead of repeating the literals
SearchOperation searchOp = new SearchOperation
        .Builder(BinaryValue.create(index), query)
        .build();
cluster.execute(searchOp);
SearchOperation.Response results = searchOp.get();

Or we can see how many counters have a value of 17 exactly:

// Using the same method as above, just changing the query:
String query = "counter:17";

Sets Example

Let's say that we're storing information about the hobbies of a group of people in sets. We'll create and activate a bucket type for storing sets simply called sets, like so:

riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'
riak-admin bucket-type activate sets

Now, we'll create a Search index called hobbies that uses the default schema (as in some of the examples above):

YokozunaIndex hobbiesIndex = new YokozunaIndex("hobbies");
StoreIndex storeIndex =
new StoreIndex.Builder(hobbiesIndex).build();
client.execute(storeIndex);

Now, we can modify our sets bucket type to associate that bucket type with our hobbies index:

riak-admin bucket-type update sets '{"props":{"search_index":"hobbies"}}'

Now, all of the sets that we store in any bucket with the bucket type sets will be automatically indexed as a set. So let's say that we store two sets for two different people describing their respective hobbies, in the bucket people:

Namespace peopleBucket = new Namespace("sets", "people");

Location mikeDitkaSet = new Location(peopleBucket, "ditka");
SetUpdate su1 = new SetUpdate()
.add("football")
.add("winning");
UpdateSet update1 = new UpdateSet.Builder(mikeDitkaSet, su1).build();

Location ronnieJamesDioSet = new Location(peopleBucket, "dio");
SetUpdate su2 = new SetUpdate()
.add("wailing")
.add("rocking")
.add("winning");
UpdateSet update2 = new UpdateSet.Builder(ronnieJamesDioSet, su2).build();

client.execute(update1);
client.execute(update2);

Now, we can query our hobbies index to see if anyone has the hobby football:

// Using the same method explained above, just changing the query:
String query = "set:football";

Let's see how many sets contain the element football:

// Using the same method explained above for getting search results:
int numberFound = results.numResults(); // 1

Success! We stored two sets, only one of which contains the element football. Now, let's see how many sets contain the element winning:

// Using the same method explained above, just changing the query:
String query = "set:winning";

// Again using the same method from above:
int numberFound = results.numResults(); // 2

Just as expected, both sets we stored contain the element winning.
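
To make the expected counts easy to check without a running cluster, here is a purely local model of the two sets. It does not talk to Riak or Solr. LocalSetSearch is a hypothetical class that only mimics the matching rule for set:football and set:winning, where a set matches when one of its elements equals the queried value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Local stand-in for the hobbies index; illustrates the matching rule only.
final class LocalSetSearch {
    private LocalSetSearch() {}

    /** The two sets stored in the example, keyed by their Riak keys. */
    static Map<String, Set<String>> exampleData() {
        Map<String, Set<String>> sets = new LinkedHashMap<>();
        sets.put("ditka", Set.of("football", "winning"));
        sets.put("dio", Set.of("wailing", "rocking", "winning"));
        return sets;
    }

    /** Counts the sets that contain an element equal to the given value. */
    static long countContaining(Map<String, Set<String>> setsByKey, String element) {
        return setsByKey.values().stream()
                .filter(elements -> elements.contains(element))
                .count();
    }
}
```

Counting football over the example data gives 1 and counting winning gives 2, in line with the results above.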

Maps Example

This example will build on the example in the Using Data Types tutorial. That tutorial walks you through storing CMS-style user data in Riak maps, and we'd suggest that you familiarize yourself with that tutorial first. More specifically, user data is stored in the following fields in each user's map:

  • first name in a first_name register
  • last name in a last_name register
  • whether the user is an enterprise customer in an enterprise_customer flag
  • the number of times the user has visited the company page in a page_visits counter
  • a list of the user's interests in an interests set

First, let's create and activate a bucket type simply called maps that is set up to store Riak maps:

riak-admin bucket-type create maps '{"props":{"datatype":"map"}}'
riak-admin bucket-type activate maps

Now, let's create a search index called customers using the default schema:

YokozunaIndex customersIndex = new YokozunaIndex("customers", "_yz_default");
StoreIndex storeIndex =
new StoreIndex.Builder(customersIndex).build();
client.execute(storeIndex);

With our index created, we can associate our new customers index with our maps bucket type:

riak-admin bucket-type update maps '{"props":{"search_index":"customers"}}'

Now we can create some maps along the lines suggested above:

Namespace customersBucket = new Namespace("maps", "customers");

Location idrisElbaMap = new Location(customersBucket, "idris_elba");
// Named mu1 to match the UpdateMap builder call further down
MapUpdate mu1 = new MapUpdate()
        .update("first_name", new RegisterUpdate("Idris"))
        .update("last_name", new RegisterUpdate("Elba"))
        .update("enterprise_customer", new FlagUpdate(false))
        .update("page_visits", new CounterUpdate(10))
        .update("interests", new SetUpdate().add("acting").add("being Stringer Bell"));

Location joanJettMap = new Location(customersBucket, "joan_jett");
MapUpdate mu2 = new MapUpdate()
.update("first_name", new RegisterUpdate("Joan"))
.update("last_name", new RegisterUpdate("Jett"))
// Joan Jett is not an enterprise customer, so we don't need to
// explicitly disable the "enterprise_customer" flag, as all
// flags are disabled by default
.update("page_visits", new CounterUpdate(25))
.update("interests", new SetUpdate().add("loving rock and roll").add("being in the Blackhearts"));

UpdateMap update1 = new UpdateMap.Builder(idrisElbaMap, mu1).build();
UpdateMap update2 = new UpdateMap.Builder(joanJettMap, mu2).build();
client.execute(update1);
client.execute(update2);

Searching Counters Within Maps

We now have two maps stored in Riak that we can query. Let's query to see how many users have page visit counters above 15. Unlike the counters example above, we have to specify which counter we're querying:

// Using the same method explained above, just changing the query:
String query = "page_visits_counter:[15 TO *]";

// Again using the same method from above:
int numberFound = results.numResults(); // 1

As expected, one of our two stored maps has a page_visits counter above 15. Let's make sure that we have the right result:

// Using the same method from above:
String query = "page_visits_counter:[15 TO *]";

// Again using the same method from above:
String registerValue =
results.getAllResults().get(0).get("first_name_register").get(0); // Joan
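
The chained get(0).get(...).get(0) call above throws an IndexOutOfBoundsException when the query returns no documents and a NullPointerException when the document has no entry for the requested field. A more defensive read might look like the sketch below; ResultFields is a hypothetical helper that works on the same Map<String, List<String>> shape.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for reading fields from a single result document.
final class ResultFields {
    private ResultFields() {}

    /** Returns the first value of the field, or empty if the field is absent. */
    static Optional<String> firstValue(Map<String, List<String>> doc, String field) {
        List<String> values = doc.get(field);
        if (values == null || values.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(values.get(0));
    }
}
```

A caller could then write ResultFields.firstValue(doc, "first_name_register").orElse("unknown") and decide explicitly what a missing field should mean.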

Success! Now we can test out searching sets.

Searching Sets Within Maps

Each of the maps we stored thus far had an interests set. First, let's see how many of our maps even have sets called interests using a wildcard query:

// Using the same method from above:
String query = "interests_set:*";

As expected, both stored maps have an interests set. Now let's see how many maps have items in interests sets that begin with loving:

// Using the same method from above:
String query = "interests_set:loving*";

// Again using the same method from above:
int numberFound = results.numResults(); // 1
String registerValue =
results.getAllResults().get(0).get("first_name_register").get(0); // Joan

As expected, only our Joan Jett map has an item in its interests set that starts with loving.

Searching Maps Within Maps

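Before we can try to search maps within maps, we need to actually store some. Let's add an alter_ego map to both of the maps we've stored thus far. Each person's alter ego will have a name only. The update for the idris_elba map is shown here; the joan_jett map needs an equivalent update with an alter ego name ending in Plant, which the queries later in this section rely on.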

Location idrisElbaMap = new Location(customersBucket, "idris_elba");
MapUpdate alterEgoUpdateName = new MapUpdate()
.update("name", new RegisterUpdate("John Luther"));
MapUpdate alterEgoUpdate = new MapUpdate()
.update("alter_ego", alterEgoUpdateName);
UpdateMap addSubMap = new UpdateMap.Builder(idrisElbaMap, alterEgoUpdate).build();
client.execute(addSubMap);

Querying maps within maps involves constructing queries that separate the different levels of depth with a single dot. Here's an example query for finding maps that have a name register embedded within an alter_ego map:

// Using the same method from above:
String query = "alter_ego_map.name_register:*";

// Again using the same method from above:
int numberFound = results.numResults(); // 2
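
The field name in that query follows a regular pattern: each enclosing map contributes its name plus _map, levels are joined with a dot, and the innermost field ends in its type suffix. A minimal sketch of that pattern is below. NestedFields is a hypothetical helper, and the assumption that the pattern extends unchanged to more than one level of nesting is an extrapolation from the single-level example.

```java
import java.util.List;

// Hypothetical helper that assembles field names for maps nested inside maps.
final class NestedFields {
    private NestedFields() {}

    /**
     * Builds a field path from the enclosing map names (outermost first),
     * the field's own name and its type suffix, e.g. "register" or "counter".
     */
    static String path(List<String> enclosingMaps, String fieldName, String typeSuffix) {
        StringBuilder sb = new StringBuilder();
        for (String mapName : enclosingMaps) {
            sb.append(mapName).append("_map.");
        }
        return sb.append(fieldName).append('_').append(typeSuffix).toString();
    }
}
```

NestedFields.path(List.of("alter_ego"), "name", "register") returns alter_ego_map.name_register, the field used in the query above, while an empty list of enclosing maps yields an ordinary embedded field name such as page_visits_counter.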

Once we know how to query embedded fields like this, we can query those just like any other. Let's find out which maps have an alter_ego sub-map that contains a name register that ends with Plant, and display that customer's first name:

// Using the same method from above:
String query = "alter_ego_map.name_register:*Plant";

// Again using the same method from above:
int numberFound = results.numResults(); // 1
String registerValue =
results.getAllResults().get(0).get("first_name_register").get(0); // Joan

Success! We've now queried not just maps but also maps within maps.