How to create and use indexes in DynamoDB

Let's learn how to create and use indexes in DynamoDB.

Table of Contents
  1. Creating the index
    1. On new table creation
    2. Waiting for a table
  2. Updating an existing table
    1. Using the AWS CLI
    2. Using aioboto3
  3. Using Indexes
  4. The end

Creating indexes in DynamoDB allows you to index your data by one or more attributes, enabling new access patterns. Let's assume that you are storing books in your DynamoDB table. You may have your table set up like this:

  • Partition key: BookTitle
  • Sort key: Publisher

Each book may have different attributes:

  • Author (string)
  • Pages (integer)
  • Language (string)
  • PublicationDate (string)
  • ISBN10 (integer)

The table's own keys only let you query by BookTitle (and Publisher). If you want to find every book that a particular author wrote, you can create an index on the Author attribute.
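To make the idea concrete, here is a minimal in-memory sketch of what an index on Author gives you. It does not talk to DynamoDB; the sample books and the helper name build_author_index are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample items; titles, publishers and authors are invented.
books = [
    {"BookTitle": "Book A", "Publisher": "Pub 1", "Author": "Alice"},
    {"BookTitle": "Book B", "Publisher": "Pub 2", "Author": "Bob"},
    {"BookTitle": "Book C", "Publisher": "Pub 1", "Author": "Alice"},
]


def build_author_index(items):
    """Group items by Author, similar to what an index keyed on Author does."""
    index = defaultdict(list)
    for item in items:
        # Items that lack the index key attribute do not appear in the index.
        if "Author" in item:
            index[item["Author"]].append(item)
    return dict(index)


author_index = build_author_index(books)
titles_by_alice = [book["BookTitle"] for book in author_index["Alice"]]
print(titles_by_alice)
```

Looking up author_index["Alice"] is the access pattern the real index enables: one lookup by Author instead of a scan over every book.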

Let's see how you can do that. This article uses the Python library aioboto3.

Creating the index

An index has its own key schema. You can create one by specifying a single attribute as the partition key, or two attributes, one for the partition key and one for the sort key.
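As a small sketch of those two shapes, the helper below builds the KeySchema list in the format that DynamoDB's API expects. The function name key_schema is hypothetical, not part of aioboto3.

```python
def key_schema(partition_key, sort_key=None):
    """Build a KeySchema list: HASH for the partition key, RANGE for the sort key."""
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


# Partition key only.
author_only = key_schema("Author")
# Partition key plus sort key.
author_and_publisher = key_schema("Author", "Publisher")
print(author_only)
print(author_and_publisher)
```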

On new table creation

If you are starting with a new table, you can create your index alongside it. The only requirement is to define the attributes you want to use in your index keys. Note that, for the index just like for the table, the partition key must have the KeyType HASH.

For example:

import aioboto3


async def create_table_and_index():
    """Create DynamoDB table with index."""
    session = aioboto3.Session()
    async with session.resource("dynamodb") as dynamodb:
        table = await dynamodb.create_table(
            TableName="Books",
            KeySchema=[
                {"AttributeName": "BookTitle", "KeyType": "HASH"},  # Partition key
                {"AttributeName": "Publisher", "KeyType": "RANGE"},  # Sort key
            ],
            AttributeDefinitions=[
                # Table key attributes.
                {"AttributeName": "BookTitle", "AttributeType": "S"},
                {"AttributeName": "Publisher", "AttributeType": "S"},
                # Index key attributes.
                {"AttributeName": "Author", "AttributeType": "S"},
            ],
            GlobalSecondaryIndexes=[
                {
                    "IndexName": "author-index",
                    "KeySchema": [
                        {"AttributeName": "Author", "KeyType": "HASH"},
                        {"AttributeName": "Publisher", "KeyType": "RANGE"},
                    ],
                    "Projection": {
                        "ProjectionType": "ALL",
                    },
                    # A provisioned table needs throughput for each index too.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 10,
                        "WriteCapacityUnits": 10,
                    },
                }
            ],
            # Required on table creation with provisioned capacity
            # (the default billing mode).
            ProvisionedThroughput={
                "ReadCapacityUnits": 10,
                "WriteCapacityUnits": 10,
            },
        )
        # Block until DynamoDB reports the table as created.
        await table.wait_until_exists()

Let's look at what we are doing:

  • Key Schema: These are our partition and sort keys. The partition key must have the KeyType HASH, and the sort key RANGE.
  • Attribute Definitions: Here, we declare the name and type of every attribute that is used as a key, either by the table or by an index. Non-key attributes such as Pages or Language are not declared here; they are simply stored with each item. In this example, we add Author only because the index uses it.
  • Global Secondary Indexes: This is a list of secondary indexes to be created. We need to give each index a name, a key schema and a projection.

For the index's KeySchema, we included Author as the partition key and Publisher as the sort key, allowing us to query the index by Author to get all of an author's books, or by Author and a specific Publisher.
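DynamoDB expects the attribute definitions to line up with the attributes used in the table and index key schemas, and a mismatch is an easy mistake to make. The sketch below is a local sanity check you could run before calling create_table. The helper names are hypothetical, and the example input deliberately contains one missing and one unused definition.

```python
def key_attribute_names(table_key_schema, indexes):
    """Collect every attribute name used as a key by the table or its indexes."""
    names = {key["AttributeName"] for key in table_key_schema}
    for index in indexes:
        names.update(key["AttributeName"] for key in index["KeySchema"])
    return names


def check_attribute_definitions(attribute_definitions, table_key_schema, indexes):
    """Return (missing, unused) attribute names, each as a sorted list."""
    defined = {attr["AttributeName"] for attr in attribute_definitions}
    needed = key_attribute_names(table_key_schema, indexes)
    return sorted(needed - defined), sorted(defined - needed)


table_keys = [
    {"AttributeName": "BookTitle", "KeyType": "HASH"},
    {"AttributeName": "Publisher", "KeyType": "RANGE"},
]
indexes = [
    {
        "IndexName": "author-index",
        "KeySchema": [
            {"AttributeName": "Author", "KeyType": "HASH"},
            {"AttributeName": "Publisher", "KeyType": "RANGE"},
        ],
    }
]
# Deliberately wrong: Author is not defined, and Pages is not a key anywhere.
definitions = [
    {"AttributeName": "BookTitle", "AttributeType": "S"},
    {"AttributeName": "Publisher", "AttributeType": "S"},
    {"AttributeName": "Pages", "AttributeType": "N"},
]

missing, unused = check_attribute_definitions(definitions, table_keys, indexes)
print("missing:", missing)
print("unused:", unused)
```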

For the ProjectionType we chose ALL, which copies all the attributes into the index. Note that by doing this, you will consume roughly double the storage, but it means that every attribute is available when you query the index. You can also choose which attributes to copy over with INCLUDE, or copy only the key attributes with KEYS_ONLY.
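The three projection types can be sketched with a small helper that builds the Projection dictionary. The function name projection is hypothetical; the ProjectionType values and the NonKeyAttributes field are the ones DynamoDB's API uses.

```python
VALID_PROJECTION_TYPES = {"ALL", "KEYS_ONLY", "INCLUDE"}


def projection(projection_type="ALL", non_key_attributes=None):
    """Build the Projection part of a secondary index definition."""
    if projection_type not in VALID_PROJECTION_TYPES:
        raise ValueError(f"unknown projection type: {projection_type}")
    if projection_type == "INCLUDE":
        if not non_key_attributes:
            raise ValueError("INCLUDE needs at least one attribute name")
        return {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(non_key_attributes),
        }
    if non_key_attributes:
        raise ValueError("NonKeyAttributes only applies to INCLUDE")
    return {"ProjectionType": projection_type}


copy_everything = projection("ALL")
keys_only = projection("KEYS_ONLY")
some_attributes = projection("INCLUDE", ["Pages", "Language"])
print(some_attributes)
```

A narrower projection keeps the index smaller, at the cost of the index not returning the attributes that were left out.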

You may want to look at the boto3 - create_table documentation for a full description of all the things you can specify when creating your table and indexes.

Waiting for a table

You need to wait for the table to be created before you can use it.

await table.wait_until_exists()

This is why we call wait_until_exists. This method calls describe_table every 20 seconds until either the table is created or describe_table has been called 25 times (roughly 8 minutes). An error is raised if the table hasn't been created by then.
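The polling behaviour described above can be sketched as a plain loop. This is not the library's implementation, only an illustration under the stated numbers (20 second delay, 25 attempts). The names wait_until and FakeTableCheck are hypothetical, and the fake check stands in for a describe_table call.

```python
import asyncio


async def wait_until(check, delay=20, max_attempts=25):
    """Await `check` until it returns True, at most `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        if await check():
            return attempt
        # Sleep between attempts only, not after the last one.
        if attempt < max_attempts:
            await asyncio.sleep(delay)
    raise TimeoutError(f"gave up after {max_attempts} attempts")


class FakeTableCheck:
    """Stand-in for describe_table: reports success on the Nth call."""

    def __init__(self, ready_on_call):
        self.calls = 0
        self.ready_on_call = ready_on_call

    async def __call__(self):
        self.calls += 1
        return self.calls >= self.ready_on_call


# delay=0 keeps the demonstration instant.
attempts_used = asyncio.run(wait_until(FakeTableCheck(ready_on_call=3), delay=0))
print(attempts_used)

# In this sketch, the longest wait with the default settings is 24 sleeps.
worst_case_seconds = 20 * (25 - 1)
print(worst_case_seconds / 60, "minutes")
```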

Updating an existing table

You can create an index on an existing table as well. There are a few ways to do it:

  • Through the AWS console
  • Using the AWS CLI
  • Through another aioboto3 call

Using the AWS CLI

If you have the AWS CLI installed on your machine, you can run the following command to create an index on your table:

aws dynamodb update-table \
  --table-name Books \
  --attribute-definitions AttributeName=Author,AttributeType=S \
  --global-secondary-index-updates \
  "[{\"Create\":{\"IndexName\": \"author-index\",\"KeySchema\":[{\"AttributeName\":\"Author\",\"KeyType\":\"HASH\"}], \
  \"ProvisionedThroughput\": {\"ReadCapacityUnits\": 10, \"WriteCapacityUnits\": 5 },\"Projection\":{\"ProjectionType\":\"ALL\"}}}]"
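Escaping the JSON by hand is easy to get wrong. One option, sketched here, is to let Python produce the JSON and the shell quoting for the same command. The variable names are made up for illustration; json.dumps and shlex.join are standard library functions.

```python
import json
import shlex

# The same index definition as in the command above, as plain Python data.
index_update = [
    {
        "Create": {
            "IndexName": "author-index",
            "KeySchema": [{"AttributeName": "Author", "KeyType": "HASH"}],
            "ProvisionedThroughput": {
                "ReadCapacityUnits": 10,
                "WriteCapacityUnits": 5,
            },
            "Projection": {"ProjectionType": "ALL"},
        }
    }
]

payload = json.dumps(index_update)
command = [
    "aws", "dynamodb", "update-table",
    "--table-name", "Books",
    "--attribute-definitions", "AttributeName=Author,AttributeType=S",
    "--global-secondary-index-updates", payload,
]

# shlex.join quotes each argument for a POSIX shell.
print(shlex.join(command))
```

The printed line can be pasted into a POSIX shell, or the command list can be passed to subprocess.run without any quoting at all.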

Using aioboto3

If you prefer to use aioboto3 to update your table, you can use the following function.

import aioboto3


async def create_index():
    """Create DynamoDB index on the existing Books table."""
    session = aioboto3.Session()
    # update_table is a client operation, so we open a client here.
    async with session.client("dynamodb") as client:
        await client.update_table(
            TableName="Books",
            # Only the attributes used by the new index's keys are needed.
            AttributeDefinitions=[
                {"AttributeName": "Author", "AttributeType": "S"},
                {"AttributeName": "Publisher", "AttributeType": "S"},
            ],
            GlobalSecondaryIndexUpdates=[
                {
                    "Create": {
                        "IndexName": "author-index",
                        "KeySchema": [
                            {"AttributeName": "Author", "KeyType": "HASH"},
                            {"AttributeName": "Publisher", "KeyType": "RANGE"},
                        ],
                        "Projection": {
                            "ProjectionType": "ALL",
                        },
                        # Needed because the table uses provisioned capacity.
                        "ProvisionedThroughput": {
                            "ReadCapacityUnits": 10,
                            "WriteCapacityUnits": 5,
                        },
                    }
                }
            ],
        )

As you can see, the index definition is the same as the one we used when creating the table. The difference is that update_table takes it inside a Create action under GlobalSecondaryIndexUpdates. DynamoDB then builds the index in the background, so it may take a while before you can query it.
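An index added to an existing table is not usable immediately: DynamoDB reports its state through the IndexStatus field of describe_table, and it can be queried once that reads ACTIVE. The sketch below shows how that field could be read from a response. The helper name index_status is hypothetical, and the sample response is trimmed to the fields used here.

```python
def index_status(describe_table_response, index_name):
    """Return the IndexStatus of a global secondary index, or None if absent."""
    table = describe_table_response.get("Table", {})
    for index in table.get("GlobalSecondaryIndexes", []):
        if index.get("IndexName") == index_name:
            return index.get("IndexStatus")
    return None


# Trimmed, hand-written example of a describe_table response.
sample_response = {
    "Table": {
        "TableName": "Books",
        "TableStatus": "ACTIVE",
        "GlobalSecondaryIndexes": [
            {"IndexName": "author-index", "IndexStatus": "CREATING"},
        ],
    }
}

status = index_status(sample_response, "author-index")
print(status)
```

Calling describe_table every few seconds and stopping once this returns "ACTIVE" gives a simple way to wait for the index.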

Using Indexes

Now that we have seen how to create indexes to query our data by a specific attribute, let's see how we can query the index. You can use the query method and specify which index to query by passing the index name.

import aioboto3


async def get_author(author_name: str):
    """Get an author's books by using the index."""
    session = aioboto3.Session()
    async with session.resource("dynamodb") as dynamodb:
        # Get a handle on the table that owns the index.
        table = await dynamodb.Table("Books")
        result = await table.query(
            IndexName="author-index",
            ExpressionAttributeValues={
                ":author": author_name,
            },
            KeyConditionExpression="Author = :author",
        )
        # The matching books are returned under the "Items" key.
        return result["Items"]

In this example, we define a placeholder, :author, under ExpressionAttributeValues. The leading : marks it as a placeholder, and we could call it anything we wanted, for example, :author_name.

Then, under KeyConditionExpression, we are saying: return the items whose Author equals the value of :author.
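The same placeholder mechanism extends to the sort key. As a sketch, the helper below builds the query arguments for either "all books by an author" or "books by an author from one publisher". The name author_query is hypothetical, and the second form assumes the index was created with Publisher as its sort key.

```python
def author_query(author, publisher=None):
    """Build keyword arguments for table.query against author-index."""
    params = {
        "IndexName": "author-index",
        "KeyConditionExpression": "Author = :author",
        "ExpressionAttributeValues": {":author": author},
    }
    if publisher is not None:
        # Narrow the result with a condition on the index's sort key.
        params["KeyConditionExpression"] += " AND Publisher = :publisher"
        params["ExpressionAttributeValues"][":publisher"] = publisher
    return params


by_author = author_query("Some Author")
by_author_and_publisher = author_query("Some Author", "Some Publisher")
print(by_author_and_publisher["KeyConditionExpression"])
```

The result can be passed straight to the query method, for example await table.query(**author_query("Some Author")).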

The end

At first, when working with DynamoDB, indexes seemed difficult to understand - mostly because I didn't believe it would be this easy to create and query one.

I hope this article is helpful for you and that you can use it in your DynamoDB journey!

