Data Modeling

Warning

For an RDBMS, you can create a normalized data model without thinking about access patterns. You can then extend it later when new questions and query requirements arise. By contrast, in Amazon DynamoDB, you shouldn’t start designing your schema until you know the questions that it needs to answer. Understanding the business problems and the application use cases up front is absolutely essential.

DynamoDB Approach

As explained in the project overview, DynamoDB approaches data modeling differently from relational databases.

Access Patterns

In each DynamoDB table, data is organized for locality of expected access patterns. This gives you excellent performance if the structure of the data and access patterns agree and terrible performance otherwise.

To design a DynamoDB table that scales efficiently, you must first identify the access patterns required by the business logic.

Suppose we are building a music library with this entity model.

[Figure: Data Model]

This music library needs to support the following access patterns.

The Most Common Access Patterns
1	Load album and its tracks by album token
2	Load track by track token
3	List tracks in a playlist
4	List albums by artist name
5	Find album by album title
6	Find track by track title
...

Efficient Schema

An efficient schema that keeps related data in close proximity has a major impact on cost and performance. Instead of distributing related data items across multiple tables, you should keep related items in your table as close together as possible. Typically, this means storing related rows in the same table, and with the same partition key. Within a partition key, related rows should share a prefix of the sort key.
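To make the locality idea concrete, here is a minimal in-memory sketch in plain Java. It is not DynamoDB or Tempest code, and the class and method names are assumptions made for illustration. It models a table as partitions whose items stay sorted by sort key, so all items under one partition key that share a sort-key prefix can be read as one contiguous range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative in-memory model only; these names are hypothetical, not DynamoDB or Tempest APIs.
class SortKeyPrefixSketch {
  // partition key -> (sort key -> value), with sort keys kept in order.
  private final Map<String, NavigableMap<String, String>> partitions = new TreeMap<>();

  void put(String partitionKey, String sortKey, String value) {
    partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>()).put(sortKey, value);
  }

  // Returns the values of every item in one partition whose sort key starts with the prefix.
  List<String> queryByPrefix(String partitionKey, String prefix) {
    NavigableMap<String, String> items =
        partitions.getOrDefault(partitionKey, new TreeMap<>());
    List<String> result = new ArrayList<>();
    // tailMap jumps to the first sort key >= prefix; matching keys are contiguous from there.
    for (Map.Entry<String, String> entry : items.tailMap(prefix, true).entrySet()) {
      if (!entry.getKey().startsWith(prefix)) {
        break;
      }
      result.add(entry.getValue());
    }
    return result;
  }

  static SortKeyPrefixSketch sample() {
    SortKeyPrefixSketch table = new SortKeyPrefixSketch();
    table.put("ALBUM_1", "INFO", "The Dark Side of the Moon");
    table.put("ALBUM_1", "TRACK_1", "Speak to Me");
    table.put("ALBUM_1", "TRACK_2", "Breathe");
    table.put("ALBUM_2", "TRACK_1", "In the Flesh?");
    return table;
  }

  public static void main(String[] args) {
    // Reads all tracks of ALBUM_1 without touching any other partition.
    System.out.println(sample().queryByPrefix("ALBUM_1", "TRACK_"));
  }
}
```

The point of the sketch is that the lookup never scans unrelated partitions: the partition key narrows the search to one group, and the shared prefix narrows it to one sorted range inside that group.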

We can design an efficient schema using these patterns:

Denormalization: Schema flexibility lets DynamoDB store structured data, such as lists, sets, and nested objects, in a single item.
Composite key aggregation: Deliberate key design puts related entities close together.
Sort order: Related items can be grouped together and queried efficiently if their key design causes them to sort together.
Global secondary indexes: By creating specific global secondary indexes, you can enable different queries than your main table can support, and that are still fast and inexpensive.
Adjacency list: Adjacency lists are a design pattern that is useful for modeling many-to-many relationships. More generally, they provide a way to represent graph data (nodes and edges).

Primary Key		Attributes
partition_key	sort_key	Attributes
ALBUM_1	INFO	album_title	album_artist	release_date	genre
ALBUM_1	INFO	The Dark Side of the Moon	Pink Floyd	1973-03-01	Progressive rock
ALBUM_1	TRACK_1	track_title	run_length
ALBUM_1	TRACK_1	Speak to Me	PT1M13S
ALBUM_1	TRACK_2	track_title	run_length
ALBUM_1	TRACK_2	Breathe	PT2M43S
ALBUM_1	TRACK_3	track_title	run_length
ALBUM_1	TRACK_3	On the Run	PT3M36S
...
ALBUM_2	INFO	album_title	album_artist	release_date	genre
ALBUM_2	INFO	The Wall	Pink Floyd	1979-11-30	Progressive rock
ALBUM_2	TRACK_1	track_title	run_length
ALBUM_2	TRACK_1	In the Flesh?	PT3M20S
...
PLAYLIST_1	INFO	playlist_name	playlist_size	playlist_tracks	playlist_version
PLAYLIST_1	INFO	Psychedelic Rock Essentials	100	ALBUM_1/TRACK_1, ALBUM_1322/TRACK_9, ALBUM_3423/TRACK_3, ALBUM_84/TRACK_10, ALBUM_2/TRACK_5, ...	12
...

This table uses a composite primary key, (partition_key, sort_key), to identify each item.

The key ("ALBUM_1", "INFO") identifies ALBUM_1’s metadata.
The key ("ALBUM_1", "TRACK_1") identifies ALBUM_1’s first track.
The key ("PLAYLIST_1", "INFO") identifies PLAYLIST_1’s content.

It uses secondary indexes to answer additional queries.

GSI Primary Key		Projected Attributes
artist_name	partition_key	Projected Attributes
Pink Floyd	ALBUM_1	album_title	sort_key	release_date	genre
Pink Floyd	ALBUM_1	The Dark Side of the Moon	INFO	1973-03-01	Progressive rock
Pink Floyd	ALBUM_2	album_title	sort_key	release_date	genre
Pink Floyd	ALBUM_2	The Wall	INFO	1979-11-30	Progressive rock
...
The Beatles	ALBUM_232	album_title	sort_key	release_date	genre
The Beatles	ALBUM_232	Revolver	INFO	1966-06-21	Rock
...

This global secondary index groups AlbumInfo items by artist_name and sorts them by the primary index partition_key.
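The grouping and ordering described above can be pictured with a small in-memory sketch in plain Java. It is only an illustration under assumed names (`Album`, `buildIndex`, `titlesByArtist`); a real global secondary index is maintained by DynamoDB itself, not by application code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative in-memory model only; these names are hypothetical, not DynamoDB or Tempest APIs.
class ArtistIndexSketch {
  static final class Album {
    final String partitionKey;
    final String albumTitle;
    final String artistName;

    Album(String partitionKey, String albumTitle, String artistName) {
      this.partitionKey = partitionKey;
      this.albumTitle = albumTitle;
      this.artistName = artistName;
    }
  }

  // Builds artist_name -> (partition_key -> album), mirroring the index's key order.
  static Map<String, TreeMap<String, Album>> buildIndex(List<Album> albums) {
    Map<String, TreeMap<String, Album>> index = new TreeMap<>();
    for (Album album : albums) {
      index.computeIfAbsent(album.artistName, k -> new TreeMap<>())
          .put(album.partitionKey, album);
    }
    return index;
  }

  // Lists one artist's album titles in partition_key order.
  static List<String> titlesByArtist(List<Album> albums, String artistName) {
    List<String> titles = new ArrayList<>();
    TreeMap<String, Album> group =
        buildIndex(albums).getOrDefault(artistName, new TreeMap<>());
    for (Album album : group.values()) {
      titles.add(album.albumTitle);
    }
    return titles;
  }

  static List<Album> sampleAlbums() {
    return List.of(
        new Album("ALBUM_2", "The Wall", "Pink Floyd"),
        new Album("ALBUM_232", "Revolver", "The Beatles"),
        new Album("ALBUM_1", "The Dark Side of the Moon", "Pink Floyd"));
  }

  public static void main(String[] args) {
    System.out.println(titlesByArtist(sampleAlbums(), "Pink Floyd"));
  }
}
```

Note that the sort is lexicographic on the key string, so in this model ALBUM_232 would sort before ALBUM_3.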

Tempest

Tempest lets you define strongly typed data models on top of your DynamoDBMapper classes.

Kotlin - SDK 2.x | Java - SDK 2.x | Kotlin - SDK 1.x | Java - SDK 1.x

interface MusicDb : LogicalDb {
  @TableName("music_items")
   val music: MusicTable
}

interface MusicTable : LogicalTable<MusicItem> {
  val albumInfo: InlineView<AlbumInfo.Key, AlbumInfo>
  val albumTracks: InlineView<AlbumTrack.Key, AlbumTrack>

  val playlistInfo: InlineView<PlaylistInfo.Key, PlaylistInfo>

  // Global Secondary Indexes.
  val albumInfoByGenre: SecondaryIndex<AlbumInfo.GenreIndexOffset, AlbumInfo>
  val albumInfoByArtist: SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo>

  // Local Secondary Indexes.
  val albumTracksByTitle: SecondaryIndex<AlbumTrack.TitleIndexOffset, AlbumTrack>
}

@DynamoDbBean
class MusicItem {
  // Primary key.
  @get:DynamoDbPartitionKey
  @get:DynamoDbSecondarySortKey(indexNames = ["genre_album_index", "artist_album_index"])
  var partition_key: String? = null
  @get:DynamoDbSortKey
  var sort_key: String? = null
  // Attributes...
}

public interface MusicDb extends LogicalDb {
  @TableName("music_items")
  MusicTable music();
}

public interface MusicTable extends LogicalTable<MusicItem> {
  InlineView<AlbumInfo.Key, AlbumInfo> albumInfo();
  InlineView<AlbumTrack.Key, AlbumTrack> albumTracks();

  InlineView<PlaylistInfo.Key, PlaylistInfo> playlistInfo();

  // Global Secondary Indexes.
  SecondaryIndex<AlbumInfo.GenreIndexOffset, AlbumInfo> albumInfoByGenre();
  SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo> albumInfoByArtist();

  // Local Secondary Indexes.
  SecondaryIndex<AlbumTrack.TitleIndexOffset, AlbumTrack> albumTracksByTitle();
}

@DynamoDbBean
public class MusicItem {
  // All Items.
  String partition_key = null;
  String sort_key = null;
  // Attributes...

  @DynamoDbAttribute("partition_key")
  @DynamoDbPartitionKey
  @DynamoDbSecondarySortKey(indexNames = {"genre_album_index", "artist_album_index"})
  public String getPartitionKey() {
    return partition_key;
  }

  public void setPartitionKey(String partition_key) {
    this.partition_key = partition_key;
  }

  @DynamoDbAttribute("sort_key")
  @DynamoDbSortKey
  public String getSortKey() {
    return sort_key;
  }

  public void setSortKey(String sort_key) {
    this.sort_key = sort_key;
  }
  // Getters and setters...
}

interface MusicDb : LogicalDb {
  val music: MusicTable
}

interface MusicTable : LogicalTable<MusicItem> {
  val albumInfo: InlineView<AlbumInfo.Key, AlbumInfo>
  val albumTracks: InlineView<AlbumTrack.Key, AlbumTrack>

  val playlistInfo: InlineView<PlaylistInfo.Key, PlaylistInfo>

  // Global Secondary Indexes.
  val albumInfoByGenre: SecondaryIndex<AlbumInfo.GenreIndexOffset, AlbumInfo>
  val albumInfoByArtist: SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo>

  // Local Secondary Indexes.
  val albumTracksByTitle: SecondaryIndex<AlbumTrack.TitleIndexOffset, AlbumTrack>
}

@DynamoDBTable(tableName = "music_items")
class MusicItem {
  // Primary key.
  @DynamoDBHashKey
  @DynamoDBIndexRangeKey(globalSecondaryIndexNames = ["genre_album_index", "artist_album_index"])
  var partition_key: String? = null
  @DynamoDBRangeKey
  var sort_key: String? = null
  // Attributes...
}

public interface MusicDb extends LogicalDb {
  MusicTable music();
}

public interface MusicTable extends LogicalTable<MusicItem> {
  InlineView<AlbumInfo.Key, AlbumInfo> albumInfo();
  InlineView<AlbumTrack.Key, AlbumTrack> albumTracks();

  InlineView<PlaylistInfo.Key, PlaylistInfo> playlistInfo();

  // Global Secondary Indexes.
  SecondaryIndex<AlbumInfo.GenreIndexOffset, AlbumInfo> albumInfoByGenre();
  SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo> albumInfoByArtist();

  // Local Secondary Indexes.
  SecondaryIndex<AlbumTrack.TitleIndexOffset, AlbumTrack> albumTracksByTitle();
}

@DynamoDBTable(tableName = "music_items")
public class MusicItem {
  // All Items.
  String partition_key = null;
  String sort_key = null;
  // Attributes...

  @DynamoDBHashKey(attributeName = "partition_key")
  @DynamoDBIndexRangeKey(globalSecondaryIndexNames = {"genre_album_index", "artist_album_index"})
  public String getPartitionKey() {
    return partition_key;
  }

  public void setPartitionKey(String partition_key) {
    this.partition_key = partition_key;
  }

  @DynamoDBRangeKey(attributeName = "sort_key")
  public String getSortKey() {
    return sort_key;
  }

  public void setSortKey(String sort_key) {
    this.sort_key = sort_key;
  }
  // Getters and setters...
}

Tempest has these components:

Logical DB
- Logical tables (1 to 1 with your DynamoDBMapper classes)
  - Inline views
    - Key type
    - Item type
  - Secondary indexes
    - Offset type
    - Item type

Logical DB

A LogicalDb is a collection of tables that implement the DynamoDB best practice of putting multiple item types into the same storage table. This makes it possible to perform aggregate operations and transactions on those item types.

For example, you can batch load up to 100 items in a single request.

Kotlin | Java

val items = db.batchLoad(
  AlbumTrack.Key("ALBUM_1", "TRACK_5"),
  AlbumTrack.Key("ALBUM_2", "TRACK_3"),
  PlaylistInfo.Key("PLAYLIST_1"))

ItemSet items = db.batchLoad(
    List.of(
        new AlbumTrack.Key("ALBUM_1", "TRACK_5"),
        new AlbumTrack.Key("ALBUM_2", "TRACK_3"),
        new PlaylistInfo.Key("PLAYLIST_1")));
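The 100-item ceiling applies to a single request. If an application holds more keys than that, one option is to split them into chunks before loading. The helper below is a hypothetical plain-Java sketch of that chunking step only. It does not call Tempest, and whether `batchLoad` already splits large requests internally is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Tempest or the AWS SDK.
class BatchChunker {
  // Per-request item limit mentioned in the text above.
  static final int MAX_BATCH_SIZE = 100;

  // Splits keys into consecutive chunks of at most maxSize elements, preserving order.
  static <T> List<List<T>> chunk(List<T> keys, int maxSize) {
    if (maxSize <= 0) {
      throw new IllegalArgumentException("maxSize must be positive");
    }
    List<List<T>> chunks = new ArrayList<>();
    for (int start = 0; start < keys.size(); start += maxSize) {
      int end = Math.min(start + maxSize, keys.size());
      // Copy the sublist so each chunk is independent of the source list.
      chunks.add(new ArrayList<>(keys.subList(start, end)));
    }
    return chunks;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<>();
    for (int i = 1; i <= 250; i++) {
      keys.add("ALBUM_" + i);
    }
    // 250 keys become chunks of 100, 100, and 50.
    for (List<String> batch : chunk(keys, MAX_BATCH_SIZE)) {
      System.out.println(batch.size());
    }
  }
}
```

Each chunk could then be passed to one batch load call.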

To create a LogicalDb, you need to pass in an instance of DynamoDbEnhancedClient (SDK 2.x) or DynamoDBMapper (SDK 1.x).

Kotlin - SDK 2.x | Java - SDK 2.x | Kotlin - SDK 1.x | Java - SDK 1.x

val enhancedClient = DynamoDbEnhancedClient.create()
val db: MusicDb = LogicalDb(enhancedClient)

DynamoDbEnhancedClient enhancedClient = DynamoDbEnhancedClient.create();
MusicDb db = LogicalDb.create(MusicDb.class, enhancedClient);

val client: AmazonDynamoDB = AmazonDynamoDBClientBuilder.standard().build()
val mapper: DynamoDBMapper = DynamoDBMapper(client)
val db: MusicDb = LogicalDb(mapper)

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDBMapper mapper = new DynamoDBMapper(client);
MusicDb db = LogicalDb.create(MusicDb.class, mapper);

Optional Configuration

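When you create an instance of DynamoDBMapper (SDK 1.x), it has certain default behaviors; you can override these defaults by using the DynamoDBMapperConfig class. With SDK 2.x, you customize the DynamoDbEnhancedClient through its builder instead.

The following code snippets create a DynamoDbEnhancedClient or a DynamoDBMapper with custom settings: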

Kotlin - SDK 2.x | Java - SDK 2.x | Kotlin - SDK 1.x | Java - SDK 1.x

val client = DynamoDbClient.create()
val enhancedClient = DynamoDbEnhancedClient.builder()
  .dynamoDbClient(client)
  .extensions(listOf(/* ... */))
  .build()
val db: MusicDb = LogicalDb(enhancedClient)

DynamoDbClient client = DynamoDbClient.create();
DynamoDbEnhancedClient enhancedClient = DynamoDbEnhancedClient.builder()
    .dynamoDbClient(client)
    .extensions(List.of(/* ... */))
    .build();
MusicDb db = LogicalDb.create(MusicDb.class, enhancedClient);

val client = AmazonDynamoDBClientBuilder.standard().build()
val mapperConfig = DynamoDBMapperConfig.builder()
  .withSaveBehavior(SaveBehavior.CLOBBER)
  .withConsistentReads(ConsistentReads.CONSISTENT)
  .withTableNameOverride(null)
  .withPaginationLoadingStrategy(PaginationLoadingStrategy.EAGER_LOADING)
  .build()
val mapper = DynamoDBMapper(client, mapperConfig)
val db: MusicDb = LogicalDb(mapper)

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDBMapperConfig mapperConfig = DynamoDBMapperConfig.builder()
    .withSaveBehavior(DynamoDBMapperConfig.SaveBehavior.CLOBBER)
    .withConsistentReads(DynamoDBMapperConfig.ConsistentReads.CONSISTENT)
    .withTableNameOverride(null)
    .withPaginationLoadingStrategy(DynamoDBMapperConfig.PaginationLoadingStrategy.EAGER_LOADING)
    .build();
DynamoDBMapper mapper = new DynamoDBMapper(client, mapperConfig);
MusicDb db = LogicalDb.create(MusicDb.class, mapper);

For more information, see the DynamoDBMapper documentation.

Logical Table

A LogicalTable is a collection of views on a DynamoDB table that makes it easy to model heterogeneous items using strongly typed data classes.

Inline View

An InlineView lets you perform CRUD operations, queries, and scans on an entity type.

Kotlin | Java

interface MusicTable : LogicalTable<MusicItem> {
  val albumInfo: InlineView<AlbumInfo.Key, AlbumInfo>
}

data class AlbumInfo(
  @Attribute(name = "partition_key")
  val album_token: String,
  val album_title: String,
  val artist_name: String,
  val release_date: LocalDate,
  val genre_name: String
) {
  @Attribute(prefix = "INFO_")
  val sort_key: String = ""

  data class Key(
    val album_token: String
  ) {
    val sort_key: String = ""
  }
}

public interface MusicTable extends LogicalTable<MusicItem> {
  InlineView<AlbumInfo.Key, AlbumInfo> albumInfo();
}

public class AlbumInfo {
  @Attribute(name = "partition_key")
  public final String album_token;
  public final String album_title;
  public final String artist_name;
  public final LocalDate release_date;
  public final String genre_name;

  @Attribute(prefix = "INFO_")
  public final String sort_key = "";

  public AlbumInfo(
      String album_token,
      String album_title,
      String artist_name,
      LocalDate release_date,
      String genre_name) {
    this.album_token = album_token;
    this.album_title = album_title;
    this.artist_name = artist_name;
    this.release_date = release_date;
    this.genre_name = genre_name;
  }

  public static class Key {
    public final String album_token;
    public final String sort_key = "";

    public Key(String album_token) {
      this.album_token = album_token;
    }
  }
}

The albumInfo view is a type-safe way to access AlbumInfo entities:

Primary Key		Attributes
partition_key	sort_key
ALBUM_1	INFO	album_title	album_artist	release_date	genre
		The Dark Side of the Moon	Pink Floyd	1973-03-01	Progressive rock
...
ALBUM_2	INFO	album_title	album_artist	release_date	genre
		The Wall	Pink Floyd	1979-11-30	Progressive rock
...

Prefixes are 1:1 with Types

A LogicalTable can have multiple InlineViews. Tempest requires you to declare a prefix on the sort key in each entity type. It uses the prefix to determine the entity type.

Prefix	Type
INFO_	AlbumInfo
TRACK_	AlbumTrack
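As a rough picture of how a sort-key prefix can select an entity type, the plain-Java sketch below keeps a prefix-to-type registry and resolves a stored sort key against it. The `PrefixTypeResolver` class is a hypothetical illustration, not Tempest's internal implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; Tempest's actual prefix handling may differ.
class PrefixTypeResolver {
  private final Map<String, String> typeByPrefix = new LinkedHashMap<>();

  // Registers a prefix, rejecting duplicates so each prefix maps to exactly one type.
  PrefixTypeResolver register(String prefix, String typeName) {
    if (typeByPrefix.putIfAbsent(prefix, typeName) != null) {
      throw new IllegalArgumentException("Prefix already registered: " + prefix);
    }
    return this;
  }

  // Returns the type whose prefix the sort key starts with, if any.
  Optional<String> resolve(String sortKey) {
    for (Map.Entry<String, String> entry : typeByPrefix.entrySet()) {
      if (sortKey.startsWith(entry.getKey())) {
        return Optional.of(entry.getValue());
      }
    }
    return Optional.empty();
  }

  static PrefixTypeResolver musicTable() {
    return new PrefixTypeResolver()
        .register("INFO_", "AlbumInfo")
        .register("TRACK_", "AlbumTrack");
  }

  public static void main(String[] args) {
    PrefixTypeResolver resolver = musicTable();
    System.out.println(resolver.resolve("TRACK_1"));
    System.out.println(resolver.resolve("INFO_"));
  }
}
```

Rejecting duplicate registrations is what keeps the mapping one-to-one in this sketch.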

Kotlin | Java

interface MusicTable : LogicalTable<MusicItem> {
  val albumInfo: InlineView<AlbumInfo.Key, AlbumInfo>
  val albumTracks: InlineView<AlbumTrack.Key, AlbumTrack>
}

data class AlbumInfo(
  @Attribute(name = "partition_key")
  val album_token: String,
  // ...
) {
  @Attribute(prefix = "INFO_")
  val sort_key: String = ""
}

data class AlbumTrack(
  @Attribute(name = "partition_key")
  val album_token: String,
  @Attribute(name = "sort_key", prefix = "TRACK_")
  val track_token: String,
  // ...
)

public interface MusicTable extends LogicalTable<MusicItem> {
  InlineView<AlbumInfo.Key, AlbumInfo> albumInfo();
  InlineView<AlbumTrack.Key, AlbumTrack> albumTracks();
}

public class AlbumInfo {
  @Attribute(name = "partition_key")
  public final String album_token;
  // ...
  @Attribute(prefix = "INFO_")
  public final String sort_key = "";
  // ...
}

public class AlbumTrack {
  @Attribute(name = "partition_key")
  public final String album_token;
  @Attribute(name = "sort_key", prefix = "TRACK_")
  public final String track_token;
  // ...
}

Secondary Index

A SecondaryIndex lets you perform queries and scans on an entity type.

Kotlin | Java

interface MusicTable : LogicalTable<MusicItem> {
  val albumInfoByArtist: SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo>
}

data class AlbumInfo(
  @Attribute(name = "partition_key")
  val album_token: String,
  val album_title: String,
  val artist_name: String,
  val release_date: LocalDate,
  val genre_name: String
) {
  @Attribute(prefix = "INFO_")
  val sort_key: String = ""

  @ForIndex("artist_album_index")
  data class ArtistIndexOffset(
    val artist_name: String,
    val album_token: String? = null,
    // To uniquely identify an item in pagination.
    val sort_key: String? = null
  )
}

public interface MusicTable extends LogicalTable<MusicItem> {
  SecondaryIndex<AlbumInfo.ArtistIndexOffset, AlbumInfo> albumInfoByArtist();
}

public class AlbumInfo {
  @Attribute(name = "partition_key")
  public final String album_token;
  public final String album_title;
  public final String artist_name;
  public final LocalDate release_date;
  public final String genre_name;

  @Attribute(prefix = "INFO_")
  public final String sort_key = "";

  @ForIndex(name = "artist_album_index")
  public static class ArtistIndexOffset {
    public final String artist_name;
    @Nullable
    public final String album_token;
    // To uniquely identify an item in pagination.
    @Nullable
    public final String sort_key;

    public ArtistIndexOffset(String artist_name) {
      this(artist_name, null, null);
    }

    public ArtistIndexOffset(String artist_name, String album_token) {
      this(artist_name, album_token, null);
    }

    public ArtistIndexOffset(String artist_name, @Nullable String album_token,
        @Nullable String sort_key) {
      this.artist_name = artist_name;
      this.album_token = album_token;
      this.sort_key = sort_key;
    }
  }
}

DynamoDB secondary indexes allow duplicate key values. To uniquely identify an item during pagination, a secondary index offset type needs to include the primary index partition key and sort key in addition to the secondary index partition key and sort key.

Secondary index offset types are also required to have a @ForIndex annotation that tells Tempest the index name.
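To see why the offset carries the primary key as well, consider the plain-Java sketch below. It pages through index entries ordered by (artist_name, album_token, sort_key) and resumes strictly after a given offset. The sample entries are hypothetical and deliberately share the same index key; the `page` helper is an illustration, not a Tempest or DynamoDB API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory sketch of offset-based pagination over an index.
class IndexPaginationSketch {
  static final class Offset {
    final String artistName;
    final String albumToken;
    final String sortKey;

    Offset(String artistName, String albumToken, String sortKey) {
      this.artistName = artistName;
      this.albumToken = albumToken;
      this.sortKey = sortKey;
    }

    @Override public String toString() {
      return artistName + "/" + albumToken + "/" + sortKey;
    }
  }

  // Index key first, then the table sort key as a tie-breaker for duplicates.
  static final Comparator<Offset> INDEX_ORDER =
      Comparator.comparing((Offset o) -> o.artistName)
          .thenComparing(o -> o.albumToken)
          .thenComparing(o -> o.sortKey);

  // Returns up to limit entries strictly after the offset, or from the start when it is null.
  static List<Offset> page(List<Offset> entries, Offset after, int limit) {
    List<Offset> sorted = new ArrayList<>(entries);
    sorted.sort(INDEX_ORDER);
    List<Offset> result = new ArrayList<>();
    for (Offset entry : sorted) {
      if (after != null && INDEX_ORDER.compare(entry, after) <= 0) {
        continue; // Already returned on an earlier page.
      }
      if (result.size() == limit) {
        break;
      }
      result.add(entry);
    }
    return result;
  }

  static List<Offset> sampleEntries() {
    // Hypothetical data: the first two entries share the index key (Pink Floyd, ALBUM_1).
    return List.of(
        new Offset("Pink Floyd", "ALBUM_1", "TRACK_1"),
        new Offset("Pink Floyd", "ALBUM_1", "INFO"),
        new Offset("Pink Floyd", "ALBUM_2", "INFO"));
  }

  public static void main(String[] args) {
    List<Offset> first = page(sampleEntries(), null, 1);
    System.out.println(first);
    System.out.println(page(sampleEntries(), first.get(0), 10));
  }
}
```

If the offset held only the index key, the two entries under (Pink Floyd, ALBUM_1) would be indistinguishable and the second page could not tell where the first one stopped.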

Properties are always mapped by name

Our secondary index data class properties have the exact same name as the properties in our DynamoDB mapper class. Tempest uses name equality to bind indexes, keys, and items to the DynamoDB mapper class. Each Tempest type represents a different logical subset of the available attributes.

The mapper class is just the union of the fields in each item, key, and secondary index.

MusicItem	AlbumInfo.Key	AlbumInfo.ArtistIndexOffset	AlbumInfo	AlbumTrack.Key	AlbumTrack
partition_key	partition_key	partition_key	partition_key	partition_key	partition_key
sort_key	sort_key	sort_key	sort_key	sort_key	sort_key
album_title			album_title
artist_name		artist_name	artist_name
release_date			release_date
genre			genre
track_title					track_title
run_length					run_length
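The name-equality rule can be checked mechanically. The sketch below uses plain Java reflection to compare field names between a mapper-like class and a view class, and reports any view field that has no same-named counterpart. The classes are simplified stand-ins declared for this example, and the check ignores `@Attribute` renames, so treat it as an illustration of the idea and not as Tempest's own validation.

```java
import java.lang.reflect.Field;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of binding by field name; not Tempest's implementation.
class NameBindingSketch {
  // Simplified stand-in for the mapper class: the union of all attributes.
  static class MusicItem {
    String partition_key;
    String sort_key;
    String album_title;
    String artist_name;
  }

  // Simplified stand-in for an index offset type: a subset of the mapper's fields.
  static class ArtistIndexOffset {
    String artist_name;
    String partition_key;
    String sort_key;
  }

  // A view whose field name does not match any mapper attribute.
  static class Misnamed {
    String artistName;
  }

  static Set<String> fieldNames(Class<?> type) {
    Set<String> names = new TreeSet<>();
    for (Field field : type.getDeclaredFields()) {
      if (!field.isSynthetic()) {
        names.add(field.getName());
      }
    }
    return names;
  }

  // Returns the view's field names that have no same-named field in the mapper class.
  static Set<String> unboundFields(Class<?> view, Class<?> mapper) {
    Set<String> missing = fieldNames(view);
    missing.removeAll(fieldNames(mapper));
    return missing;
  }

  public static void main(String[] args) {
    System.out.println(unboundFields(ArtistIndexOffset.class, MusicItem.class));
    System.out.println(unboundFields(Misnamed.class, MusicItem.class));
  }
}
```

An empty result means every field of the view can be bound by name; a non-empty result lists the names that would need fixing.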

Custom Attribute Types

Tempest uses DynamoDBMapper to encode and decode entities.

DynamoDBMapper supports these primitive Java types.

You may use DynamoDBTypeConverter (SDK 1.x) or AttributeConverter (SDK 2.x) to support custom attribute types.

Kotlin - SDK 2.x | Java - SDK 2.x | Kotlin - SDK 1.x | Java - SDK 1.x

@DynamoDbBean
class MusicItem {
  // ...
  @get:DynamoDbAttribute
  @get:DynamoDbConvertedBy(LocalDateTypeConverter::class)
  var release_date: LocalDate? = null
  // ...
}

internal class LocalDateTypeConverter : AttributeConverter<LocalDate> {
  override fun transformFrom(input: LocalDate): AttributeValue {
    return AttributeValue.builder().s(input.toString()).build()
  }

  override fun transformTo(input: AttributeValue): LocalDate {
    return LocalDate.parse(input.s())
  }

  override fun type(): EnhancedType<LocalDate> {
    return EnhancedType.of(LocalDate::class.java)
  }

  override fun attributeValueType(): AttributeValueType {
    return AttributeValueType.S
  }
}

@DynamoDbBean
public class MusicItem {
  // ...
  @DynamoDbAttribute("release_date")
  @DynamoDbConvertedBy(LocalDateTypeConverter.class)
  public LocalDate getReleaseDate() {
    return release_date;
  }
  // ...
}

class LocalDateTypeConverter implements AttributeConverter<LocalDate> {
  @Override public AttributeValue transformFrom(LocalDate input) {
    return AttributeValue.builder().s(input.toString()).build();
  }

  @Override public LocalDate transformTo(AttributeValue input) {
    return LocalDate.parse(input.s());
  }

  @Override public EnhancedType<LocalDate> type() {
    return EnhancedType.of(LocalDate.class);
  }

  @Override public AttributeValueType attributeValueType() {
    return AttributeValueType.S;
  }
}

@DynamoDBTable(tableName = "music_items")
class MusicItem {
  // ...
  @DynamoDBAttribute
  @DynamoDBTypeConverted(converter = LocalDateTypeConverter::class)
  var release_date: LocalDate? = null
  // ...
}

class LocalDateTypeConverter : DynamoDBTypeConverter<String, LocalDate> {
  override fun unconvert(string: String): LocalDate {
    return LocalDate.parse(string)
  }

  override fun convert(localDate: LocalDate): String {
    return localDate.toString()
  }
}

@DynamoDBTable(tableName = "music_items")
public class MusicItem {
  // ...
  @DynamoDBAttribute(attributeName = "release_date")
  @DynamoDBTypeConverted(converter = LocalDateTypeConverter.class)
  public LocalDate getReleaseDate() {
    return release_date;
  }
  // ...
}

class LocalDateTypeConverter implements DynamoDBTypeConverter<String, LocalDate> {
  @Override public String convert(LocalDate object) {
    return object.toString();
  }

  @Override public LocalDate unconvert(String object) {
    return LocalDate.parse(object);
  }
}
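The same string round-trip idea extends to other `java.time` types. As a sketch, the plain-Java example below converts the ISO-8601 `run_length` values from the music table (such as PT1M13S) to and from `Duration`. The `StringConverter` interface is a hypothetical stand-in declared here so the example runs without the AWS SDK; in a real project the logic would live in one of the SDK converter interfaces shown above.

```java
import java.time.Duration;
import java.time.LocalDate;

// Plain-Java sketch; StringConverter is a hypothetical stand-in for the SDK converter interfaces.
class ConverterSketch {
  interface StringConverter<T> {
    String convert(T value);

    T unconvert(String stored);
  }

  // Stores dates as ISO-8601 strings such as 1973-03-01.
  static final class LocalDateConverter implements StringConverter<LocalDate> {
    @Override public String convert(LocalDate value) {
      return value.toString();
    }

    @Override public LocalDate unconvert(String stored) {
      return LocalDate.parse(stored);
    }
  }

  // Stores run lengths as ISO-8601 durations such as PT1M13S.
  static final class DurationConverter implements StringConverter<Duration> {
    @Override public String convert(Duration value) {
      return value.toString();
    }

    @Override public Duration unconvert(String stored) {
      return Duration.parse(stored);
    }
  }

  public static void main(String[] args) {
    StringConverter<Duration> runLength = new DurationConverter();
    Duration speakToMe = runLength.unconvert("PT1M13S");
    System.out.println(speakToMe.getSeconds());       // prints 73
    System.out.println(runLength.convert(speakToMe)); // prints PT1M13S
  }
}
```

Because both `LocalDate` and `Duration` print and parse the same ISO-8601 text, the stored strings stay human-readable, and for dates they also sort in chronological order.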

Check out the code samples on GitHub:

Music Library - SDK 1.x (.kt, .java)
Music Library - SDK 2.x (.kt, .java)