A small library for importing/exporting BigTable instance schemas and row data.

Vandelay

An Importer/Exporter for BigTable

A small library for BigTable instance and data management. Vandelay allows you to export the full configuration of a BigTable instance along with the data for every row in every table. Instance configurations and table data can also be imported to recreate BigTable instances as they were at the time of export.

Features

  • Capture all configuration information for an instance (clusters, backups, app profiles, IAM policies)
  • Capture all information for tables -- row keys, column families, qualifiers, timestamps, values, labels, IAM policies, table backups
  • Export data and automatically delete BigTable instances
  • Import data and automatically (re)create BigTable instances from a saved export configuration
  • SLF4J integration for customized logging
  • Lightweight with minimal dependencies

Getting Started

Add the latest version of Vandelay to your project (Maven):

<dependency>
  <groupId>com.spotify</groupId>
  <artifactId>vandelay-bigtable</artifactId>
  <version>{latest.version.here}</version>
</dependency>

Overview

Vandelay provides a simple set of APIs to perform a full export and import of BigTable instances. The project is split across several modules to allow for better customization and future extensibility to other databases. As it stands, the project hierarchy is as follows:

  • vandelay-core: Common set of interfaces and objects used by the other modules. These can be extended/implemented to support other database types.
  • vandelay-serialization-core: Common set of interfaces, objects, and utility classes to support serialization. These can be extended/implemented to support custom serialization of database configurations and row data.
  • vandelay-serialization-jackson: A wrapper around Jackson that supports JSON serialization.
  • vandelay-bigtable: A library that exposes APIs for exporting and importing a BigTable instance.
  • vandelay-bigtable-demo: A demo project that has a sample implementation for exporting and importing a BigTable instance using the vandelay-bigtable library.

Architecture

Exporting

The export flow is shown in the diagram below.

Exporter architecture

Importing

The import flow is shown in the diagram below.

Importer architecture

Usage

Exporting an instance

Before beginning an export, verify that your GCP credentials have the appropriate IAM permissions.

The export process requires several setup steps, which are explained in further detail below. The vandelay-bigtable-demo project contains a command-line demo for exporting and importing a BigTable instance.

  1. Create an export configuration
  2. Create the BigTable admin and data client wrapper
  3. Create the exporter
  4. Subscribe to receive export messages
  5. Run the exporter
  6. Save the configuration and row data

Create an export configuration

An export configuration tells the exporter how to perform the export. It contains four options that must be specified:

  public VandelayBigTableExportConfiguration(
    /* Project ID where the BigTable instance resides */
    final String projectId,
    /* Instance ID of the BigTable instance */
    final String instanceId,
    /* True if the instance should be deleted after exporting */
    final boolean deleteOnExport,
    /* True if the export should be multi-threaded */
    final boolean parallelExport)

Instantiating an export configuration is a single constructor call:

  final VandelayBigTableExportConfiguration configuration =
    new VandelayBigTableExportConfiguration(projectId, instanceId, false, false);

Create the BigTable admin and data client wrapper

This class wraps the BigTable admin and data clients and is passed to the exporter. Simply create an instance as it requires no configuration.

  final VandelayBigTableClient client = new VandelayBigTableClient();

Create the exporter

With the configuration and client wrapper created, the exporter is ready to instantiate. The exporter takes the client wrapper as its only argument:

  final VandelayBigTableExporter exporter = new VandelayBigTableExporter(client);

Subscribe to receive export messages

The exporter will emit messages when certain events happen during the export process. Each event is wrapped in a VandelayBigTableExportEvent object containing the specific event object that was emitted. There are four types of export events defined in VandelayBigTableExportEvent:

  /* A row has been exported */
  private VandelayEvent<VandelayBigTableRow> rowExportEvent;
  
  /* A BigTable instance configuration has been exported */
  private VandelayEvent<VandelayBigTableGCPConfiguration> infoEvent;
  
  /* An export has finished */
  private VandelayEvent<VandelayBigTableRowsExportFinishedEvent> exportFinishedEvent;
  
  /* A BigTable instance has been deleted */
  private VandelayEvent<VandelayBigTableDeleteInstanceEvent> deleteInstanceEvent;

To subscribe to receive events, register as a listener after instantiating the exporter. You can then handle any events that you are interested in. At a basic level, a handler can look like the following:

  final VandelayBigTableRowBatch rows = new VandelayBigTableRowBatch();
  exporter.addListener(
    event -> {
      if (event.getInfoEvent().isPresent()) {
        // Handle instance information event
      } else if (event.getRowExportEvent().isPresent()) {
        // Handle row export event
      } else if (event.getExportFinishedEvent().isPresent()) {
        // Handle export finished event
      } else if (event.getDeleteEvent().isPresent()) {
        // Handle delete instance event
      }
    });
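
Inside the row-export branch of the handler above, the emitted row can be added to the rows batch so the batch can be serialized later. A minimal sketch, assuming VandelayBigTableRowBatch exposes an addRow method (a hypothetical name -- check the class for the actual accessor):

  if (event.getRowExportEvent().isPresent()) {
    // Collect each exported row so the full batch can be serialized after the export finishes
    rows.addRow(event.getRowExportEvent().get().getValue()); // addRow is an assumed method name
  }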

Run the exporter

After subscribing to receive export messages, the export process can begin with a call to exportTo:

  final Status exportStatus = exporter.exportTo(configuration);

Save the configuration and row data

The last step is to save the information exported from the BigTable instance: the instance configuration and the row data. The instance configuration is provided in a VandelayBigTableGCPConfiguration object and row data is provided in VandelayBigTableRow objects (see above). After subscribing to events from the exporter, you can serialize these objects in any format of your choosing. A VandelayJacksonSerializer is provided in the vandelay-serialization-jackson project that serializes to and deserializes from JSON. To perform serialization, instantiate a VandelayJacksonSerializer, call its serialize method with the desired object, and save the resulting content to a file.

  final VandelayJacksonSerializer<VandelayBigTableGCPConfiguration> configSerializer =
    new VandelayJacksonSerializer<>();
  final String config = configSerializer.serialize(event.getInfoEvent().get().getValue());
  writeToFile(configOutputPath, config);

  ...

  final VandelayJacksonSerializer<VandelayBigTableRowBatch> dataSerializer =
    new VandelayJacksonSerializer<>();
  final String data = dataSerializer.serialize(rows);
  writeToFile(dataOutputPath, data);
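
The writeToFile calls above are not part of the library; they stand in for whatever persistence mechanism you prefer. A minimal sketch using java.nio.file.Files (Java 11+, matching the Files.readString usage in the import section below):

  private static void writeToFile(final String path, final String content) throws IOException {
    // Persist the serialized content, creating the file or replacing any existing one
    Files.writeString(Path.of(path), content, StandardCharsets.UTF_8);
  }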

Importing an instance

Once you have a configuration and row data file, you can perform an import. This will re-create the BigTable instance as it was and populate it with the row data for each table.

The import process is as follows; the steps are mostly the reverse of the export process. As with exporting, you can reference the vandelay-bigtable-demo project for a command-line demo of exporting and importing a BigTable instance.

  1. Read and deserialize the instance configuration
  2. Read and deserialize the row data
  3. Create an import configuration
  4. Create the BigTable admin and data client wrapper
  5. Create the importer
  6. Import the instance configuration
  7. Import the row data

Read and deserialize the instance configuration

Once you have a serialized configuration file, you can simply read it in and deserialize it back into a VandelayBigTableGCPConfiguration object. The serialization format is at the user's discretion, but this project provides a default JSON serializer in the VandelayJacksonSerializer class. To deserialize, instantiate the VandelayJacksonSerializer and call the deserialize method.

  final VandelayJacksonSerializer<VandelayBigTableGCPConfiguration> configSerializer =
    new VandelayJacksonSerializer<>();
  final String configContent = Files.readString(Path.of(configInputPath));
  final VandelayBigTableGCPConfiguration config =
    configSerializer.deserialize(configContent, VandelayBigTableGCPConfiguration.class);

Read and deserialize the row data

Reading and deserializing the row data follows the same process as above: simply instantiate the serializer and deserialize the content:

  final VandelayJacksonSerializer<VandelayBigTableRowBatch> dataSerializer =
    new VandelayJacksonSerializer<>();
  final String dataContent = Files.readString(Path.of(dataInputPath));
  final VandelayBigTableRowBatch rowBatch =
    dataSerializer.deserialize(dataContent, VandelayBigTableRowBatch.class);

Create an import configuration

Once you have a deserialized configuration in a VandelayBigTableGCPConfiguration object, you can create an import configuration from it. Like the export configuration, the import configuration contains five options that must be specified:

  public VandelayBigTableImportConfiguration(
    /* Project ID where the BigTable instance resides */
    final String projectId,
    /* Instance ID of the BigTable instance */
    final String instanceId,
    /* The instance configuration that was exported */
    final VandelayBigTableGCPConfiguration gcpConfiguration,
    /* True to create the instance if it doesn't exist already */
    final boolean createInstanceIfDoesntExist,
    /* True to create tables if they don't exist */
    final boolean createTableIfDoesntExist)

The import configuration is instantiated as follows:

  final VandelayBigTableImportConfiguration importConfiguration =
    new VandelayBigTableImportConfiguration(projectId, instanceId, config, true, true);

Create the BigTable admin and data client wrapper

This class wraps the BigTable admin and data clients and is passed to the importer. Simply create an instance as it requires no configuration.

  final VandelayBigTableClient client = new VandelayBigTableClient();

Create the importer

With the configuration and client wrapper created, the importer is ready to instantiate. The importer takes the client wrapper as its only argument:

  final VandelayBigTableImporter importer = new VandelayBigTableImporter(client);

Import the instance configuration

Before importing row data, the instance configuration should be imported so that the BigTable instance can be recreated as it was. To perform this step of the import process, simply pass in the created VandelayBigTableImportConfiguration to the importer's importFrom method.

  final Status status = importer.importFrom(importConfiguration);

If the returned Status indicates success, the BigTable instance has been recreated. At this point, the import of the row data can begin.

Import the row data

Importing the row data is the last step in the import process. The importer defines an addRows method for this purpose, which takes two arguments:

  public Status addRows(
    /* The import configuration for the BigTable instance */
    final VandelayBigTableImportConfiguration importConfiguration,
      
    /* A list of rows to import */
    final List<VandelayBigTableRow> rows)

The list of VandelayBigTableRow contains the row data to import with the call. The API is designed in this manner to allow users to define their own batching policies for importing. You can import all rows with one call, or import data across multiple calls to addRows with a subset of rows.
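
For example, a simple chunked import over the deserialized row data could look like the sketch below. It assumes VandelayBigTableRowBatch exposes its rows as a List<VandelayBigTableRow> through a getRows() accessor (a hypothetical name -- substitute the actual accessor):

  // Import the deserialized rows in fixed-size chunks rather than in one large call
  final List<VandelayBigTableRow> allRows = rowBatch.getRows(); // getRows() is an assumed accessor
  final int batchSize = 1000;
  for (int i = 0; i < allRows.size(); i += batchSize) {
    final List<VandelayBigTableRow> chunk =
      allRows.subList(i, Math.min(i + batchSize, allRows.size()));
    final Status chunkStatus = importer.addRows(importConfiguration, chunk);
    // Check chunkStatus and retry or abort as appropriate for your use case
  }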

Contributing

Contributions to the project are always welcome! This project uses Maven to build and release.

To build all modules, run the following:

mvn clean compile

To run the tests locally, run the following:

mvn clean verify

If you are a maintainer, you can release a new version by running the following:

mvn release:prepare

Code of Conduct

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.
