Gno.land Transaction Indexer User Guide

Data stored on Gno.land and other blockchains is not readily searchable, since it is stored within blocks and requires cryptographic work to read. Transaction indexers store enough data off-chain to give other applications powerful, low-latency query access to on-chain data, along with the ability to subscribe to events in real time. This is a critical service for data-rich applications such as social networks, DeFi platforms, supply chain logistics projects, and security applications, to name a few.

The Gno.land transaction indexer, tx-indexer, is a tool for indexing and serving block and transaction data from Gno.land and other Tendermint2 blockchains. It provides a powerful foundation of data access in the form of a GraphQL query interface, allowing developers to build applications to efficiently serve on-chain data.

This page will guide the user through installation, cover basic usage and best practices, and discuss considerations for advanced use cases by way of two examples: 1) an application for detecting Sybil attacks against projects running on Gno.land, and 2) a predictive analytics platform for supply chain logistics.

Key Features:

Uses the JSON-RPC 2.0 standard.
Supports both synchronous HTTP requests and 2-way WebSocket connections for asynchronously subscribing to new events.
Utilizes asynchronous workers for low-latency indexing. Data is available for serving queries as soon as it is fetched from the remote chain.
Built on an embedded PebbleDB back-end for quick on-disk data access and migration.

Contents:

Installation and deployment
Basic usage
Best practices
Advanced use cases
Troubleshooting
Appendix: GraphQL Query Schemas

Additional Resources:

Installation and deployment

Prerequisite: Go >=1.22.

First, clone the repository.

 git clone https://github.com/gnolang/tx-indexer.git
 cd tx-indexer

Then, either 1) build and run a binary, or 2) run directly from Go source.

In either case, you must point the indexer at a remote: the JSON-RPC endpoint of the blockchain you want to index. This guide uses Gno.land Test3, so the remote is http://test3.gno.land:36657.

Option 1:

Build the binary
```
 make build
```

To view all features and flags, run:

./build/tx-indexer start --help

Output:

 DESCRIPTION
   Starts the indexer service

 USAGE
   start [flags]

 Starts the indexer service, which includes the fetcher and JSON-RPC server

 FLAGS
   -db-path indexer-db             the absolute path for the indexer DB (embedded)
   -http-rate-limit 0              the maximum HTTP requests allowed per minute per IP, unlimited by default
   -listen-address 0.0.0.0:8546    the IP:PORT URL for the indexer JSON-RPC server
   -log-level info                 the log level for the CLI output
   -max-chunk-size 100             the range for fetching blockchain data by a single worker
   -max-slots 100                  the amount of slots (workers) the fetcher employs
   -remote http://127.0.0.1:26657  the JSON-RPC URL of the Gno chain

Run the Indexer

 ./build/tx-indexer start --remote http://test3.gno.land:36657 --db-path indexer-db

Option 2:

Run the indexer from source code.

 go run cmd/main.go cmd/start.go cmd/waiter.go start --remote http://test3.gno.land:36657 --db-path indexer-db

Output:

Whichever way you run the indexer, you should see output like the following as it starts:

2024/08/10 09:33:00 [JOB 1] WAL file indexer-db/000376.log with log number 000376 stopped reading at offset: 0; replayed 0 keys in 0 batches
2024-08-10T09:33:00.782-0700  INFO  http-server serve/server.go:46  HTTP server started {"address": "[::]:8546"}
2024-08-10T09:33:01.059-0700  INFO  fetcher fetch/fetch.go:111  Fetching range  {"from": 520958, "to": 521047}
2024-08-10T09:33:01.648-0700  INFO  fetcher fetch/fetch.go:224  Added to batch block and tx data for range  {"from": 520958, "to": 521047}

Basic usage

The Gno.land transaction indexer offers a GraphQL query interface for block and transaction data on Gno.land (and other Tendermint2 blockchains).

See Appendix: GraphQL Query Schemas or browse the GraphQL playground’s embedded Documentation Explorer for reference.

Use the GraphQL playground

Visit the Transaction Indexer’s GraphQL playground at http://{host}:8546/graphql.

Open the Documentation Explorer by clicking the button in the upper left corner of the page to browse the specifications of available fields and filters.

Enter a query and hit the play button.

query {
 transactions(filter: {
 }) {
   hash
   messages {
     value {
       __typename
       ... on BankMsgSend {
         from_address
         to_address
         amount
       }
     }
   }
 }
}

Use the GraphQL API

Submit queries to the API at http://{host}:8546/graphql/query.

Synchronous queries

Use curl to send synchronous queries over HTTP. For example, the following returns the creator, package name, and path for all transactions with add_package messages:

curl -X POST http://localhost:8546/graphql/query \
     -H "Content-Type: application/json" \
     -d '{"query": "{ transactions(filter: { message: {vm_param: {add_package: {}}}}) { index hash block_height gas_used messages { route typeUrl value { __typename ... on MsgAddPackage { creator package { name path } } } } } }"}' | jq

Output:

...
{
  "index": 0,
  "hash": "ONzGf1xTNdh/1KeWK72pTI5y8CfsehdVkmbEYM6M4ew=",
  "block_height": 480685,
  "gas_used": 68947,
  "messages": [
    {
      "route": "vm",
      "typeUrl": "add_package",
      "value": {
        "__typename": "MsgAddPackage",
        "creator": "g1778y2yphxs2wpuaflsy5y9qwcd4gttn4g5yjx5",
        "package": {
          "name": "moodtest1",
          "path": "gno.land/r/michelle22/moodtest1"
        }
      }
    }
  ]
},
{
  "index": 0,
  "hash": "hTkVl8eaFfqNLdhDgzFhHYPLit/XCNwZdPaL9o5fN/Y=",
  "block_height": 480688,
  "gas_used": 68997,
  "messages": [
    {
      "route": "vm",
      "typeUrl": "add_package",
      "value": {
        "__typename": "MsgAddPackage",
        "creator": "g1778y2yphxs2wpuaflsy5y9qwcd4gttn4g5yjx5",
        "package": {
          "name": "moodtest1",
          "path": "gno.land/r/michelle22/moodtest1"
        }
      }
    }
  ]
}
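
The same request can also be issued from application code. The following is a minimal sketch using only the Python standard library. The helper names `build_request` and `run_query` are illustrative (they are not part of tx-indexer), and the URL assumes an indexer running locally on the default listen port.

```python
import json
import urllib.request

# Assumption: a locally running indexer on the default listen address.
INDEXER_URL = "http://localhost:8546/graphql/query"

# Same filter as the curl example above, with a shorter selection set.
ADD_PACKAGE_QUERY = """
{
  transactions(filter: { message: { vm_param: { add_package: {} } } }) {
    hash
    block_height
    messages {
      value {
        __typename
        ... on MsgAddPackage { creator package { name path } }
      }
    }
  }
}
"""


def build_request(query: str, url: str = INDEXER_URL) -> urllib.request.Request:
    """Wrap a GraphQL query in a JSON body and prepare an HTTP POST."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = INDEXER_URL, timeout: float = 10.0) -> dict:
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `run_query(ADD_PACKAGE_QUERY)` against a running indexer should return a dictionary whose transaction entries have the shape shown in the output above. Letting `json.dumps` build the body avoids hand-escaping quotes inside the query string.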

Asynchronous queries

You can also use WebSockets to subscribe to query results asynchronously.

Open the WebSocket connection to the GraphQL query interface:
```
 wscat -c ws://localhost:8546/graphql/query
```
Initialize the connection.
```
 { "type": "connection_init", "payload": {} }
```
This should return an acknowledgment message, after which you will see periodic keep-alive messages (ka).

Output:
```
 {"type":"connection_ack"}

 < {"type":"ka"}
```

Subscribe to a query.

The following example watches for all new transactions and returns the relevant fields for each message type: BankMsgSend, MsgCall, MsgAddPackage, or MsgRun.

 {"id":"1","type":"start","payload":{"query":"subscription { transactions(filter: {}) { hash gas_used messages { route typeUrl value { __typename ... on BankMsgSend { from_address to_address amount } ... on MsgCall { caller send pkg_path func args } ... on MsgAddPackage { creator package { name path files { name body } } deposit } ... on MsgRun { caller send package { name path files { name body } } } } } } }"}}
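
Writing these frames by hand on a single line is error-prone. As a minimal sketch, the frames can be generated from a readable multi-line query. The helper names `init_frame` and `start_frame` are illustrative, and the subscription below is a shortened version of the one above.

```python
import json

# A shortened version of the subscription shown above.
SUBSCRIPTION = """
subscription {
  transactions(filter: {}) {
    hash
    gas_used
    messages {
      route
      typeUrl
      value {
        __typename
        ... on BankMsgSend { from_address to_address amount }
        ... on MsgCall { caller send pkg_path func args }
      }
    }
  }
}
"""


def init_frame() -> str:
    """Frame that initializes the connection."""
    return json.dumps({"type": "connection_init", "payload": {}})


def start_frame(query: str, sub_id: str = "1") -> str:
    """Frame that starts a subscription under the given id."""
    return json.dumps({"id": sub_id, "type": "start", "payload": {"query": query}})


if __name__ == "__main__":
    # Each frame is a single line that can be pasted into wscat.
    print(init_frame())
    print(start_frame(SUBSCRIPTION))
```

Because `json.dumps` escapes the newlines in the query, each frame is emitted as one line, which is what an interactive client such as wscat expects.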

Best practices

Mitigate injection attacks

When exposing a GraphQL interface, it is essential to implement safeguards against injection attacks. Any opportunity for users to submit queries must be treated with care. Possible tactics for avoiding disruptions include validating and sanitizing inputs, setting query complexity limits, configuring rate and package size limits, and requiring authentication and sufficient authorization to define custom queries.

Keep in mind that both malicious and accidental behavior can overwhelm a transaction indexer service that is insufficiently prepared.
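
As one illustration of a complexity limit, a proxy placed in front of the indexer could reject queries that are too large or too deeply nested before forwarding them. The sketch below is a rough pre-filter, not a GraphQL parser: it counts curly-brace nesting (including braces in filter arguments, so it overestimates selection depth) and skips over quoted strings. The function names and default limits are assumptions chosen for the example.

```python
def query_depth(query: str) -> int:
    """Return the maximum curly-brace nesting depth, ignoring quoted strings."""
    depth = 0
    max_depth = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def is_acceptable(query: str, max_depth: int = 8, max_bytes: int = 4096) -> bool:
    """Accept a query only if it is within the size and nesting limits."""
    if len(query.encode("utf-8")) > max_bytes:
        return False
    return query_depth(query) <= max_depth
```

A check like this complements, rather than replaces, rate limiting (the -http-rate-limit flag) and authentication.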

Back up indexed data

Backing up indexed data does not appear to be possible as far as I can see (I found no mention of creating backups when searching the repo), so I’m curious what the expert recommendation is. For running any kind of dependable application, given that indexing would take a while (potentially days or more), I would think that either physical or logical replication would be desirable to avoid having to re-index. However, PebbleDB doesn’t appear to support backups. For that matter, I’m curious why the tx-indexer uses PebbleDB in the first place. Is the intent to eventually evolve the tx-indexer to operate on a distributed datastore? (Just guessing here, since PebbleDB is the custom back-end for CockroachDB, which is built to be distributed.)

The indexer appears to be immediately available for queries–although I notice that transactions all still have “index: 0”, which seems to mean that an index has not been assigned yet (but then how is the data available to search, isn’t that what the index/record is for?) I’m not sure what that’s about, or how querying would be possible immediately without the indexer needing to do the indexing work, but I suspect that it may just be that indexing is very fast (seconds scale) on this test network with relatively little data, whereas it would take significant time (hours to eventually days/weeks once there is a lot of data) on a production main net. I’m guessing at a lot of background assumptions here, but that’s my expectation from what I’ve read about chain indexing in general.

But is there a backup functionality planned for the indexer, or what is the recommendation here? Without another option, I would try the below strategy of just making sure you have enough instances running independently to cover the likely rate of crash, given the likely latency to recovery. But given that re-indexing latency might be on the order of days, one would need to be extremely careful with overloading instances. For that reason, you would want to really carefully configure your rate limits, which would make it difficult to maintain availability if demand is bursty or unpredictable.

Ensure service availability

Off the cuff, I would suggest using Kubernetes (or an alternative) to run several indexer instances on different machines with a load balancer in front of them. But I don’t know what happens if a node crashes, how likely it is to need to re-index, or how to manage that. What happens while indexing is in progress, and how do you know when an indexer node is ready to serve data, i.e. when it has completed indexing? It’s one thing to subscribe to new blocks and transactions; perhaps that can happen before indexing of old blocks is complete. But presumably multi-block searches either perform strangely or are blocked before indexing is complete, right? So to configure the node health check for k8s, you’d just have to know how to ask the tx-indexer whether it has finished indexing and is ready to serve queries on old data (the ‘readiness check’), or is only ready to support subscriptions (the ‘liveness check’).

Also, given that spin-up time in production could be very long, a very generous predictive allocation of resources would be appropriate. If it takes a day to spin up a new server, you need to be able to predict additional demand a day in advance.
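
If both the height of the latest block stored by an indexer instance and the current height of the remote chain can be obtained (an assumption; this guide does not confirm an endpoint for either), a readiness decision reduces to comparing the two. The sketch below shows only that comparison. The function names and the `max_lag` threshold are hypothetical.

```python
def sync_progress(indexed_height: int, chain_height: int) -> float:
    """Fraction of the chain that has been indexed, between 0.0 and 1.0."""
    if chain_height <= 0:
        return 0.0
    return min(indexed_height / chain_height, 1.0)


def is_ready(indexed_height: int, chain_height: int, max_lag: int = 10) -> bool:
    """Treat the instance as ready when it trails the chain by at most max_lag blocks."""
    return chain_height - indexed_height <= max_lag
```

A health-check handler could return success only when `is_ready` is true, so that a load balancer keeps traffic away from instances that are still catching up.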

Monitor instance health and performance

To run a reliable transaction indexer service, you’d need to monitor the performance and health of your transaction indexer instances in terms of things like memory usage, indexing speed, and query response times.

Aside from health checks, if nothing else, you need to learn to tune http-rate-limit, max-chunk-size, and max-slots to use your resources effectively without risking crashes or service limitations.
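
For a rough sense of how max-chunk-size and max-slots interact, the following back-of-the-envelope sketch estimates how many chunks a full sync requires and how many rounds of parallel fetching that implies. It assumes each worker fetches one chunk at a time and that rounds proceed in lockstep, which is a simplification of whatever scheduling the fetcher actually uses. The function name is hypothetical.

```python
def fetch_plan(total_blocks: int, max_chunk_size: int = 100, max_slots: int = 100) -> tuple[int, int]:
    """Return (chunks, rounds) needed to fetch total_blocks.

    chunks: number of block ranges of at most max_chunk_size blocks.
    rounds: number of passes if max_slots chunks are fetched in parallel.
    """
    if total_blocks < 0 or max_chunk_size <= 0 or max_slots <= 0:
        raise ValueError("total_blocks must be >= 0 and limits must be > 0")
    chunks = -(-total_blocks // max_chunk_size)  # ceiling division
    rounds = -(-chunks // max_slots)
    return chunks, rounds
```

With the default flag values, a chain of 521,047 blocks (the height seen in the startup log above) works out to 5,211 chunks fetched in 53 rounds under these assumptions. Measuring the time per round on your own hardware gives a first estimate of total sync time.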

Ideally this doc would provide some guidance along these lines as well.

Stay updated

Keep tx-indexer updated to make use of the latest features and security patches. Regular updates ensure that the indexer benefits from the ongoing improvements in the Gno.land ecosystem, but the upgrade process needs to be carefully managed to avoid service disruption.

What are the plans for the evolution of the indexer and its feature set? Information about that could be helpful if things are in motion.

Advanced use cases

This section explores two advanced use cases for running a transaction indexer: a system to detect Sybil attacks against decentralized autonomous organizations (DAOs) and applications running on Gno.land, and a supply chain prediction tool. Both project architectures demonstrate the importance of the Gno.land transaction indexer for enabling valuable data-driven applications with efficient access to on-chain transaction data.

Sybil attack detection

A Sybil attack is when a single entity creates multiple fake identities. This type of threat is significant to decentralized computing and governance projects, as it can allow bad actors to gain disproportionate influence over a system, undermining its fairness and/or its ability to operate.

As an advanced use case for the transaction indexer, consider a Sybil detection system that would use data from Gno.land to identify suspicious patterns indicative of coordinated manipulation of multiple identities. As described by Jethava and Rao (2022), a hybrid framework combining behavior-based and graph-based analysis is promising for detecting Sybil attacks.

(Jethava, Gordhan, and Udai Pratap Rao. “User behavior-based and graph-based hybrid approach for detection of sybil attack in online social networks.” Computers and Electrical Engineering 99 (2022): 107753.)

Architecture:

Data Collection: Use the transaction indexer to query and subscribe to transactions involving social, financial, and governance-related activities on Gno.land. Focus on transactions that involve token transfers, governance votes, or any action that can indicate identity creation or manipulation.
Relationship-Graph Analysis: Model the strength and types of relationships among accounts in the network.
Behavioral Analysis: Use machine learning to flag suspicious patterns of behavior, such as coordinated bursts of transactions with similar characteristics originating from different accounts.
Detection and response: When the model detects suspicious behavior it can launch remediations or follow-up investigations, alert administrators or application developers, or take other automated actions.
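
To make the behavioral step concrete, the sketch below flags one simple pattern: several distinct senders transferring the same amount to the same recipient within a short span of blocks. It is a plain heuristic standing in for the machine-learning model described above, not the method from the cited paper. The input records are assumed to be flattened from BankMsgSend results returned by the indexer, and the function name, thresholds, and addresses used in any example data are hypothetical.

```python
from collections import defaultdict


def flag_bursts(transfers, window: int = 5, min_senders: int = 3) -> dict:
    """Find groups of similar transfers sent by many accounts in a short period.

    transfers: iterable of dicts with keys block_height, from_address,
               to_address, and amount.
    Returns a dict mapping (to_address, amount) to the sorted list of senders
    in the first span of fewer than `window` blocks that contains at least
    `min_senders` distinct senders.
    """
    groups = defaultdict(list)
    for t in transfers:
        key = (t["to_address"], t["amount"])
        groups[key].append((t["block_height"], t["from_address"]))

    flagged = {}
    for key, events in groups.items():
        events.sort()
        start = 0
        for end in range(len(events)):
            # Shrink the window until it spans fewer than `window` blocks.
            while events[end][0] - events[start][0] >= window:
                start += 1
            senders = {sender for _, sender in events[start:end + 1]}
            if len(senders) >= min_senders:
                flagged[key] = sorted(senders)
                break
    return flagged
```

Flagged groups would then feed the detection and response step, for example as candidates for review rather than as automatic verdicts, since legitimate activity such as an airdrop claim can produce the same pattern.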

Supply chain prediction

In complex supply chains, predicting shortages and managing inventory efficiently is crucial. This use case describes a tool that predicts supply chain shortages and automatically adjusts sourcing strategies based on data stored on Gno.land. The tool can prioritize preferred vendors but will switch to alternative suppliers if necessary to prevent disruptions in production.

Architecture:

Data Aggregation: The transaction indexer supplies relevant on-chain transaction data to an analysis pipeline, such as payments, transfers of tokens representing real-world goods and materials, and executions of other related smart contracts.
Predictive analysis: Predictive models analyze transaction data to forecast potential surpluses and shortages.
Monitoring and response: The system continuously monitors supply chain transactions to detect deviations from expected behavior. Based on the predictions, the tool automatically adjusts sourcing strategies. For example, if a shortage is predicted, the system can initiate orders with secondary suppliers or trigger other remediations by invoking smart contracts on Gno.land.
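
As a toy illustration of the predictive step, the sketch below forecasts the next period's supply with a simple moving average over recent periods and compares it with expected demand. A moving average is only a stand-in for the predictive models mentioned above. The function names and the idea of aggregating on-chain transfers into per-period quantities are assumptions made for the example.

```python
from statistics import fmean


def forecast_next(history: list[float], window: int = 3) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    if not history:
        raise ValueError("history must not be empty")
    if window <= 0:
        raise ValueError("window must be positive")
    return fmean(history[-window:])


def shortage_expected(history: list[float], demand: float, window: int = 3) -> bool:
    """True when forecast supply for the next period falls below demand."""
    return forecast_next(history, window) < demand
```

When `shortage_expected` returns true, the monitoring and response step could trigger a secondary order as described above.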

Appendix: GraphQL Query Schemas

This section details the query schemas for blocks and transactions and their accessory types. These schemas are defined in the tx-indexer codebase.

Block

Represents a blockchain block.

hash (String): A unique identifier for the block, computed as a Merkle tree from the block’s header.
height (Int): A unique integer identifier representing the block’s position in the blockchain, strictly increasing with each new block.
version (String): The software version of the node that created the block, indicating the specific implementation and version of the blockchain protocol used.
chain_id (String): An identifier for the specific blockchain network (e.g., mainnet, testnet) to which this block belongs.
time (Time): The timestamp in UTC when the block was proposed and finalized.
num_txs (Int): The number of transactions included in this block.
total_txs (Int): The cumulative total number of transactions that have occurred up to and including this block.
app_version (String): The version of the application running on the blockchain.
last_block_hash (String): The hash of the last committed block, providing a link to the previous block.
last_commit_hash (String): The commit hash from the validators for the last block.
validators_hash (String): The hash of the validators for the current block.
next_validators_hash (String): The hash of the validators for the next block, indicating the upcoming validator set.
consensus_hash (String): The hash of the consensus parameters for the current block.
app_hash (String): The hash representing the state of the application after processing transactions from the previous block.
last_results_hash (String): The root hash of all results from the transactions in the previous block.
proposer_address_raw (String): The encoded blockchain address of the proposer who submitted this block. This data is raw and requires decoding for human readability.
txs ([BlockTransaction]): A list of transactions included in this block, detailing the execution specifics and content.

BlockTransaction

Defines a transaction within a block.

hash (String): The hash computed from the TMHASH algorithm applied to the wire-encoded transaction.
fee (TxFee): Information regarding the fee associated with the transaction, including the amount and denomination.
memo (String): A string field storing additional information within a transaction, often used for distinguishing or identifying specific transactions.
content_raw (String): The raw transaction payload, typically containing the instructions and any data necessary for execution.

Transaction

Represents a transaction.

index (Int): The sequential order of the transaction within its block.
hash (String): Base64 encoded hash of the transaction content.
success (Boolean): Indicates if the transaction succeeded or failed.
block_height (Int): The height of the block containing this transaction.
gas_wanted (Int): The maximum computational effort the sender is willing to pay for.
gas_used (Int): The actual computational effort consumed.
gas_fee (Coin): The fee paid for gas usage, including the coin denomination and amount.
content_raw (String): The raw transaction payload, typically containing instructions and data in an encoded format.
messages ([TransactionMessage]): The messages within the transaction, detailing the operations executed, which may vary depending on the type of message.
memo (String): An optional string field for additional transaction metadata, often used to distinguish transactions.
response (TransactionResponse): The processing result of the transaction, including logs, info, errors, and emitted events.
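
Because the hash field is Base64 encoded, it may need converting before it can be compared with hashes shown elsewhere in hexadecimal form. A minimal sketch using the Python standard library follows; the function name is illustrative, and whether a given tool expects upper- or lower-case hex is not covered here.

```python
import base64


def hash_to_hex(b64_hash: str) -> str:
    """Decode a Base64 transaction hash and return it as lower-case hex."""
    return base64.b64decode(b64_hash, validate=True).hex()


def hex_to_hash(hex_hash: str) -> str:
    """Encode a hex transaction hash back into Base64."""
    return base64.b64encode(bytes.fromhex(hex_hash)).decode("ascii")
```

For the hashes shown in the Basic usage output, the decoded value is 32 bytes, so the hex form is 64 characters long.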

TransactionMessage

Defines the content and type of messages within a transaction.

typeUrl (String): The type URL of the message (send, exec, add_package, run).
route (String): The route of the message (bank, vm).
value (MessageValue): The content of the message, which can be one of the following types:
- BankMsgSend: Used for fund transfers.
- MsgCall: Used for method invocation.
- MsgAddPackage: Used for package deployment.
- MsgRun: Used for executing arbitrary GNO code.
- UnexpectedMessage: A fallback type for undefined or unrecognized messages.
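
Client code typically branches on the __typename field to handle each message type. The sketch below shows one way to do that for a message dictionary as returned by the API. It assumes the query requested the fields used for each type, and the function name is illustrative.

```python
def summarize_message(message: dict) -> str:
    """Return a one-line description of a TransactionMessage dictionary."""
    value = message.get("value") or {}
    kind = value.get("__typename")
    if kind == "BankMsgSend":
        return f"send {value['amount']} from {value['from_address']} to {value['to_address']}"
    if kind == "MsgCall":
        return f"call {value['pkg_path']}.{value['func']} by {value['caller']}"
    if kind == "MsgAddPackage":
        return f"deploy {value['package']['path']} by {value['creator']}"
    if kind == "MsgRun":
        return f"run {value['package']['path']} by {value['caller']}"
    # Covers UnexpectedMessage and any type this client does not know about.
    return f"unrecognized message ({kind})"
```

Keeping a fallback branch means the client keeps working if the indexer returns the UnexpectedMessage type.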

MessageRoute

The valid route types for a transaction.

vm: Executes a function in a realm or package that is deployed on the GNO chain.
bank: Used when sending native tokens.

MessageType

The valid message types for a transaction.

send: A message used for sending native tokens (BankMsgSend).
exec: A message used for executing a function in a realm or package on the GNO chain (MsgCall).
add_package: A message used for deploying a package to the GNO chain (MsgAddPackage).
run: A message used for executing arbitrary GNO code (MsgRun).

BankMsgSend

A message used for fund transfers:

from_address (String): The sender’s address.
to_address (String): The receiver’s address.
amount (String): The amount and denomination of funds sent.
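
Since amount combines a quantity and a denomination in one string, clients usually need to split it before doing arithmetic. The sketch below assumes the string looks like `1000000ugnot`, optionally with several comma-separated coins. That format is an assumption not confirmed by this guide, so check it against real responses. The function name is illustrative.

```python
import re

# Assumed format: digits followed by a denomination, e.g. "1000000ugnot".
_COIN = re.compile(r"(\d+)([A-Za-z][A-Za-z0-9/]*)")


def parse_coins(text: str) -> dict[str, int]:
    """Parse a coin string into a mapping of denomination to total amount."""
    coins: dict[str, int] = {}
    for part in (p.strip() for p in text.split(",")):
        if not part:
            continue  # tolerate empty strings and stray commas
        match = _COIN.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized coin string: {part!r}")
        amount, denom = match.groups()
        coins[denom] = coins.get(denom, 0) + int(amount)
    return coins
```

Raising on unrecognized input, rather than guessing, makes a format mismatch visible early.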

MsgCall

A message used for method invocation on the GNO chain:

caller (String): The address of the function caller.
send (String): The amount of funds sent with the transaction.
pkg_path (String): The GNO package path.
func (String): The name of the function being invoked.
args ([String]): The arguments passed to the function.

MsgAddPackage

A message used for deploying a package to the GNO chain:

creator (String): The address of the package deployer.
package (MemPackage): The package metadata.
deposit (String): The amount of funds deposited during deployment.

MsgRun

A message used for executing arbitrary GNO code:

caller (String): The address of the function caller.
send (String): The amount of funds sent with the transaction.
package (MemPackage): The package being executed.

MemPackage

Metadata information for a package or realm deployment:

name (String): The name of the package.
path (String): The GNO path of the package.
files ([MemFile]): The associated GNO source files.

MemFile

Metadata for a single GNO package or realm file:

name (String): The name of the source file.
body (String): The content of the source file.

TxFee

Information about the gas fee and limit for a transaction:

gas_wanted (Int): The gas limit specified by the user.
gas_fee (Coin): The fee paid for gas usage.

Coin

Defines the quantity and denomination of a coin:

amount (Int): The amount of coins.
denom (String): The denomination of the coin.

TransactionResponse

The processing result of a transaction, including logs, information, errors, and emitted events:

log (String): The execution log.
info (String): Additional info about the execution.
error (String): Error details, if any.
data (String): Response data from the execution.
events ([Event]): The events emitted during the transaction execution.

Event

A union type representing events emitted during transaction execution, which can be either:

GnoEvent: An event emitted by the Gno VM.
UnknownEvent: A fallback type for unrecognized events.

GnoEvent

An event emitted by the Gno VM:

type (String): The type of the event.
pkg_path (String): The package path associated with the event.
func (String): The function name that emitted the event.
attrs ([GnoEventAttribute]): The attributes of the event.

GnoEventAttribute

Attributes of a Gno VM event:

key (String): The key of the attribute.
value (String): The value of the attribute.
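
Event attributes arrive as a list of key/value pairs, which is awkward to look up by key. The sketch below flattens the attributes of each GnoEvent in a TransactionResponse dictionary and filters by event type. The function names are illustrative, and any event type used with them (such as "Transfer") is hypothetical.

```python
def event_attrs(event: dict) -> dict:
    """Flatten a GnoEvent's attrs list into a key-to-value dictionary."""
    return {attr["key"]: attr["value"] for attr in event.get("attrs") or []}


def find_events(response: dict, event_type: str) -> list[dict]:
    """Return flattened attributes for every event of the given type.

    Events without a matching type field, including UnknownEvent entries,
    are skipped.
    """
    return [
        event_attrs(event)
        for event in response.get("events") or []
        if event.get("type") == event_type
    ]
```

If an event repeats a key, the later value wins in the flattened dictionary; keep the original list when duplicates matter.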

UnknownEvent

A type for unrecognized events:

value (String): The raw event string.