# Document Database Masking

<Info>
  For document databases (**MongoDB** and **Elasticsearch**), masking is configured **per-collection / per-index** through the **Catalog** using an `objectSchema`, not the column-based configuration used by relational databases. [Global Masking Rule](/security/data-masking/global-masking-rule) and [Masking Exemption](/security/data-masking/access-unmasked-data) are **not** supported for document databases at this time.
</Info>

[Dynamic Data Masking](/security/data-masking/overview) applies to document databases at the **field level**. When a user queries a collection or index through the [SQL Editor](/sql-editor/overview), fields tagged with a [Semantic Type](/security/data-masking/semantic-types) are masked in the result, while other fields are returned as-is. It works at the query layer — no changes to your application or data are required.

Masking supports:

* Fields in nested objects (e.g. `contact.phone`).
* Elements within arrays.
* Fields in joined collections via `$lookup` and `$graphLookup`.

## Configure field masking

1. Create the [Semantic Types](/security/data-masking/semantic-types) you want to apply. The masking algorithm is determined by the semantic type. For document databases, **Full mask** is the most common choice since it works with all field types (strings, numbers, objects, arrays).

2. Go to **Databases**, select your MongoDB or Elasticsearch database, and open a collection (MongoDB) or index (Elasticsearch).

3. In the collection (MongoDB) / index (Elasticsearch) detail, scroll to the **Catalog** section and edit the `objectSchema` JSON. Assign a semantic type to each field you want masked by setting its `semanticType` to the semantic type's ID (found on the **Data Access** > **Semantic Types** page).

Example `objectSchema` for a `users` collection:

```json
{
  "name": "users",
  "objectSchema": {
    "type": "OBJECT",
    "structKind": {
      "properties": {
        "email": {
          "type": "STRING",
          "semanticType": "<semantic-type-id>"
        },
        "contact": {
          "type": "OBJECT",
          "structKind": {
            "properties": {
              "phone": { "type": "STRING", "semanticType": "<semantic-type-id>" },
              "city": { "type": "STRING" }
            }
          }
        },
        "tags": {
          "type": "ARRAY",
          "arrayKind": {
            "kind": { "type": "STRING", "semanticType": "<semantic-type-id>" }
          }
        },
        "profile": {
          "type": "OBJECT",
          "semanticType": "<semantic-type-id>",
          "structKind": {
            "properties": {
              "ssn": { "type": "STRING" },
              "address": { "type": "STRING" }
            }
          }
        }
      }
    }
  }
}
```

| Field                   | Description                                                                                              |
| :---------------------- | :------------------------------------------------------------------------------------------------------- |
| `name`                  | The collection or index name.                                                                            |
| `type`                  | Field type: `STRING`, `NUMBER`, `BOOLEAN`, `OBJECT`, or `ARRAY`. The root type is always `OBJECT`.       |
| `semanticType`          | The ID of the [Semantic Type](/security/data-masking/semantic-types) to apply. Omit for unmasked fields. |
| `structKind.properties` | A map of field names to their schema, for `OBJECT` types.                                                |
| `arrayKind.kind`        | The schema of array elements, for `ARRAY` types.                                                         |

<Note>
  When you assign a `semanticType` to an **object** field (like `profile` above), the entire subtree is replaced with a single masked value regardless of child field configurations.

  For **arrays**, you can mask at two levels:

  * **Item-level** — set `semanticType` on `arrayKind.kind`. Each element is masked individually (e.g. `["******", "******"]`).
  * **Array-level** — set `semanticType` on the array field itself. The entire array is replaced with a single masked value.
</Note>
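The rules in the note can be modeled in a few lines of Python. This is a minimal sketch of the described behavior, not Bytebase's implementation: `mask_value` and `MASK` are hypothetical names, and the fixed `"******"` string stands in for whatever the assigned semantic type's algorithm would produce.

```python
from typing import Any

MASK = "******"  # assumed placeholder for a Full mask result


def mask_value(value: Any, schema: dict) -> Any:
    """Return `value` with masking applied according to an objectSchema node."""
    # A semanticType on the node itself masks the whole value, so a tagged
    # object or array collapses to a single masked value.
    if schema.get("semanticType"):
        return MASK

    node_type = schema.get("type")
    if node_type == "OBJECT" and isinstance(value, dict):
        props = schema.get("structKind", {}).get("properties", {})
        # Fields that are absent from the schema are returned unchanged.
        return {
            key: mask_value(child, props[key]) if key in props else child
            for key, child in value.items()
        }
    if node_type == "ARRAY" and isinstance(value, list):
        item_schema = schema.get("arrayKind", {}).get("kind", {})
        # Item-level masking: every element is handled individually.
        return [mask_value(item, item_schema) for item in value]
    return value


TAG = "<semantic-type-id>"
schema = {
    "type": "OBJECT",
    "structKind": {"properties": {
        "email": {"type": "STRING", "semanticType": TAG},
        "contact": {"type": "OBJECT", "structKind": {"properties": {
            "phone": {"type": "STRING", "semanticType": TAG},
            "city": {"type": "STRING"},
        }}},
        "tags": {"type": "ARRAY",
                 "arrayKind": {"kind": {"type": "STRING", "semanticType": TAG}}},
        "profile": {"type": "OBJECT", "semanticType": TAG,
                    "structKind": {"properties": {"ssn": {"type": "STRING"}}}},
    }},
}

# Sample document with made-up values.
doc = {
    "name": "Alice",
    "email": "alice@example.com",
    "contact": {"phone": "555-0100", "city": "Berlin"},
    "tags": ["vip", "beta"],
    "profile": {"ssn": "000-00-0000"},
}
masked = mask_value(doc, schema)
```

In this model, `masked["tags"]` is a list of two masked strings (item-level), while `masked["profile"]` is a single masked string because the tag sits on the object itself.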

## Query behavior

Masking is applied automatically to query results in the SQL Editor. Given the schema above, `db.users.find({ name: "Alice" })` returns `email`, `contact.phone`, and every element of `tags` masked, `contact.city` visible, and the whole `profile` object replaced with a single masked value.

To prevent users from inferring masked values through targeted filters, Bytebase **rejects any query that uses a masked field in a filter predicate**:

```
using field "email" tagged by semantic type "<id>" in query predicate is not allowed
```
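To make the rule concrete, the following sketch collects the tagged field paths from an `objectSchema` and rejects a MongoDB-style filter that references one of them. It is an illustrative model only: the helper names are hypothetical, it handles plain field keys and list-valued logical operators such as `$and` / `$or`, it ignores expression operators such as `$expr`, and it assumes a filter on a path nested below a tagged object counts as a reference to that object.

```python
from typing import Dict, Set

# Hypothetical schema fragment in the objectSchema shape shown earlier.
SCHEMA = {
    "type": "OBJECT",
    "structKind": {"properties": {
        "email": {"type": "STRING", "semanticType": "<semantic-type-id>"},
        "contact": {"type": "OBJECT", "structKind": {"properties": {
            "phone": {"type": "STRING", "semanticType": "<semantic-type-id>"},
            "city": {"type": "STRING"},
        }}},
    }},
}


def masked_paths(schema: dict, prefix: str = "") -> Dict[str, str]:
    """Map each dotted field path that carries a semanticType to its ID."""
    paths: Dict[str, str] = {}
    if prefix and schema.get("semanticType"):
        paths[prefix] = schema["semanticType"]
    props = schema.get("structKind", {}).get("properties", {})
    for name, child in props.items():
        paths.update(masked_paths(child, f"{prefix}.{name}" if prefix else name))
    item = schema.get("arrayKind", {}).get("kind")
    if item:
        # Array elements are addressed by the array's own field name.
        paths.update(masked_paths(item, prefix))
    return paths


def filter_fields(filter_doc: dict) -> Set[str]:
    """Collect the field names referenced by a filter document."""
    fields: Set[str] = set()
    for key, value in filter_doc.items():
        if not key.startswith("$"):
            fields.add(key)
        elif isinstance(value, list):
            # Logical operators ($and, $or, $nor) hold a list of sub-filters.
            for sub in value:
                if isinstance(sub, dict):
                    fields |= filter_fields(sub)
    return fields


def check_filter(filter_doc: dict, schema: dict) -> None:
    """Raise ValueError if the filter touches a tagged field."""
    masked = masked_paths(schema)
    for field in sorted(filter_fields(filter_doc)):
        for path, semantic_type in masked.items():
            if field == path or field.startswith(path + "."):
                raise ValueError(
                    f'using field "{path}" tagged by semantic type '
                    f'"{semantic_type}" in query predicate is not allowed'
                )


check_filter({"name": "Alice"}, SCHEMA)          # passes: "name" is untagged
check_filter({"contact.city": "Berlin"}, SCHEMA)  # passes: "contact.city" is untagged
```

Calling `check_filter({"email": "alice@example.com"}, SCHEMA)` raises `ValueError` in this sketch, as does a filter that hides `contact.phone` inside an `$or` clause.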

## Supported operations

Masking only applies to operations that return documents while preserving their shape. Operations that reshape documents, return aggregate counts, or write data are **rejected** when masking is configured on the target collection or index.

### MongoDB

| Operation                                                               | Supported                                     |
| :---------------------------------------------------------------------- | :-------------------------------------------- |
| `find()`, `findOne()`                                                   | Yes                                           |
| `aggregate()`                                                           | Only with shape-preserving stages (see below) |
| `countDocuments()`, `estimatedDocumentCount()`, `count()`, `distinct()` | No — return aggregate values, not documents   |

For `aggregate()`, **shape-preserving** stages such as `$match`, `$sort`, `$limit`, `$skip`, `$unwind`, `$addFields` / `$set`, `$unset`, the basic form of `$lookup` (with `localField` / `foreignField`), and `$graphLookup` are supported. Three kinds of stage are rejected: stages that reshape documents (`$group`, `$project`, `$replaceRoot`, `$count`, `$facet`, `$bucket`), the pipeline form of `$lookup`, and stages that write to another collection (`$out`, `$merge`). A rejected pipeline fails with an error like:

```
MongoDB aggregate() with stage "$group" on collection "users" is not supported for dynamic masking. Supported operations are find(), findOne(), and aggregate() with shape-preserving stages only
```
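If you want to pre-check a pipeline before running it in the SQL Editor, a small client-side helper along these lines may help. It is a sketch under stated assumptions: the allow-list contains only the stages named on this page (the actual supported set may be larger), `first_unsupported_stage` is a hypothetical name, and any `$lookup` that contains a `pipeline` key is treated as the pipeline form.

```python
from typing import List, Optional

# Stages this page names as shape-preserving; the real list may be longer.
SHAPE_PRESERVING = {
    "$match", "$sort", "$limit", "$skip", "$unwind",
    "$addFields", "$set", "$unset", "$lookup", "$graphLookup",
}


def first_unsupported_stage(pipeline: List[dict]) -> Optional[str]:
    """Return the name of the first stage expected to be rejected, or None."""
    for stage in pipeline:
        # Each aggregation stage is a single-key document: {"$stage": spec}.
        name, spec = next(iter(stage.items()))
        if name not in SHAPE_PRESERVING:
            return name
        # Assumption: a "pipeline" key marks the pipeline form of $lookup.
        if name == "$lookup" and isinstance(spec, dict) and "pipeline" in spec:
            return name
    return None


ok_pipeline = [
    {"$match": {"name": "Alice"}},
    {"$lookup": {"from": "orders", "localField": "_id",
                 "foreignField": "userId", "as": "orders"}},
    {"$limit": 10},
]
bad_pipeline = [
    {"$match": {"name": "Alice"}},
    {"$group": {"_id": "$contact.city", "n": {"$sum": 1}}},
]
```

Here `first_unsupported_stage(ok_pipeline)` returns `None`, while `first_unsupported_stage(bad_pipeline)` returns `"$group"`. The `orders` collection and its fields are invented for the example.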

### Elasticsearch

The `_search` / `_msearch` APIs are supported, with masking applied to `_source`, `fields`, `highlight`, `sort`, and `inner_hits`. Single- and multi-document retrieval (`GET /<index>/_doc/<id>`, `GET /<index>/_source/<id>`, `_mget`) and `_explain` are also supported.

The following are **rejected** when masking is configured, because their results can bypass field-level masking:

* **APIs** — `_async_search`, `_search/scroll`, `_search_template`, `_msearch/template`, `_sql`, `_eql/search`, `_esql/query`, `_terms_enum`, `_termvectors`. For OpenSearch: `_plugins/_asynchronous_search`, `_plugins/_sql`, `_plugins/_ppl`.
* **`_search` request-body features** — `aggs` / `aggregations`, `suggest`, `script_fields`, `runtime_mappings`, `stored_fields`, `docvalue_fields`.

```
this Elasticsearch API is not supported when data masking is configured on the target index
```
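The request-body restrictions can be checked the same way before sending a `_search` request. The sketch below only inspects top-level keys of the body and uses the feature names listed above; `rejected_features` is a hypothetical helper, and whether nested sections such as `inner_hits` are subject to the same checks is not covered here.

```python
from typing import List

# _search request-body features this page lists as rejected under masking.
REJECTED_SEARCH_KEYS = {
    "aggs", "aggregations", "suggest", "script_fields",
    "runtime_mappings", "stored_fields", "docvalue_fields",
}


def rejected_features(body: dict) -> List[str]:
    """Return the top-level keys of a _search body expected to be rejected."""
    return sorted(key for key in body if key in REJECTED_SEARCH_KEYS)


plain_search = {"query": {"match": {"city": "Berlin"}}, "size": 10}
agg_search = {
    "query": {"match_all": {}},
    "aggs": {"by_city": {"terms": {"field": "city"}}},
    "suggest": {"s": {"text": "berln", "term": {"field": "city"}}},
}
```

With these inputs, `rejected_features(plain_search)` is an empty list and `rejected_features(agg_search)` is `["aggs", "suggest"]`.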

## Troubleshooting

If masking is not applied to your results, check that:

* The collection or index has an `objectSchema` (not `columns`) configured in its **Catalog**, with the correct field types and `semanticType` values.
* At least one field has a `semanticType` assigned.
* The instance has an **Enterprise** license assigned. Verify on **Settings** > **Subscription** and ensure the license toggle is enabled for the instance under **Instances**.
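
The first two checks can be automated against the JSON you paste into the **Catalog** editor. The following is a rough lint sketch, assuming the entry has the shape shown in the example above; `lint_catalog_entry` and `walk` are hypothetical helpers, and the set of valid types is taken from the field table on this page.

```python
import json
from typing import Iterator, List, Tuple

VALID_TYPES = {"STRING", "NUMBER", "BOOLEAN", "OBJECT", "ARRAY"}


def walk(schema: dict, path: str) -> Iterator[Tuple[str, dict]]:
    """Yield (path, node) for every node in an objectSchema tree."""
    yield path, schema
    props = schema.get("structKind", {}).get("properties", {})
    for name, child in props.items():
        yield from walk(child, f"{path}.{name}" if path else name)
    item = schema.get("arrayKind", {}).get("kind")
    if item is not None:
        yield from walk(item, f"{path}[]")


def lint_catalog_entry(raw_json: str) -> List[str]:
    """Return a list of problems found in a catalog entry; empty if none."""
    entry = json.loads(raw_json)
    schema = entry.get("objectSchema")
    if schema is None:
        return ["no objectSchema configured"]
    problems = []
    if schema.get("type") != "OBJECT":
        problems.append("root type must be OBJECT")
    nodes = list(walk(schema, ""))
    for path, node in nodes:
        if node.get("type") not in VALID_TYPES:
            problems.append(f"{path or '<root>'}: unknown type {node.get('type')!r}")
    if not any(node.get("semanticType") for _, node in nodes):
        problems.append("no field has a semanticType assigned")
    return problems


untagged = json.dumps({
    "name": "users",
    "objectSchema": {
        "type": "OBJECT",
        "structKind": {"properties": {"email": {"type": "STRING"}}},
    },
})
print(lint_catalog_entry(untagged))
```

For the `untagged` entry, the only reported problem is that no field has a `semanticType` assigned. The license check in the last bullet still has to be verified in the Bytebase UI.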
