# defineData

> Define data type schemas for your domain

The `defineData` function creates and validates data type schema definitions. Each data type is defined in its own file under the `entity-types/` directory.

```typescript
import { defineData } from 'struere'

export default defineData({
  name: "Teacher",
  slug: "teacher",
  schema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Full name" },
      email: { type: "string", format: "email", description: "Email address" },
      subjects: {
        type: "array",
        items: { type: "string" },
        description: "Subjects they can teach",
      },
      hourlyRate: { type: "number", description: "Rate per hour in cents" },
    },
    required: ["name", "email"],
  },
  searchFields: ["name", "email"],
  displayConfig: {
    titleField: "name",
    subtitleField: "email",
  },
})
```

## EntityTypeConfig

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | `string` | Yes | Display name for the data type |
| `slug` | `string` | Yes | URL-safe identifier, used in API queries and tool calls |
| `schema` | `JSONSchema` | Yes | JSON Schema definition for the data |
| `searchFields` | `string[]` | No | Fields indexed for text search (defaults to `[]`) |
| `displayConfig` | `object` | No | Controls how records are displayed in the dashboard |
| `boundToRole` | `string` | No | Binds this data type to a role name for automatic user linking |
| `userIdField` | `string` | No | Field that stores the Clerk user ID (defaults to `"userId"` when `boundToRole` is set) |

### Validation

`defineData` throws errors if:

- `name`, `slug`, or `schema` is missing
- `schema.type` is not `"object"`
- A nested property with `type: "object"` does not declare `properties`
- `boundToRole` is an empty string
- `userIdField` is set without `boundToRole`
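
The rules above can be read as a short sequence of checks. The sketch below is not the SDK's implementation: `validateConfig`, `checkNested`, the local types, and the error messages are all hypothetical and only mirror the listed conditions.

```typescript
// Hypothetical local types that mirror the documented config shape.
interface SketchProperty {
  type: 'string' | 'number' | 'boolean' | 'array' | 'object'
  properties?: Record<string, SketchProperty>
}

interface SketchConfig {
  name?: string
  slug?: string
  schema?: { type: string; properties?: Record<string, SketchProperty> }
  boundToRole?: string
  userIdField?: string
}

// Recursively checks that every nested object property declares `properties`.
function checkNested(props: Record<string, SketchProperty>, path: string): void {
  for (const [key, prop] of Object.entries(props)) {
    if (prop.type !== 'object') continue
    const here = `${path}.${key}`
    if (!prop.properties) throw new Error(`${here} is missing properties`)
    checkNested(prop.properties, here)
  }
}

// Applies the documented rules in order; the messages are illustrative only.
function validateConfig(config: SketchConfig): void {
  if (!config.name) throw new Error('name is required')
  if (!config.slug) throw new Error('slug is required')
  if (!config.schema) throw new Error('schema is required')
  if (config.schema.type !== 'object') throw new Error('schema.type must be "object"')
  checkNested(config.schema.properties ?? {}, 'schema')
  if (config.boundToRole === '') throw new Error('boundToRole must not be empty')
  if (config.userIdField !== undefined && config.boundToRole === undefined) {
    throw new Error('userIdField requires boundToRole')
  }
}
```

Running the checks in a fixed order means the first failing rule decides which error is reported; whether the SDK uses the same order is not stated here.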

## JSON Schema

Data type schemas use the JSON Schema format with the following type system:

```typescript
interface JSONSchema {
  type: 'object'
  properties: Record<string, JSONSchemaProperty>
  required?: string[]
}

interface JSONSchemaProperty {
  type: 'string' | 'number' | 'boolean' | 'array' | 'object'
  description?: string
  format?: string
  enum?: string[]
  items?: JSONSchemaProperty
  properties?: Record<string, JSONSchemaProperty>
  required?: string[]
  references?: string
}
```

The root schema must always be `type: "object"`. Nested objects must declare their `properties`.

### Supported Property Types

**String fields:**

```typescript
{
  name: { type: "string", description: "Full name" },
  email: { type: "string", format: "email" },
  status: {
    type: "string",
    enum: ["active", "inactive", "suspended"],
  },
}
```

**Number fields:**

```typescript
{
  hourlyRate: { type: "number", description: "Rate in cents" },
  age: { type: "number" },
}
```

**Boolean fields:**

```typescript
{
  isActive: { type: "boolean", description: "Whether the record is active" },
}
```

**Array fields:**

```typescript
{
  subjects: {
    type: "array",
    items: { type: "string" },
    description: "List of subjects",
  },
  tags: {
    type: "array",
    items: { type: "string", enum: ["math", "science", "english"] },
  },
}
```

**Nested object fields:**

```typescript
{
  address: {
    type: "object",
    properties: {
      street: { type: "string" },
      city: { type: "string" },
      postalCode: { type: "string" },
    },
    required: ["street", "city"],
  },
}
```

## References

Fields with `references` enforce foreign key constraints. When an entity is created or updated, any field with `references` is validated to ensure the referenced entity exists.

```typescript
import { defineData } from 'struere'

export default defineData({
  name: "Session",
  slug: "session",
  schema: {
    type: "object",
    properties: {
      studentId: { type: "string", references: "student" },
      teacherId: { type: "string", references: "teacher" },
      startTime: { type: "number" },
      duration: { type: "number" },
      subject: { type: "string" },
    },
    required: ["studentId", "teacherId", "startTime", "duration"],
  },
})
```

When an agent calls `entity.create` or `entity.update` with a `studentId` or `teacherId` value, the platform validates that:

- The referenced entity exists
- The referenced entity is not deleted
- The referenced entity belongs to the same organization and environment
- The referenced entity is of the correct entity type (e.g., `studentId` must reference a `student` entity)

If any validation fails, the operation throws an error identifying the invalid reference field.
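
To make the four checks concrete, here is a minimal in-memory sketch. It is an illustration, not the platform's code: `StoredEntity`, `checkReference`, the ID values, and the `production` environment name are assumptions. It returns a message instead of throwing so the outcome is easy to inspect.

```typescript
// Hypothetical stand-in for a stored entity.
interface StoredEntity {
  id: string
  type: string // slug of the data type, e.g. "student"
  organizationId: string
  environment: string
  deleted: boolean
}

interface Scope {
  organizationId: string
  environment: string
}

// Returns null when the reference is valid, otherwise a message naming the field.
function checkReference(
  store: Map<string, StoredEntity>,
  field: string,
  value: string,
  expectedType: string,
  scope: Scope,
): string | null {
  const target = store.get(value)
  if (!target) return `${field}: referenced entity does not exist`
  if (target.deleted) return `${field}: referenced entity is deleted`
  if (target.organizationId !== scope.organizationId || target.environment !== scope.environment) {
    return `${field}: referenced entity is outside this organization or environment`
  }
  if (target.type !== expectedType) return `${field}: expected a ${expectedType} entity`
  return null
}

const store = new Map<string, StoredEntity>([
  ['stu_1', { id: 'stu_1', type: 'student', organizationId: 'org_1', environment: 'production', deleted: false }],
  ['tch_1', { id: 'tch_1', type: 'teacher', organizationId: 'org_1', environment: 'production', deleted: false }],
])
const scope: Scope = { organizationId: 'org_1', environment: 'production' }

// A student ID in `studentId` passes; a teacher ID in the same field does not.
const validReference = checkReference(store, 'studentId', 'stu_1', 'student', scope)
const wrongTypeReference = checkReference(store, 'studentId', 'tch_1', 'student', scope)
```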

## Display Configuration

The `displayConfig` field controls how records appear in the dashboard:

```typescript
displayConfig: {
  titleField: "name",
  subtitleField: "email",
  descriptionField: "notes",
}
```

| Field | Description |
|-------|-------------|
| `titleField` | Primary display field (shown as heading) |
| `subtitleField` | Secondary display field (shown below title) |
| `descriptionField` | Extended description field |

## Search Fields

The `searchFields` array specifies which fields are indexed for text search via `entity.query`:

```typescript
searchFields: ["name", "email", "phone"]
```

When an agent uses `entity.query` with a search term, only these fields are matched.
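
The effect of limiting search to `searchFields` can be sketched in a few lines. `searchRecords` and the sample records are hypothetical, and case-insensitive substring matching is an assumption of this sketch, not a documented behavior of `entity.query`.

```typescript
type SearchableRecord = Record<string, unknown>

// Keeps records where at least one listed search field contains the term.
function searchRecords(
  records: SearchableRecord[],
  searchFields: string[],
  term: string,
): SearchableRecord[] {
  const needle = term.toLowerCase()
  return records.filter((record) =>
    searchFields.some((field) => {
      const value = record[field]
      return typeof value === 'string' && value.toLowerCase().includes(needle)
    }),
  )
}

const teacherRecords: SearchableRecord[] = [
  { name: 'Ada Lovelace', email: 'ada@example.com', notes: 'Prefers mornings' },
  { name: 'Alan Turing', email: 'alan@example.com', notes: 'Knows Ada well' },
]

// `notes` is not a search field, so the second record does not match "ada".
const searchMatches = searchRecords(teacherRecords, ['name', 'email'], 'ada')
```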

## Role Binding

The `boundToRole` and `userIdField` fields create an automatic link between a data type and a user role. When a user with the bound role logs in, they are associated with the matching record:

```typescript
export default defineData({
  name: "Teacher",
  slug: "teacher",
  schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      email: { type: "string", format: "email" },
      userId: { type: "string", description: "Clerk user ID" },
    },
    required: ["name", "email"],
  },
  boundToRole: "teacher",
  userIdField: "userId",
})
```

When `boundToRole` is set and `userIdField` is omitted, it defaults to `"userId"`.
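
The default can be expressed as a small resolver. The sketch below is illustrative: `resolveUserIdField` and `findBoundRecord` are hypothetical helpers, and matching a record by comparing its user ID field to the logged-in user's ID is an assumption about how the link is made.

```typescript
interface RoleBinding {
  boundToRole?: string
  userIdField?: string
}

type BoundRecord = Record<string, unknown>

// Resolves the effective user ID field: undefined without a role binding,
// otherwise the configured field or the documented default "userId".
function resolveUserIdField(config: RoleBinding): string | undefined {
  if (config.boundToRole === undefined) return undefined
  return config.userIdField ?? 'userId'
}

// Assumption: the matching record is the one whose user ID field equals the user's ID.
function findBoundRecord(
  records: BoundRecord[],
  config: RoleBinding,
  clerkUserId: string,
): BoundRecord | undefined {
  const field = resolveUserIdField(config)
  if (field === undefined) return undefined
  return records.find((record) => record[field] === clerkUserId)
}

const boundTeachers: BoundRecord[] = [
  { name: 'Ada', userId: 'user_abc' },
  { name: 'Alan', userId: 'user_def' },
]

// `userIdField` is omitted, so the lookup falls back to "userId".
const linkedRecord = findBoundRecord(boundTeachers, { boundToRole: 'teacher' }, 'user_def')
```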

## Filtering Entities

Entity queries (via the Data API, `entity.query` tool, or the JS client) accept a `filters` object. Two rules govern which fields are filterable:

- **Top-level fields are restricted to indexed columns.** The Data API only accepts top-level filter fields that are indexed on the entity table (e.g., `matchId` when present, plus a small set of system fields). Filtering on a non-indexed top-level field returns `400 Bad Request` along with the list of fields that are queryable.
- **Use `data.<fieldName>` for everything in the JSON payload.** Domain fields declared in your `schema.properties` live inside the entity's `data` blob. Reference them as `data.status`, `data.teamId`, `data.guardianId`, etc. These filters apply in-memory after the indexed scan.

```typescript
await struere.data.query('session', {
  filters: {
    'data.status': 'scheduled',
    'data.teacherId': 'usr_123',
  },
})
```

### Foot-gun: top-level `status` is the entity lifecycle column

The top-level `status` field on an entity is the platform-managed lifecycle column (`active`, `archived`, `deleted`). If your data type also defines a domain `status` field (e.g., `pending`, `confirmed`, `paid`), they are not the same column.

A top-level `status` filter is **rejected** with `400 Bad Request`. The error tells you to use `data.status` instead:

```typescript
await struere.data.query('session', {
  filters: { 'data.status': 'scheduled' },
})
```

To filter on the lifecycle column itself, pass `status` as a top-level query option (alongside `filters`, `limit`, and `cursor`), not as a key inside `filters`.

### Foot-gun: filtering on a non-indexed top-level field

If you write `filters: { teacherId: 'usr_123' }` and `teacherId` is a domain field stored in the JSON payload, the API returns `400 Bad Request` listing the indexed fields it accepts. The fix is the same as above: prefix the field with `data.`.

```typescript
await struere.data.query('session', {
  filters: { 'data.teacherId': 'usr_123' },
})
```
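
One way to avoid both foot-guns is to build the `filters` object through a small helper that always applies the prefix. `dataFilters` and the `FilterValue` type below are hypothetical, not part of the SDK.

```typescript
// Assumed value types for this sketch; the API may accept others.
type FilterValue = string | number | boolean

// Prefixes every domain field name with "data." so it targets the JSON payload.
// Keys that already carry the prefix are left unchanged.
function dataFilters(fields: Record<string, FilterValue>): Record<string, FilterValue> {
  const prefixed: Record<string, FilterValue> = {}
  for (const [key, value] of Object.entries(fields)) {
    const name = key.startsWith('data.') ? key : `data.${key}`
    prefixed[name] = value
  }
  return prefixed
}

const sessionFilters = dataFilters({ status: 'scheduled', 'data.teacherId': 'usr_123' })
// sessionFilters is { 'data.status': 'scheduled', 'data.teacherId': 'usr_123' }
```

The result can be passed as the `filters` option of a query. The helper should only be used for domain fields, since it would also prefix an indexed top-level field such as `matchId`.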

## Unsupported JSON Schema Features

`defineData` accepts a deliberate subset of JSON Schema. The fields shown in the type definition above (`type`, `description`, `format`, `enum`, `items`, `properties`, `required`, `references`) are the only ones recognised. Any other keyword is either rejected by the SDK's type definitions or silently dropped.

- `additionalProperties` — schemas are closed. To persist a `Record<id, X>` shape, reshape it into an array of records with an `id` field.
- `oneOf` / `anyOf` / `allOf` — model variants as a discriminated union via an `enum` field, then branch on that field in your code.
- `if` / `then` / `else` — conditional shapes are not supported. Express conditional logic in the application layer or split into separate entity types.
- `$ref` / `definitions` — schemas are inlined. Copy shared shapes into each entity type that needs them.
- `pattern`, `minimum`, `maximum`, `minLength`, `maxLength`, `multipleOf`, etc. — only `enum` and `format` constrain values at the schema level. Enforce additional validation in your code or via tools.
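
As an illustration of the `additionalProperties` workaround, the sketch below reshapes a map keyed by ID into an array of records and shows a schema fragment that fits the supported subset. `Score`, `toRecordArray`, and the field names are hypothetical.

```typescript
interface Score {
  points: number
}

// A map keyed by ID: this shape would need `additionalProperties` to describe.
const scoresById: Record<string, Score> = {
  stu_1: { points: 12 },
  stu_2: { points: 7 },
}

// Reshapes the map into an array where each record carries its own `id`.
function toRecordArray<T extends object>(byId: Record<string, T>): Array<T & { id: string }> {
  return Object.entries(byId).map(([id, value]) => ({ ...value, id }))
}

const scores = toRecordArray(scoresById)

// Property definition for the reshaped data, using only supported keywords.
const scoresProperty = {
  type: 'array',
  items: {
    type: 'object',
    properties: {
      id: { type: 'string' },
      points: { type: 'number' },
    },
    required: ['id', 'points'],
  },
}
```

Code that needs lookup by ID can rebuild the map after reading, for example with `new Map(scores.map((s) => [s.id, s]))`.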

## Full Examples

### Student Data Type

```typescript
import { defineData } from 'struere'

export default defineData({
  name: "Student",
  slug: "student",
  schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      grade: { type: "string" },
      subjects: {
        type: "array",
        items: { type: "string" },
      },
      notes: { type: "string" },
      guardianId: { type: "string" },
      preferredTeacherId: { type: "string" },
    },
    required: ["name"],
  },
  searchFields: ["name"],
  displayConfig: {
    titleField: "name",
    subtitleField: "grade",
  },
})
```

### Session Data Type

```typescript
import { defineData } from 'struere'

export default defineData({
  name: "Session",
  slug: "session",
  schema: {
    type: "object",
    properties: {
      teacherId: { type: "string" },
      studentId: { type: "string" },
      guardianId: { type: "string" },
      startTime: { type: "number", description: "Unix timestamp" },
      duration: { type: "number", description: "Duration in minutes" },
      subject: { type: "string" },
      status: {
        type: "string",
        enum: [
          "pending_payment",
          "scheduled",
          "in_progress",
          "completed",
          "cancelled",
          "no_show",
        ],
      },
      notes: { type: "string" },
      teacherReport: { type: "string" },
    },
    required: ["teacherId", "studentId", "guardianId", "startTime", "duration"],
  },
  searchFields: ["subject"],
  displayConfig: {
    titleField: "subject",
    subtitleField: "status",
  },
})
```

### Entitlement Data Type (Credits System)

```typescript
import { defineData } from 'struere'

export default defineData({
  name: "Entitlement",
  slug: "entitlement",
  schema: {
    type: "object",
    properties: {
      guardianId: { type: "string" },
      studentId: { type: "string" },
      totalCredits: { type: "number" },
      remainingCredits: { type: "number" },
      expiresAt: { type: "number", description: "Unix timestamp" },
    },
    required: ["guardianId", "studentId", "totalCredits", "remainingCredits"],
  },
  displayConfig: {
    titleField: "remainingCredits",
    subtitleField: "totalCredits",
  },
})
```
