Admin Tools — Health, Failure Trend, Job Detail

Three observability endpoints powering the admin notification operations view. Health = subsystem checks (stuck jobs, queue depth, rollup freshness, dead-job watchdog). Failure trend = hourly fail counts for the last N days. Job detail = full row for one NotificationJob.

The endpoints are read-only. Wave C4 shipped job detail and the failure-trend page wiring on top of the pre-existing health and rollup infrastructure.

All three require the class-level notification:read permission; there is no per-endpoint permission override. They share the throttle used by the rest of the admin notification surface (200 requests per minute).

GET /admin/notifications/health — subsystem health

Returns aggregated notification health. Four checks, each { ok: boolean; detail: string }. The top-level healthy is true only when all checks pass.

Response — ApiResponseOf<NotificationHealthResponseDto>

{
  "success": true,
  "data": {
    "healthy": true,
    "checks": {
      "stuckJobs":      { "ok": true, "detail": "0 stuck jobs" },
      "queueDepth":     { "ok": true, "detail": "42 pending" },
      "rollupFreshness":{ "ok": true, "detail": "3min ago" },
      "recentDeadJobs": { "ok": true, "detail": "0 dead in last hour" }
    }
  }
}
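
For a dashboard that wants to show why `healthy` is false, the response can be reduced to the list of failing checks. The following is a minimal client-side sketch, assuming only the response shape shown above; `HealthData` and `failingChecks` are hypothetical names, not part of the API.

```typescript
// Hypothetical client-side types mirroring the documented `data` object.
interface HealthCheck {
  ok: boolean;
  detail: string;
}

interface HealthData {
  healthy: boolean;
  checks: { [name: string]: HealthCheck };
}

// Returns "name: detail" for every check whose `ok` flag is false.
// An empty array corresponds to `healthy === true`.
function failingChecks(data: HealthData): string[] {
  const failing: string[] = [];
  for (const name of Object.keys(data.checks)) {
    const check = data.checks[name];
    if (check && !check.ok) {
      failing.push(`${name}: ${check.detail}`);
    }
  }
  return failing;
}
```

Iterating over the keys rather than hard-coding the four check names keeps the helper working if further checks are added later.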

Check semantics

| Check | Source | Pass condition |
| --- | --- | --- |
| stuckJobs | NotificationJob where status='PROCESSING' and startedAt < now() - 10min | count === 0 (a stuck job indicates the worker died mid-claim) |
| queueDepth | NotificationJob where status='PENDING' | count < 10000 (alerting threshold for backlog) |
| rollupFreshness | NotificationEventRollup ORDER BY updatedAt DESC LIMIT 1 | last row updatedAt > now() - 15min (rollup cron healthy) |
| recentDeadJobs | NotificationJob where status='DEAD' and updatedAt > now() - 1h | count === 0 (any DEAD job in the last hour is a regression signal) |
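
The pass conditions can be expressed as a pure function over the four query results. This is a sketch under stated assumptions, not the service's actual code: `HealthInputs` and `evaluateHealth` are hypothetical names, the `detail` strings only imitate the sample response, and treating a missing rollup row as a failed check is an assumption.

```typescript
// Hypothetical input: the raw numbers the four read-only queries would yield.
interface HealthInputs {
  stuckJobCount: number; // PROCESSING jobs with startedAt older than 10 minutes
  pendingCount: number; // PENDING jobs
  lastRollupUpdatedAt: Date | null; // newest NotificationEventRollup.updatedAt
  deadLastHourCount: number; // DEAD jobs updated within the last hour
}

interface Check {
  ok: boolean;
  detail: string;
}

interface HealthResult {
  healthy: boolean;
  checks: {
    stuckJobs: Check;
    queueDepth: Check;
    rollupFreshness: Check;
    recentDeadJobs: Check;
  };
}

const QUEUE_DEPTH_LIMIT = 10000; // from the table: count < 10000
const ROLLUP_MAX_AGE_MS = 15 * 60 * 1000; // from the table: newer than 15 minutes

function evaluateHealth(input: HealthInputs, now: Date = new Date()): HealthResult {
  // Assumption: no rollup row at all counts as stale.
  const rollupAgeMs =
    input.lastRollupUpdatedAt === null
      ? Infinity
      : now.getTime() - input.lastRollupUpdatedAt.getTime();

  const checks = {
    stuckJobs: {
      ok: input.stuckJobCount === 0,
      detail: `${input.stuckJobCount} stuck jobs`,
    },
    queueDepth: {
      ok: input.pendingCount < QUEUE_DEPTH_LIMIT,
      detail: `${input.pendingCount} pending`,
    },
    rollupFreshness: {
      ok: rollupAgeMs < ROLLUP_MAX_AGE_MS,
      detail: isFinite(rollupAgeMs)
        ? `${Math.floor(rollupAgeMs / 60000)}min ago`
        : "no rollup rows",
    },
    recentDeadJobs: {
      ok: input.deadLastHourCount === 0,
      detail: `${input.deadLastHourCount} dead in last hour`,
    },
  };

  // Top-level flag is true only when every check passes.
  const healthy =
    checks.stuckJobs.ok &&
    checks.queueDepth.ok &&
    checks.rollupFreshness.ok &&
    checks.recentDeadJobs.ok;

  return { healthy, checks };
}
```

Passing `now` as a parameter keeps the thresholds testable without mocking the clock.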

Errors

| HTTP | Reason |
| --- | --- |
| 401 | Missing / invalid admin session |
| 403 | Permission denied (notification:read required) |

Side effects

None. Four read-only count / findFirst queries.

GET /admin/notifications/failure-trend — hourly fails

Returns failure counts grouped by hour for the last N days (1 ≤ N ≤ 30, default 7). Data comes from NotificationEventRollup, which an hourly cron pre-aggregates, so the query cost scales with the window size (at most 30 days of hourly buckets) rather than with raw event volume.

Query

| Param | Type | Validation | Notes |
| --- | --- | --- | --- |
| days | string (parsed int) | `parseInt(days)…` | Range 1 to 30, default 7 |
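
One way the `days` parameter could be resolved is sketched below. Only the range (1 to 30) and the default (7) come from this page; the helper name `resolveDays` is hypothetical, and the choice to clamp out-of-range values rather than reject them is an assumption, since the exact validation expression is not shown here.

```typescript
const MIN_DAYS = 1;
const MAX_DAYS = 30;
const DEFAULT_DAYS = 7;

// Hypothetical helper: turns the raw query-string value into a usable window.
// Assumption: unparsable input falls back to the default, and out-of-range
// input is clamped into [MIN_DAYS, MAX_DAYS] instead of producing an error.
function resolveDays(raw: string | undefined): number {
  const parsed = parseInt(raw === undefined ? "" : raw, 10);
  if (isNaN(parsed)) {
    return DEFAULT_DAYS;
  }
  return Math.min(MAX_DAYS, Math.max(MIN_DAYS, parsed));
}
```

For example, `resolveDays("90")` yields 30 and `resolveDays(undefined)` yields 7 under these assumptions.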

Response — ArrayApiResponseOf<FailureTrendItemDto>

{
  "success": true,
  "data": [
    { "hour": "2026-06-22T14:00:00.000Z", "count": 3 },
    { "hour": "2026-06-22T15:00:00.000Z", "count": 0 },
    { "hour": "2026-06-22T16:00:00.000Z", "count": 1 }
  ]
}

| Field | Type | Notes |
| --- | --- | --- |
| hour | string (ISO 8601) | Hour bucket from NotificationEventRollup.hourBucket (UTC, truncated to the hour) |
| count | number | Sum across failed, bounced, dead statuses in that hour |
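
The per-hour sum described for `count` can be sketched as a small reduction. The row shape `RollupRow` is hypothetical (the real NotificationEventRollup columns are not listed on this page), as are the lowercase status values; the sketch also does not zero-fill hours that have no rows.

```typescript
// Hypothetical rollup row: one row per hour bucket and status.
interface RollupRow {
  hourBucket: Date;
  status: string;
  count: number;
}

interface FailureTrendPoint {
  hour: string;
  count: number;
}

// Statuses that count as failures, per the field notes above.
const FAILURE_STATUSES = ["failed", "bounced", "dead"];

function toFailureTrend(rows: RollupRow[]): FailureTrendPoint[] {
  const totals: { [hour: string]: number } = {};
  for (const row of rows) {
    if (FAILURE_STATUSES.indexOf(row.status) === -1) {
      continue; // ignore non-failure statuses
    }
    const hour = row.hourBucket.toISOString();
    totals[hour] = (totals[hour] || 0) + row.count;
  }
  // ISO 8601 UTC strings sort chronologically when sorted as text.
  return Object.keys(totals)
    .sort()
    .map((hour) => ({ hour, count: totals[hour] || 0 }));
}
```

Keying the totals by the ISO string keeps the output in the same format as the `hour` field of the response.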

Errors

| HTTP | Reason |
| --- | --- |
| 401 | Missing / invalid admin session |
| 403 | Permission denied (notification:read required) |

Side effects

None. Single NotificationEventRollup.findMany over the rolling window.

GET /admin/notifications/jobs/:id — single job detail

Returns the full NotificationJob row for one job ID. Used by the admin queue table's row-detail drill-down.

Path parameters

| Param | Type | Notes |
| --- | --- | --- |
| id | string (UUID) | NotificationJob.id. Validated via ParseUUIDPipe. |

Response — ApiResponseOf<NotificationJobItemDto>

{
  "success": true,
  "data": {
    "id": "nj-uuid",
    "queue": "notification:email",
    "jobName": "send-email",
    "channel": "EMAIL",
    "status": "COMPLETED",
    "attempts": 1,
    "maxAttempts": 3,
    "lastError": null,
    "scheduledAt": "2026-06-22T14:00:00.000Z",
    "startedAt": "2026-06-22T14:00:00.500Z",
    "completedAt": "2026-06-22T14:00:01.200Z",
    "createdAt": "2026-06-22T13:59:59.000Z"
  }
}

The endpoint returns the full Prisma row, so the raw response also carries payload (job parameters), result (response from provider), latencyMs (Wave E Batch 2 observability), costUsd (reserved, always null), and dedupeKey. The example above shows only the typed surface declared by NotificationJobItemDto. See Observability for latencyMs / costUsd semantics.

Errors

| HTTP | Reason / i18nKey |
| --- | --- |
| 401 | Missing / invalid admin session |
| 403 | Permission denied (notification:read required) |
| 404 | error.notification.job_not_found: no NotificationJob row with the supplied ID |
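
A caller can branch on the HTTP status before reading `data`. The sketch below maps the documented statuses to outcomes; `JobDetailOutcome` and `classifyJobDetailStatus` are hypothetical names, and the error body format is deliberately not inspected because it is not documented here. A malformed ID is presumably rejected by ParseUUIDPipe before any lookup (NestJS defaults that pipe to a 400 unless configured otherwise), which lands in the `unexpected` branch.

```typescript
// Hypothetical outcome labels for the job-detail lookup.
type JobDetailOutcome =
  | "ok"
  | "unauthenticated" // 401: missing / invalid admin session
  | "forbidden" // 403: notification:read required
  | "not_found" // 404: error.notification.job_not_found
  | "unexpected"; // anything not listed in the error table

function classifyJobDetailStatus(status: number): JobDetailOutcome {
  if (status >= 200 && status < 300) {
    return "ok";
  }
  switch (status) {
    case 401:
      return "unauthenticated";
    case 403:
      return "forbidden";
    case 404:
      return "not_found";
    default:
      return "unexpected";
  }
}
```

In a UI this lets a deep link to a deleted job render a "job not found" state instead of a generic failure, while a 401 can redirect to the login flow.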

Side effects

None. Single notificationJob.findUnique.

Code samples

# Health snapshot — auto-refresh every 30s in the admin UI
curl 'https://api.bio.re/api/v1/admin/notifications/health' \
  -H 'Cookie: admin_session=...'

# Failure trend for the last 30 days
curl 'https://api.bio.re/api/v1/admin/notifications/failure-trend?days=30' \
  -H 'Cookie: admin_session=...'

# Single job detail (deep-link from queue table)
curl 'https://api.bio.re/api/v1/admin/notifications/jobs/nj-uuid' \
  -H 'Cookie: admin_session=...'

type HealthResponse = {
  healthy: boolean;
  checks: Record<string, { ok: boolean; detail: string }>;
};

type FailureTrendItem = { hour: string; count: number };

type JobDetail = {
  id: string;
  queue: string;
  jobName: string;
  channel: string;
  status: string;
  attempts: number;
  maxAttempts: number;
  lastError: string | null;
  scheduledAt: string | null;
  startedAt: string | null;
  completedAt: string | null;
  createdAt: string;
  // Wave E Batch 2 observability columns
  latencyMs: number | null;
  costUsd: string | null;
};

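// Shared helper: sends the admin session cookie and unwraps the `data` envelope.
// Throws on non-2xx so callers never receive `undefined` typed as a valid result.
async function getJson<T>(url: string): Promise<T> {
  const res = await fetch(url, { credentials: 'include' });
  if (!res.ok) {
    throw new Error(`GET ${url} failed with HTTP ${res.status}`);
  }
  return (await res.json()).data as T;
}

function getHealth(): Promise<HealthResponse> {
  return getJson<HealthResponse>('/api/v1/admin/notifications/health');
}

function getFailureTrend(days = 7): Promise<FailureTrendItem[]> {
  return getJson<FailureTrendItem[]>(`/api/v1/admin/notifications/failure-trend?days=${days}`);
}

function getJob(id: string): Promise<JobDetail> {
  // Encode the ID so an unexpected character cannot alter the request path.
  return getJson<JobDetail>(`/api/v1/admin/notifications/jobs/${encodeURIComponent(id)}`);
}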

Source

| Source | Path | Lines |
| --- | --- | --- |
| Controller (GET /health) | apps/api-core/src/modules/notification/admin-notification.controller.ts | 185–192 |
| Controller (GET /failure-trend) | apps/api-core/src/modules/notification/admin-notification.controller.ts | 225–232 |
| Controller (GET /jobs/:id) | apps/api-core/src/modules/notification/admin-notification.controller.ts | 367–375 |
| Response DTO (health) | apps/api-core/src/modules/notification/dto/admin-notification-response.dto.ts | 9–15 (NotificationHealthResponseDto) |
| Response DTO (failure-trend item) | apps/api-core/src/modules/notification/dto/admin-notification-response.dto.ts | 87–93 (FailureTrendItemDto) |
| Response DTO (job item) | apps/api-core/src/modules/notification/dto/admin-notification-response.dto.ts | 125–161 (NotificationJobItemDto) |
| Service (getHealth) | apps/api-core/src/modules/notification/admin-notification.service.ts | 181–224 |
| Service (getFailureTrend) | apps/api-core/src/modules/notification/admin-notification.service.ts | 290–310 |
| Service (getJobDetail) | apps/api-core/src/modules/notification/admin-notification.service.ts | 641–645 |
| Prisma model | packages/prisma/prisma/schema.prisma | NotificationJob (2203), NotificationEventRollup (1805) |
| Live response | NOT verified; sourced from DTOs + service shapes cited above | |
