Data Extraction API

Demeterics stores all your LLM interactions in BigQuery and provides APIs to extract and analyze this data programmatically. This guide covers how to export interaction data, query usage metrics, and integrate with your analytics pipeline.


Overview

Your interaction data is stored in BigQuery and can be extracted via:

  1. Export API (POST /api/v1/exports) - Bulk export to JSON, CSV, or Avro
  2. Stream API (GET /api/v1/exports/{request_id}/stream) - Stream large datasets
  3. Dashboard UI - Download exports directly from the web interface

All exports are scoped to your user account and respect data retention policies.


Authentication

All export endpoints require authentication via your Demeterics API key:

Authorization: Bearer dmt_your_api_key

Your API key must have the export scope enabled. To check or update scopes, visit Settings → API Keys.
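
If you want to confirm a key before wiring it into a pipeline, a small request is enough: a 401 response means the key itself is invalid, while a 403 means it is valid but missing the export scope (see Troubleshooting below). A minimal sketch using the Python requests library, with an illustrative one-day export:

import requests

API_KEY = "dmt_your_api_key"

# Minimal one-day export used only to verify the key and its export scope:
# 401 = invalid key, 403 = missing export scope.
resp = requests.post(
    "https://api.demeterics.com/api/v1/exports",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"format": "csv", "start_date": "2025-11-01", "end_date": "2025-11-01"},
)
print(resp.status_code)  # a 2xx status means the key and scope are good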


Export API

POST /api/v1/exports

Create a new data export job. The call returns immediately with a request ID that you use to stream the exported data.

Request Body

Field      | Type    | Required | Description
-----------|---------|----------|------------
format     | string  | No       | Output format: json, csv, or avro. Default: csv
start_date | string  | No       | Start date filter (ISO 8601: YYYY-MM-DD)
end_date   | string  | No       | End date filter (ISO 8601: YYYY-MM-DD)
tables     | array   | No       | Tables to export: interactions, eval_runs, eval_results. Default: all
use_gcs    | boolean | No       | Export to GCS bucket instead of streaming. Default: false
gcs_bucket | string  | No       | Target GCS bucket (required if use_gcs is true)

Example: Export last 30 days as JSON

curl -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "json",
    "start_date": "2025-11-01",
    "end_date": "2025-11-30",
    "tables": ["interactions"]
  }'

Response

{
  "status": "ok",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "row_count": 1542,
  "bytes_size": 2048576,
  "message": "Export ready for streaming"
}

GET /api/v1/exports/{request_id}/stream

Stream the exported data. Use the request_id from the export response.

Example: Stream as CSV

curl -X GET "https://api.demeterics.com/api/v1/exports/550e8400-e29b-41d4-a716-446655440000/stream" \
  -H "Authorization: Bearer dmt_your_api_key" \
  -o interactions.csv

Query Parameters

Parameter | Description
----------|------------
format    | Override format: json or csv
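
For large result sets, it helps to stream the download to disk rather than buffering the whole body in memory. A minimal sketch using Python requests with stream=True; the request_id is the value returned by the export call, and the format override shown here is optional:

import requests

API_KEY = "dmt_your_api_key"
request_id = "550e8400-e29b-41d4-a716-446655440000"  # from the export response

# Stream the export to disk in chunks instead of loading it into memory.
with requests.get(
    f"https://api.demeterics.com/api/v1/exports/{request_id}/stream",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"format": "json"},  # optional override: json or csv
    stream=True,
) as resp:
    resp.raise_for_status()
    with open("interactions.json", "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            f.write(chunk)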

Interaction Data Schema

Exported interactions include the following fields:

Field             | Type      | Description
------------------|-----------|------------
transaction_id    | string    | Unique interaction identifier (ULID)
request_id        | string    | Client-provided request ID for idempotency
session_id        | string    | Session identifier for grouping conversations
user_id           | int64     | Your Demeterics user ID
model             | string    | LLM model used (e.g., llama-3.3-70b-versatile)
question          | string    | Input prompt/question
question_time     | timestamp | When the question was sent
answer            | string    | LLM response
answer_time       | timestamp | When the answer was received
latency_ms        | int64     | Response time in milliseconds
prompt_tokens     | int64     | Input token count
completion_tokens | int64     | Output token count
cached_tokens     | int64     | Cached token count (if applicable)
total_tokens      | int64     | Total tokens used
estimated_cost    | float64   | Estimated cost in USD
status            | string    | success, error, or timeout
error_message     | string    | Error details (if status is error)
application       | string    | Application name from API key
metadata          | json      | Custom metadata attached to the interaction
tags              | array     | Tags for categorization
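
For orientation, a single exported interaction serialized as JSON looks roughly like this (all values are illustrative, not real data):

{
  "transaction_id": "01JD2Z4R8S3F9K7QW1N5X0ABCD",
  "request_id": "req-2025-11-15-001",
  "session_id": "sess-42",
  "user_id": 1001,
  "model": "llama-3.3-70b-versatile",
  "question": "Summarize this support ticket.",
  "question_time": "2025-11-15T10:02:01Z",
  "answer": "The customer reports a billing discrepancy on their last invoice.",
  "answer_time": "2025-11-15T10:02:02Z",
  "latency_ms": 850,
  "prompt_tokens": 420,
  "completion_tokens": 96,
  "cached_tokens": 0,
  "total_tokens": 516,
  "estimated_cost": 0.0003,
  "status": "success",
  "error_message": null,
  "application": "support-bot",
  "metadata": {"ticket_id": "T-1234"},
  "tags": ["support"]
}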

Export Examples

Python: Export and Analyze

import requests
import pandas as pd
from io import StringIO

API_KEY = "dmt_your_api_key"
BASE_URL = "https://api.demeterics.com"

# Create export job
response = requests.post(
    f"{BASE_URL}/api/v1/exports",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "format": "csv",
        "start_date": "2025-11-01",
        "end_date": "2025-11-30",
        "tables": ["interactions"]
    }
)
export = response.json()
request_id = export["request_id"]

# Stream the data
stream_response = requests.get(
    f"{BASE_URL}/api/v1/exports/{request_id}/stream",
    headers={"Authorization": f"Bearer {API_KEY}"}
)

# Load into pandas
df = pd.read_csv(StringIO(stream_response.text))

# Analyze
print(f"Total interactions: {len(df)}")
print(f"Total cost: ${df['estimated_cost'].sum():.2f}")
print(f"Avg latency: {df['latency_ms'].mean():.0f}ms")
print(f"\nTop models:")
print(df['model'].value_counts().head())

Node.js: Stream to File

const fs = require('fs');
const https = require('https');
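// Note: the global fetch API used below requires Node 18 or newer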

const API_KEY = 'dmt_your_api_key';

// Create export
fetch('https://api.demeterics.com/api/v1/exports', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    format: 'json',
    start_date: '2025-11-01',
    end_date: '2025-11-30'
  })
})
.then(res => res.json())
.then(data => {
  // Stream to file
  const file = fs.createWriteStream('interactions.json');
  https.get(
    `https://api.demeterics.com/api/v1/exports/${data.request_id}/stream`,
    { headers: { 'Authorization': `Bearer ${API_KEY}` } },
    response => response.pipe(file)
  );
});

Shell: Daily Export Script

#!/bin/bash
# daily_export.sh - Export yesterday's interactions

API_KEY="dmt_your_api_key"
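# Note: 'date -d' below is GNU date syntax; on macOS/BSD use: date -v-1d +%Y-%m-%d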
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d)
TODAY=$(date +%Y-%m-%d)
OUTPUT_FILE="interactions_${YESTERDAY}.csv"

# Create export
REQUEST_ID=$(curl -s -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"format\": \"csv\",
    \"start_date\": \"$YESTERDAY\",
    \"end_date\": \"$TODAY\",
    \"tables\": [\"interactions\"]
  }" | jq -r '.request_id')

# Download
curl -s "https://api.demeterics.com/api/v1/exports/$REQUEST_ID/stream" \
  -H "Authorization: Bearer $API_KEY" \
  -o "$OUTPUT_FILE"

echo "Exported to $OUTPUT_FILE"
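
To run this nightly, schedule the script with cron; for example, a crontab entry of 0 2 * * * /path/to/daily_export.sh runs the export every day at 02:00.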

GCS Export (Enterprise)

For large datasets, export directly to a Google Cloud Storage bucket:

curl -X POST https://api.demeterics.com/api/v1/exports \
  -H "Authorization: Bearer dmt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "avro",
    "use_gcs": true,
    "gcs_bucket": "gs://your-bucket/exports/",
    "start_date": "2025-01-01",
    "end_date": "2025-11-30"
  }'

Response

{
  "status": "ok",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "url": "gs://your-bucket/exports/interactions_2025-11-30.avro",
  "expires_at": "2025-12-07T00:00:00Z",
  "row_count": 150000,
  "message": "Export complete"
}

Note: Contact support to enable GCS export for your account.
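
Once the file has landed in your bucket, you can fetch it with your usual GCS tooling. A minimal sketch using the google-cloud-storage Python client, assuming the library is installed and your credentials have read access to the bucket:

from google.cloud import storage

# Download the exported Avro file referenced by the "url" field in the response.
client = storage.Client()
bucket = client.bucket("your-bucket")
blob = bucket.blob("exports/interactions_2025-11-30.avro")
blob.download_to_filename("interactions_2025-11-30.avro")
print(f"Downloaded {blob.name}")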


Rate Limits

Endpoint                        | Limit
--------------------------------|---------------------
POST /api/v1/exports            | 10 requests/minute
GET /api/v1/exports/{id}/stream | 100 requests/minute

Export jobs are cached for 10 minutes. Repeated requests with the same parameters will return the cached result.
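
If you schedule exports aggressively, handle the rate limit gracefully. A minimal retry sketch with exponential backoff, assuming the API signals the limit with HTTP 429 (the exact status code is an assumption, not documented above):

import time
import requests

API_KEY = "dmt_your_api_key"

def create_export(payload, max_retries=5):
    """Create an export job, backing off when the rate limit is hit."""
    for attempt in range(max_retries):
        resp = requests.post(
            "https://api.demeterics.com/api/v1/exports",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json=payload,
        )
        if resp.status_code == 429:   # assumed rate-limit status code
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("Export still rate-limited after retries")

export = create_export({"format": "csv", "start_date": "2025-11-01", "end_date": "2025-11-30"})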


Best Practices

  1. Use date filters - Always specify start_date and end_date to limit data volume
  2. Export incrementally - Run daily/weekly exports instead of full history dumps
  3. Use CSV for analysis - Easier to work with in spreadsheets and pandas
  4. Use Avro for pipelines - More efficient for BigQuery, Spark, or data warehouses (see the loading sketch after this list)
  5. Store exports - Export jobs expire after 10 minutes; save the data locally
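
Following up on the Avro recommendation in item 4, an Avro file exported to GCS can be loaded directly into your own BigQuery dataset. A minimal sketch using the google-cloud-bigquery client; the project, dataset, and table names are placeholders:

from google.cloud import bigquery

# Load an Avro export from GCS into your own BigQuery table.
client = bigquery.Client()
table_id = "your_project.analytics.demeterics_interactions"  # placeholder table ID
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.AVRO)

load_job = client.load_table_from_uri(
    "gs://your-bucket/exports/interactions_2025-11-30.avro",
    table_id,
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish
print(f"Loaded {client.get_table(table_id).num_rows} rows into {table_id}")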

Troubleshooting

401 Unauthorized

  • Check that your API key is valid
  • Ensure the key has export scope enabled

403 Forbidden

  • Your API key lacks the export scope
  • Update key permissions in Settings → API Keys

404 Not Found

  • Export request expired (10 minute TTL)
  • Re-create the export job

500 Internal Server Error

  • Date range may be too large
  • Try a smaller date range or specific tables