Moderation

Classifies whether text is potentially harmful across multiple categories.

Endpoint

POST /v1/moderations

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	Model ID (e.g., `text-moderation-latest`)
`input`	string or array	Yes	Text to moderate (a single string or an array of strings)

Example

curl -X POST https://cryptgpt.co/v1/moderations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-moderation-latest",
    "input": "I want to hurt someone."
  }'
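
The same request can be built from Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_moderation_request` and `moderate` are hypothetical, and passing a list for `text` relies on the `input` parameter accepting an array of strings as described in the table above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://cryptgpt.co/v1/moderations"


def build_moderation_request(api_key, text, model="text-moderation-latest"):
    """Build (but do not send) a POST request for the moderation endpoint.

    `text` may be a single string or a list of strings.
    """
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def moderate(api_key, text, model="text-moderation-latest"):
    """Send the request and return the decoded JSON response.

    Needs a valid API key and network access.
    """
    request = build_moderation_request(api_key, text, model)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `moderate("YOUR_API_KEY", "I want to hurt someone.")` would issue the same request as the curl command.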

Response

{
  "id": "modr-abc123",
  "model": "text-moderation-latest",
  "results": [
    {
      "flagged": true,
      "categories": {
        "hate": false,
        "hate/threatening": false,
        "harassment": true,
        "self-harm": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": true,
        "violence/graphic": false
      },
      "category_scores": {
        "hate": 0.001,
        "hate/threatening": 0.000,
        "harassment": 0.750,
        "self-harm": 0.001,
        "sexual": 0.000,
        "sexual/minors": 0.000,
        "violence": 0.850,
        "violence/graphic": 0.001
      }
    }
  ]
}
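
Reading the response amounts to walking `results` and collecting the categories set to `true`. The sketch below assumes the response shape shown above; `flagged_categories` is a hypothetical helper name, and the sample is a trimmed copy of the example response.

```python
import json


def flagged_categories(response):
    """Return, for each result, the sorted names of categories marked true."""
    return [
        sorted(name for name, hit in result["categories"].items() if hit)
        for result in response["results"]
    ]


# Trimmed copy of the example response above.
sample = json.loads("""
{
  "id": "modr-abc123",
  "model": "text-moderation-latest",
  "results": [
    {
      "flagged": true,
      "categories": {
        "hate": false,
        "harassment": true,
        "self-harm": false,
        "violence": true
      }
    }
  ]
}
""")

print(flagged_categories(sample))  # [['harassment', 'violence']]
```

The outer list has one entry per result, so an array `input` would be expected to yield one inner list per input string, assuming results are returned in input order.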

Categories

Category	Description
`hate`	Content expressing hate toward a group
`hate/threatening`	Hateful content with threats
`harassment`	Harassing or bullying content
`self-harm`	Content promoting self-harm
`sexual`	Sexual content
`sexual/minors`	Sexual content involving minors
`violence`	Violent content
`violence/graphic`	Graphic violent content

A result is flagged when any score in `category_scores` exceeds 0.5.
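
The threshold rule can be reproduced client-side from `category_scores`, which is useful when an application wants a stricter or looser cutoff than the default. This is a sketch under the assumption that the same 0.5 cutoff applies to every category; `THRESHOLD` and `is_flagged` are hypothetical names.

```python
THRESHOLD = 0.5  # cutoff stated in the text above


def is_flagged(category_scores, threshold=THRESHOLD):
    """Return True if any category score strictly exceeds the threshold."""
    return any(score > threshold for score in category_scores.values())


# Scores copied from the example response.
scores = {
    "hate": 0.001,
    "hate/threatening": 0.000,
    "harassment": 0.750,
    "self-harm": 0.001,
    "sexual": 0.000,
    "sexual/minors": 0.000,
    "violence": 0.850,
    "violence/graphic": 0.001,
}

print(is_flagged(scores))                 # True
print(is_flagged(scores, threshold=0.9))  # False
```

The comparison is strict, so a score of exactly 0.5 does not flag the result, matching the word "exceeds".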