Bielik Guard — A Small Experiment in Polish Content Moderation

#laravel #docker #content-moderation #bielik #poc

Context

I was curious how far you can get with a small, self-hosted moderation layer for Polish text without reaching for an external API. The goal was simple: catch obviously problematic input like hate speech, profanity, sexual content, criminal instructions, or self-harm promotion, while keeping the setup cheap and easy to reason about.

This is very much an experiment, not a polished production story. But it already works end-to-end, and that alone made it worth writing down.

What Is Bielik Guard

Bielik Guard (codename Sójka) is an open-source Polish AI guardrails model that classifies text into five threat categories:

Category            Label       What it catches
Hate speech         hate        Verbal aggression, discrimination
Profanity           vulgar      Vulgarities, offensive language
Sexual content      sex         Explicit material
Criminal activity   crime       Instructions or promotion of illegal acts
Self-harm           self-harm   Suicide, self-injury promotion

There are two model sizes available, both under Apache 2.0:

  • Bielik-Guard-0.1B — fast, tiny footprint, good for short texts (based on sdadas/mmlw-roberta-base)
  • Bielik-Guard-0.5B — better precision, broader context window (based on PKOBP/polish-roberta-8k)

I picked the 0.1B version because I wanted the smallest possible thing that still felt useful. For short-form user input, it seems like a very reasonable tradeoff.

The Integration Pattern

The nice part is that the integration stays boring in a good way. It’s just three pieces:

  • a Docker sidecar running the model
  • a Laravel service talking to it over HTTP
  • a validation Rule that plugs into form requests

Docker Sidecar

The model runs as a separate container in compose.yaml, reachable from the app over Docker networking at http://bielik_guard:8000. No GPU, no special infrastructure, no drama. The 0.1B model is perfectly happy on CPU.

# compose.yaml (relevant excerpt)
services:
  bielik-guard:
    image: bielik-guard:latest  # or build from HuggingFace model
    container_name: bielik_guard
    networks:
      - sail

The app references it in config/services.php:

'bielik_guard' => [
    'enabled' => env('BIELIK_GUARD_ENABLED', false),
    'url'     => env('BIELIK_GUARD_URL', 'http://bielik_guard:8000'),
],

It’s disabled by default, which I liked during experimentation. Flip BIELIK_GUARD_ENABLED=true in .env and the app starts using it.
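Concretely, enabling it is just two variables in .env (the URL here mirrors the compose service name above; adjust it if your container is named differently):

```ini
# .env
BIELIK_GUARD_ENABLED=true
BIELIK_GUARD_URL=http://bielik_guard:8000
```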

Laravel Service

BielikGuardService is intentionally thin. It sends text to /classify, gets back [{label, score}, ...], and wraps the response in a ClassificationResult DTO:

// App\Services\BielikGuardService (simplified)
$response = Http::baseUrl($this->baseUrl)
    ->timeout(10)
    ->post('/classify', ['text' => $request->text])
    ->throw();

return new ClassificationResult(
    text: $request->text,
    scores: $this->parseScores($response->json()),
);
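The parseScores() helper isn't shown in the post; a plausible framework-free version, assuming the [{label, score}, ...] response shape described above, could look like this:

```php
<?php
// Hypothetical parseScores(): flattens Bielik Guard's response shape
// [{label, score}, ...] into a label => score map for the DTO.
// The response shape is taken from the post; the helper itself is a sketch.

function parseScores(array $items): array
{
    $scores = [];
    foreach ($items as $item) {
        $scores[$item['label']] = (float) $item['score'];
    }

    return $scores;
}
```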

That DTO exposes isSafe(threshold), highestCategory(), and highestScore(), backed by a ContentCategory enum matching Bielik’s labels. Small touch, but it makes the rest of the code much nicer to read.
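For reference, a framework-free sketch of how such a DTO could look. The class, enum, and method names follow the post, but the constructor shape, the label-to-score map, and the "highest score below threshold means safe" semantics are my assumptions:

```php
<?php
// Sketch of the ClassificationResult DTO and ContentCategory enum.
// Details beyond the names mentioned in the post are assumptions.

enum ContentCategory: string
{
    case Hate     = 'hate';
    case Vulgar   = 'vulgar';
    case Sex      = 'sex';
    case Crime    = 'crime';
    case SelfHarm = 'self-harm';
}

final class ClassificationResult
{
    /** @param array<string, float> $scores label => score */
    public function __construct(
        public readonly string $text,
        public readonly array $scores,
    ) {}

    public function isSafe(float $threshold = 0.5): bool
    {
        // Assumed semantics: safe when no category crosses the threshold.
        return $this->highestScore() < $threshold;
    }

    public function highestCategory(): ContentCategory
    {
        $label = array_keys($this->scores, max($this->scores))[0];

        return ContentCategory::from($label);
    }

    public function highestScore(): float
    {
        return max($this->scores);
    }
}

// Example with made-up scores:
$result = new ClassificationResult('przykładowy tekst', [
    'hate' => 0.02, 'vulgar' => 0.81, 'sex' => 0.01, 'crime' => 0.03, 'self-harm' => 0.01,
]);
```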

Laravel Validation Rule

My favorite part is the Laravel integration. SafeContent is just a standard ValidationRule, so you can drop it into any FormRequest without inventing a new abstraction:

// In a FormRequest rules() method:
'name' => ['required', 'string', 'max:255', new SafeContent(threshold: 0.5)],
'description' => ['nullable', 'string', new SafeContent],

The rule degrades gracefully: if Bielik Guard is unreachable or disabled, validation simply passes through. For an experiment, that felt like the right default. I wanted to observe the integration, not make the whole app depend on it on day one.

// App\Rules\SafeContent (key logic)
if (! is_string($value) || ! config('services.bielik_guard.enabled')) {
    return; // skip non-string values, or pass through when the guard is disabled
}

$result = $service->classify(new ClassificationRequest(text: $value));

if (! $result->isSafe($this->threshold)) {
    Log::warning('Content validation failed', [
        'category' => $result->highestCategory()->value,
        'score'    => $result->highestScore(),
    ]);
    $fail(__('validation.safe_content'));
}
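The unreachable-service case isn't visible in that excerpt. Stripped of the framework, the fail-open idea boils down to something like this sketch (the names here are illustrative, not the post's actual classes):

```php
<?php
// Minimal fail-open sketch, framework-free. classifyOrNull() swallows
// transport errors so an unreachable guard service never blocks validation:
// null means "unchecked", which the caller treats the same as "safe".

function classifyOrNull(callable $classify, string $text): ?array
{
    try {
        return $classify($text); // e.g. an HTTP call to /classify
    } catch (Throwable $e) {
        return null; // fail open: unchecked, not unsafe
    }
}

// Two stand-ins for the guard service: one down, one up.
$down = fn (string $t) => throw new RuntimeException('connection refused');
$up   = fn (string $t) => ['vulgar' => 0.9];
```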

Why This Pattern Works

  • No external API — the model lives inside your Docker stack, so there are no API keys and no extra service in the loop
  • Surprisingly lightweight — the 0.1B model has a tiny footprint and feels very practical for sidecar-style use
  • Actually Polish-native — which matters a lot for profanity, slang, and messy real-world phrasing
  • Laravel-friendly — ValidationRule is exactly the kind of extension point you want here
  • Safe to experiment with — fail-open behavior means you can test the idea without making the rest of the app fragile

Final Thought

What I liked most here wasn’t some breakthrough in model quality. It was how unexciting the integration felt once the pieces were in place. A small sidecar, a thin service, a validation rule, done.

That’s usually a good sign.

If you want to experiment with local AI features in a Laravel app, this kind of setup feels like a very approachable place to start.
