Metra AI — Production SaaS for content automation in Telegram
We built a turnkey SaaS platform with multi-agent LLM orchestration. From architecture to launch in 3 months.
01 Structure parser: parses anatomy, preserves links
02 Content rewriter: paragraph by paragraph, isolated calls
03 Styling: emoji and channel formatting
04 Rule validator: channel-specific rules
The business problem
Telegram channel owners and content teams spend a large share of their time creating posts by hand. Standard tools either don't fit Telegram's specifics (Buffer and Hootsuite are tailored for Instagram and Twitter) or rely on raw ChatGPT output, which tends to be generic and needs lengthy manual editing.
The main pain point: content teams spend 60–80% of their time on production instead of strategy, and quality suffers because AI-generated content usually lacks brand voice, channel context, and real-time relevance.
What we built
A full-fledged SaaS platform that automates the entire content workflow in Telegram:
- AI-powered post generation that preserves brand voice and channel lore
- A scheduling system with reusable weekly presets
- Real-time data integration for news content
- A built-in Telegram CRM for working with leads without exposing the owner's account data
- Multi-account infrastructure with multiple operators for agencies running many channels
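As an illustration of the reusable weekly presets mentioned above, here is a minimal sketch of how a preset could be expanded into concrete posting times for one week. The `Slot` type and `expand_preset` function are hypothetical names introduced for this example; the platform's actual data model is not described here.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta


@dataclass(frozen=True)
class Slot:
    """One recurring posting time (hypothetical structure)."""
    weekday: int  # 0 = Monday ... 6 = Sunday
    at: time


def expand_preset(preset: list[Slot], any_day_in_week: date) -> list[datetime]:
    """Turn a reusable weekly preset into sorted datetimes for that week."""
    # Find the Monday of the week that contains the given day.
    monday = any_day_in_week - timedelta(days=any_day_in_week.weekday())
    return sorted(
        datetime.combine(monday + timedelta(days=slot.weekday), slot.at)
        for slot in preset
    )
```

The same preset can be applied to any week by passing a different date, which is what makes it reusable.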
Architecture: why multi-agent, not a single LLM call
The key technical innovation is a multi-stage LLM orchestration pipeline instead of single API calls. This is a deliberate architectural choice grounded in an important insight:
LLMs perform poorly when given too many simultaneous constraints. A single 3000-token prompt asking the model to "rewrite the post in voice X, with lore Y, in format Z, following rules A/B/C" produces unstable results, because attention is diluted across the requirements.
The solution: break post generation into specialized stages, each with one clear area of responsibility.
Standard posting pipeline
- Structure parser — extracts the anatomy of the post and preserves links (which premium LLMs strip out by default)
- Content rewriter — processes each paragraph with isolated calls, preserving structural integrity
- Styling — adds emoji and formatting to match the channel's persona
- Rule validator — applies channel-specific rules (for example, remove punctuation, limit length)
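The four stages above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: `LLM` is a hypothetical "prompt in, text out" callable, the prompts are placeholders, and link preservation is shown with a simple placeholder-masking trick, which is one possible way to keep URLs out of the model's reach.

```python
import re
from typing import Callable

LLM = Callable[[str], str]  # hypothetical: takes a prompt, returns text

URL_RE = re.compile(r"https?://\S+")


def protect_links(text: str) -> tuple[str, dict[str, str]]:
    """Swap each URL for a placeholder so the URL never passes through the model."""
    links: dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        key = f"[[LINK{len(links)}]]"
        links[key] = match.group(0)
        return key

    return URL_RE.sub(_swap, text), links


def restore_links(text: str, links: dict[str, str]) -> str:
    for key, url in links.items():
        text = text.replace(key, url)
    return text


def rewrite_paragraphs(text: str, llm: LLM, voice: str) -> str:
    """One isolated call per paragraph; paragraph order and count stay fixed."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    rewritten = [llm(f"Rewrite in this voice: {voice}\n\n{p}") for p in paragraphs]
    return "\n\n".join(rewritten)


def apply_rules(text: str, max_length: int) -> str:
    """Deterministic channel rule (here: a length limit), no model involved."""
    return text[:max_length].rstrip()


def run_pipeline(post: str, llm: LLM, voice: str, max_length: int) -> str:
    masked, links = protect_links(post)            # 1. structure parser
    body = rewrite_paragraphs(masked, llm, voice)  # 2. content rewriter
    styled = llm(f"Add emoji and formatting for this channel:\n\n{body}")  # 3. styling
    return apply_rules(restore_links(styled, links), max_length)  # 4. rule validator
```

Each stage receives a short prompt with a single job, and the final rule check is plain code, so it behaves the same way on every run. A real length rule would also need to avoid cutting a link in half, which this sketch does not handle.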
Extended pipeline (generation from scratch)
- Archetype selector — chooses the post structure based on content type and length parameters
- Block-by-block generator — writes each section with focused context
- Applying style and formatting
- Validation against auto-rules
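A minimal sketch of the first two stages of the extended pipeline, assuming a hypothetical `LLM` callable and an invented pair of archetypes. The real archetype set and selection logic are not described in this case study, so the names and block lists below are purely illustrative.

```python
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # hypothetical: takes a prompt, returns text


@dataclass(frozen=True)
class Archetype:
    name: str
    blocks: tuple[str, ...]  # ordered section roles


# Illustrative archetypes only; not the platform's actual list.
ARCHETYPES = {
    "news": Archetype("news", ("headline", "key facts", "why it matters")),
    "guide": Archetype("guide", ("hook", "steps", "summary", "call to action")),
}


def select_archetype(content_type: str, max_blocks: int) -> Archetype:
    """Pick a structure by content type and trim it to the length budget."""
    base = ARCHETYPES.get(content_type, ARCHETYPES["news"])
    return Archetype(base.name, base.blocks[:max_blocks])


def generate_post(topic: str, archetype: Archetype, llm: LLM) -> str:
    """Write each block in its own call so every call has focused context."""
    sections = [
        llm(f"Topic: {topic}\nWrite only the '{role}' section.")
        for role in archetype.blocks
    ]
    return "\n\n".join(sections)
```

Because the structure is fixed before any text is generated, the model cannot drift away from it halfway through the post.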
This architecture eliminates the typical AI failures — mid-text hallucinations, structural drift, over-application of lore, format breakage — that single LLM calls produce.
Key technical findings
1. The prompt normalization layer
The image model rejected legitimate prompts too often, with false positives on ordinary requests such as generating images of people. Instead of switching to a more expensive model, we built a prompt normalization layer that rephrases harmless user input so it isn't blocked by mistake, while preserving the original meaning. This let us keep quality high without hosting more expensive alternatives.
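One simple way to picture such a layer is a table of neutral rewordings applied before the prompt reaches the image model. The sketch below assumes a rule-based approach with invented example rules; the actual layer may work differently (for instance, by rephrasing with an LLM), and `REPHRASINGS` and `normalize_prompt` are hypothetical names.

```python
import re

# Hypothetical rewording rules: ambiguous wording -> neutral equivalent.
REPHRASINGS = [
    (re.compile(r"\bheadshot\b", re.IGNORECASE), "portrait photo"),
    (re.compile(r"\bbody\s+shot\b", re.IGNORECASE), "full-length photo"),
]


def normalize_prompt(prompt: str) -> str:
    """Rephrase wording that tends to trigger false positives, keeping the meaning."""
    for pattern, replacement in REPHRASINGS:
        prompt = pattern.sub(replacement, prompt)
    # Collapse repeated whitespace left over from user input.
    return " ".join(prompt.split())
```

For example, `normalize_prompt("A headshot  of a smiling barista")` returns `"A portrait photo of a smiling barista"`.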
2. Lore compression and translation
The channel lore provided by the user (often 3000+ tokens of unstructured text) is compressed and translated into the channel's posting language at upload time. The AI receives a structured summary in the required language instead of raw lore — this sharply increases content relevance and reduces token costs.
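A minimal sketch of upload-time compression, assuming a hypothetical `LLM` callable and an illustrative summary prompt. The fields requested in the prompt (audience, tone, recurring themes) are assumptions, and `rough_token_count` is a crude character-based estimate rather than a real tokenizer.

```python
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # hypothetical: takes a prompt, returns text


@dataclass(frozen=True)
class ChannelLore:
    raw: str       # what the user uploaded
    summary: str   # what later generation calls receive
    language: str  # the channel's posting language


def compress_lore(raw: str, language: str, llm: LLM) -> ChannelLore:
    """Run once at upload time; later calls reuse the stored summary."""
    prompt = (
        "Summarize the channel lore below as short bullet points "
        "(audience, tone, recurring themes). "
        f"Write the summary in {language}.\n\n{raw}"
    )
    return ChannelLore(raw=raw, summary=llm(prompt).strip(), language=language)


def rough_token_count(text: str) -> int:
    """Very rough estimate: about 4 characters per token."""
    return max(1, len(text) // 4)
```

Since the summary is computed once and stored, every later generation call pays only for the short summary instead of the full lore.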
3. Lore as context, not a constraint
Through iteration we discovered that the AI over-applies lore when it's given as a direct instruction. We designed lore as soft context that influences but does not dominate generation — content comes out more natural.
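The difference can be illustrated with prompt assembly. In this sketch (an assumption about how such a prompt might be laid out, not the production template), lore goes into a background section marked as optional reference, while only the channel rules are phrased as requirements. `build_prompt` is a hypothetical helper.

```python
def build_prompt(task: str, lore_summary: str, rules: list[str]) -> str:
    """Place lore as optional background and keep hard rules separate."""
    parts = [
        "Background about the channel (for reference only; use it where it "
        "fits naturally and do not force it into the text):",
        lore_summary,
        "",
        "Task:",
        task,
    ]
    if rules:
        parts += ["", "Hard rules:"] + [f"- {rule}" for rule in rules]
    return "\n".join(parts)
```

Keeping the two kinds of input in separate sections lets the model treat lore as flavor while still obeying the rules exactly.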
4. Premium LLM selection strategy
We selected specific LLMs for specific tasks, based on three criteria:
- Fewer false refusals on legitimate edge cases
- Better real-time news integration via a Perplexity-style API
- Cost optimization at scale
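Task-based selection like this is often expressed as a small routing table. The sketch below is illustrative only: the task names and model identifiers are placeholders, not the providers or models actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    model: str   # placeholder identifier, not a real model name
    reason: str


# Hypothetical routing table keyed by pipeline task.
ROUTES = {
    "rewrite": ModelChoice("general-model", "fewer false refusals"),
    "news": ModelChoice("search-backed-model", "real-time data"),
    "validate": ModelChoice("small-model", "low cost at scale"),
}
DEFAULT_ROUTE = ROUTES["rewrite"]


def pick_model(task: str) -> ModelChoice:
    """Return the model configured for a task, falling back to the default."""
    return ROUTES.get(task, DEFAULT_ROUTE)
```

Keeping the mapping in one place means a model can be swapped for a single task without touching the rest of the pipeline.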
Infrastructure
- Several Ubuntu servers in production (main service, CRM, staging)
- 16 Docker containers with a clean separation of responsibilities
- The backend as the single source of truth — all client requests go through the backend, never directly to AI providers or the database
- Monitoring stack: Prometheus, Grafana, Sentry for errors, Uptime Kuma for service health
- Encryption: sensitive data (phone numbers, messages) is encrypted, and passwords are stored as salted and peppered hashes produced by a GPU-resistant algorithm. Decryption keys are stored off-server.
- Security: 2FA, JWT rotation, session fingerprinting, domain proxying
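To make the salt, pepper, and GPU-resistant hashing from the list above concrete, here is a minimal sketch using only the Python standard library. The choice of scrypt and its parameters is an assumption for illustration; the case study does not name the algorithm. The pepper is a server-side secret that would be kept outside the database.

```python
import hashlib
import hmac
import os


def _derive(password: str, salt: bytes, pepper: bytes) -> bytes:
    # Mix in the pepper with HMAC, so a leaked database alone is not enough
    # to start guessing passwords.
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # scrypt is memory-hard, which makes large-scale GPU guessing expensive.
    return hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1, dklen=32)


def hash_password(password: str, pepper: bytes) -> tuple[bytes, bytes]:
    """Return (salt, digest); both are stored, the pepper is not."""
    salt = os.urandom(16)  # unique per password
    return salt, _derive(password, salt, pepper)


def verify_password(password: str, salt: bytes, digest: bytes, pepper: bytes) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(_derive(password, salt, pepper), digest)
```

Hashing applies to passwords only. Data that must be read back, such as phone numbers and messages, needs reversible encryption with keys held off-server, as described above.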
The result
- 3 months: from development to launch
- 16: Docker containers in production
- 25/day: auto-posts per channel on the paid plan
The platform launched into production within the 3-month development window. The multi-account CRM lets agencies manage operators without exposing channel owner data. Early traction is active, with a growing user base.
Technology stack
| Layer | Technologies |
| --- | --- |
| Backend | FastAPI · Python · Celery |
| Frontend | React · TypeScript · Next.js |
| Database | PostgreSQL · Redis |
| Infrastructure | Docker · Nginx · Multiple Ubuntu servers |
| Monitoring | Prometheus · Grafana · Sentry · Uptime Kuma |
| AI / LLM | Multi-provider stack (proprietary + open source) |
What this demonstrates
- The ability to design and build a turnkey production SaaS
- A deep understanding of LLM constraints and how to work around them with architecture
- Production-grade security and infrastructure
- Pragmatic cost optimization at the architectural level
- End-to-end product thinking: business problem → technical solution → deployment → operations