AI Infrastructure Architect
Summary
I have been building, scaling, and occasionally breaking infrastructure for 20 years.
If there is one thing this industry has taught me, it's that engineering is fundamentally a business function. Nobody cares how elegant your Kubernetes architecture is if it doesn't save money, ship features faster, or stop the midnight alerts. My focus is on making platforms highly observable, heavily secured, and actually cost-efficient.
Right now, my absolute obsession is the intersection of Site Reliability Engineering and Artificial Intelligence. I don't just "call APIs." I build production-grade, strictly bounded AI infrastructure. I am currently architecting centralized, autonomous multi-agent systems and running 2-stage fine-tuning pipelines (4-bit QLoRA, NVFP4, GRPO/SFT) on bleeding-edge NVIDIA Blackwell and Grace ARM architectures. I deploy localized, air-gapped LLM environments (vLLM, Ollama) that guarantee zero data leakage for enterprise workloads.
But I am an SRE at heart. Before diving deep into LLMOps, some of the work I am most proud of includes:
- Building an end-to-end observability stack (Coroot/eBPF, Prometheus, ClickHouse) processing billions of rows. When a system goes down, we don't guess; we know exactly why.
- Leading the infrastructure side of high-stakes, rigorous audits like PCI-DSS and TCMB 6493 without bringing delivery to a standstill.
- Dropping resource usage by 16% and pushing service health up by 7% for enterprise SaaS platforms, proving that saving money and increasing uptime can happen simultaneously.
Whether it is optimizing a multi-cloud Kubernetes cluster, writing runbooks people actually use, or spinning up a self-hosted AI control plane with strict context governance, I build systems that let developers sleep at night.
Always open to chatting about platform engineering, SRE culture, local AI deployments, or how to stop throwing unnecessary money at the cloud. Open to full-time and B2B contract roles globally.
Expectations
Employment Preferences
Relocation destinations:
- Austria
- United Kingdom
- Belgium
- Sweden
- Berlin, Germany
- Ireland
- Luxembourg
- Netherlands
- Norway
Spoken Languages
- English - Fluent
- Turkish - Native
Expected Base Salary
**,000 USD
Expected Total Compensation
**5,000 USD
Expected Hourly Rate
** USD/hr
Academic Degree
Experience
Total Professional Experience
Startup Experience
Big-Tech Companies
Enterprise Experience
Skills
- Linux
- Kubernetes
- Docker
- Rancher
- Helm
- Azure DevOps
- Bash
- Zabbix
- Grafana
- Site24x7
- Nginx
- Azure Infra
- AWS Essentials
- GCP Essentials
- Windows
- LLM
- vLLM
- LiteLLM
- CUDA
- Hugging Face
- Fine-tuning
- MLOps
- LLMOps
- Grace Blackwell
- NVIDIA
- OpenAI
- Anthropic
- Agentic Development Frameworks
- Agentic Workflows
- LangChain
- LangGraph
- Langfuse
- Qdrant
- Vector DB
- Graph
- RAG
