Open
31 commits
cb21aa0
Add K8s Agent: Streamlit-based on-prem Kubernetes cluster management UI
devin-ai-integration[bot] Apr 6, 2026
dc1fb13
Clean up unused imports and dependencies
devin-ai-integration[bot] Apr 6, 2026
5566c5e
Add CRI-O custom storage paths and proxy settings for master node
devin-ai-integration[bot] Apr 6, 2026
b47ee9e
Add gitignore entries for profile data and pycache
devin-ai-integration[bot] Apr 6, 2026
cb3658f
Add step-by-step SSH provisioning with granular per-step progress
devin-ai-integration[bot] Apr 6, 2026
3abe2dd
Replace streamlit-option-menu with native st.radio for reliable sideb…
devin-ai-integration[bot] Apr 6, 2026
067861e
Make LLM fully optional with graceful fallbacks and add offline manif…
devin-ai-integration[bot] Apr 6, 2026
5834226
Add imported cluster support, upgrade planner, version 1.35, PSS expl…
devin-ai-integration[bot] Apr 6, 2026
fc3b900
Fix kubeconfig import: move file_uploader outside st.form to prevent …
devin-ai-integration[bot] Apr 6, 2026
d68e269
Add kubectl path detection and namespace auto-fetch for imported clus…
devin-ai-integration[bot] Apr 6, 2026
07a1b72
Improve kubectl detection: drop os.access check, add subprocess which…
devin-ai-integration[bot] Apr 6, 2026
b239963
Add metrics install, deployment scaling, pod shell, resource dropdown…
devin-ai-integration[bot] Apr 7, 2026
2285ea5
Add Node Containers (crictl) tab: view containers per node via SSH or…
devin-ai-integration[bot] Apr 7, 2026
252e941
Add cluster reset/teardown feature with re-provision option
devin-ai-integration[bot] Apr 7, 2026
88b1984
Add Resource Requests/Limits tab: tabular view of CPU, memory, epheme…
devin-ai-integration[bot] Apr 7, 2026
b63824a
Fix Devin Review: Disk Usage key mismatch in CATEGORY_MAP, add timest…
devin-ai-integration[bot] Apr 7, 2026
23324fd
Add proper feedback messages for buttons: imported cluster guards, su…
devin-ai-integration[bot] Apr 7, 2026
6a43a3a
Add flash messages for Import Cluster and Create Profile so success/e…
devin-ai-integration[bot] Apr 7, 2026
ad0ad39
Enrich Cluster Details for imported clusters: show node IPs, roles, k…
devin-ai-integration[bot] Apr 7, 2026
937bc79
Add Multi-Cluster Dashboard, Certificate Manager, Cost Optimizer, Pod…
devin-ai-integration[bot] Apr 7, 2026
2df2413
Add Smart Log Analysis (LogAI-inspired): clustering, anomaly detectio…
devin-ai-integration[bot] Apr 7, 2026
85f833a
Fix proxy /etc/environment format: use KEY=VALUE for pam_env, source …
devin-ai-integration[bot] Apr 7, 2026
95c61e3
Fix unquoted paths in rm -rf reset commands and sanitize profile name…
devin-ai-integration[bot] Apr 7, 2026
f91b129
Add pod/container dropdowns to Pod Logs tab and fix fetch button feed…
devin-ai-integration[bot] Apr 7, 2026
c70cd7f
Add Istio/Envoy access log analysis: response time analytics, status …
devin-ai-integration[bot] Apr 7, 2026
63ecf08
Remove Helm/Network Policy tabs, remove init containers from Resource…
devin-ai-integration[bot] Apr 7, 2026
368b50d
Fix: restrict profile JSON file permissions to 0600 to protect kubeco…
devin-ai-integration[bot] Apr 8, 2026
b06dbf3
Fix Set Active profile button (delete widget state before rerun), quo…
devin-ai-integration[bot] Apr 10, 2026
0d9734d
Add 'Collect pod logs' mode to Smart Log Analysis for Istio access lo…
devin-ai-integration[bot] Apr 10, 2026
45560a6
Fix profile switching: sync widget key with active_profile before sel…
devin-ai-integration[bot] Apr 10, 2026
c89f8cc
Add Ollama LLM support: provider selection (OpenAI/Ollama), local Oll…
devin-ai-integration[bot] Apr 10, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -22,3 +22,6 @@ charts/*/charts/
*.pem
*.key
kubeconfig*
k8s-agent/__pycache__/
k8s-agent/data/profiles/*.json
k8s-agent/modules/__pycache__/
87 changes: 87 additions & 0 deletions k8s-agent/README.md
@@ -0,0 +1,87 @@
# K8s Agent — On-Prem Kubernetes Cluster Management

A Streamlit-based UI for managing on-premises Kubernetes clusters that use the CRI-O container runtime and the Flannel CNI.

## Features

1. **Profile Manager** — Create and manage profiles for multiple clusters with node definitions (control-plane / worker), SSH credentials, and K8s configuration.

2. **Cluster Creation** — SSH into nodes and provision a full Kubernetes cluster:
- Installs CRI-O container runtime
- Installs kubeadm, kubelet, kubectl
- Initializes control plane with best-practice kubeadm config
- Deploys Flannel CNI
- Joins worker nodes automatically
- Applies security hardening (NetworkPolicies, RBAC, ResourceQuotas, PodSecurity)

3. **Cluster Debugger** — Run diagnostic commands and get AI-powered analysis:
- Pre-built checks for nodes, pods, networking, storage, certificates
- Category-based scanning (Cluster Overview, Networking, Security, etc.)
- Custom command execution via SSH
- AI-powered root cause analysis and remediation recommendations

4. **Monitoring Setup** — Deploy Prometheus + Grafana with production-ready configuration:
- One-click kube-prometheus-stack installation
- Grafana dashboard imports (cluster overview, node exporter, pods, etcd, API server, etc.)
- Alerting rules for node health, pod crashes, disk pressure, etcd latency
- AI-powered monitoring recommendations

5. **Log Analysis** — Collect, parse, and correlate logs across cluster components:
- System component logs (kubelet, CRI-O, API server, etcd, Flannel, CoreDNS)
- Pod-level log collection with previous container support
- Automated error pattern extraction and grouping
- Cross-source error correlation
- AI-powered deep log analysis and root cause identification

6. **AI Assistant** — Chat interface for Kubernetes questions powered by your LLM.
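
The "error pattern extraction and grouping" step listed under Log Analysis could be sketched roughly as follows. This is a minimal illustration and not the code in `log_analyzer.py`: the function names, the error keywords, and the normalization rules (`<TS>`, `<HEX>`, `<IP>`, `<N>`) are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical normalization rules: replace volatile tokens so that
# lines differing only in timestamps, IDs, or counters collapse together.
_NORMALIZERS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?\b"), "<TS>"),
    (re.compile(r"\b[0-9a-f]{8,}\b"), "<HEX>"),
    (re.compile(r"\b\d+(?:\.\d+){3}\b"), "<IP>"),
    (re.compile(r"\d+"), "<N>"),
]

# Hypothetical keyword filter for what counts as an error line.
_ERROR_RE = re.compile(r"\b(error|failed|fatal|panic)\b", re.IGNORECASE)


def normalize(line):
    """Replace timestamps, hex IDs, IPs, and numbers with placeholders."""
    for pattern, token in _NORMALIZERS:
        line = pattern.sub(token, line)
    return line.strip()


def group_errors(lines):
    """Return (pattern, count) pairs for error lines, most frequent first."""
    counts = Counter(normalize(line) for line in lines if _ERROR_RE.search(line))
    return counts.most_common()


sample = [
    "2026-04-07T10:00:00Z kubelet: Failed to pull image for pod 12",
    "2026-04-07T10:00:05Z kubelet: Failed to pull image for pod 37",
    "2026-04-07T10:00:06Z etcd: request took 120ms",
    "2026-04-07T10:00:07Z crio: error creating container 3f9a1c2b7d",
]
for pattern, count in group_errors(sample):
    print(count, pattern)
```

Grouping on a normalized pattern rather than the raw line keeps the number of distinct groups small, which also makes it cheaper to pass a summary to an LLM for analysis.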

## Quick Start

```bash
cd k8s-agent
pip install -r requirements.txt

# Set your LLM API key
export LLM_API_KEY="your-api-key"
# Or use the Infosys AI Gateway key
export INFOSYS_CODER_API_KEY="your-key"

# Run the app
streamlit run app.py
```

## Configuration

Environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `LLM_API_URL` | LLM API endpoint | Infosys AI Gateway |
| `LLM_API_KEY` | LLM API key | Falls back to `INFOSYS_CODER_API_KEY` |
| `LLM_MODEL` | Model name | `gpt-4` |
| `LLM_TEMPERATURE` | Response temperature | `0.3` |
| `LLM_MAX_TOKENS` | Max response tokens | `4096` |
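
As a rough sketch of how the variables in the table could be resolved, including the `INFOSYS_CODER_API_KEY` fallback: the function name `load_llm_config` and the returned dictionary keys are hypothetical, and the actual `config.py` may be structured differently. The default gateway URL is not stated in this README, so the sketch leaves it empty.

```python
import os


def load_llm_config(env=None):
    """Resolve LLM settings from environment variables with README defaults."""
    env = os.environ if env is None else env
    return {
        # The real default is the Infosys AI Gateway; its URL is not given here.
        "api_url": env.get("LLM_API_URL", ""),
        # LLM_API_KEY wins; otherwise fall back to INFOSYS_CODER_API_KEY.
        "api_key": env.get("LLM_API_KEY") or env.get("INFOSYS_CODER_API_KEY", ""),
        "model": env.get("LLM_MODEL", "gpt-4"),
        "temperature": float(env.get("LLM_TEMPERATURE", "0.3")),
        "max_tokens": int(env.get("LLM_MAX_TOKENS", "4096")),
    }


if __name__ == "__main__":
    config = load_llm_config()
    print(config["model"], config["temperature"], config["max_tokens"])
```

Accepting an optional mapping instead of always reading `os.environ` keeps the function easy to test without changing the process environment.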

## Architecture

```
k8s-agent/
├── app.py # Main Streamlit application
├── config.py # Configuration and environment variables
├── requirements.txt # Python dependencies
├── modules/
│ ├── llm_client.py # LLM API integration (query + streaming)
│ ├── profile_manager.py # Cluster profile CRUD operations
│ ├── cluster_creator.py # SSH-based cluster provisioning
│ ├── cluster_debugger.py # Diagnostic commands and AI analysis
│ ├── monitoring_setup.py # Prometheus/Grafana deployment
│ └── log_analyzer.py # Log collection, parsing, correlation
├── templates/ # Configuration templates
└── data/profiles/ # Stored cluster profiles (JSON)
```
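
Profiles under `data/profiles/` can hold SSH credentials and kubeconfig data, and the commit history for this change mentions sanitizing profile names and restricting profile files to mode 0600. A minimal sketch of that idea is shown below; `sanitize_profile_name` and `save_profile` are hypothetical names and do not necessarily match `profile_manager.py`. The permission bits apply on POSIX systems.

```python
import json
import os
import re
from pathlib import Path


def sanitize_profile_name(name):
    """Reduce a profile name to characters that are safe in a filename."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    if not cleaned:
        raise ValueError("profile name has no usable characters")
    return cleaned


def save_profile(profile_dir, name, profile):
    """Write a profile as JSON, readable and writable by the owner only."""
    directory = Path(profile_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{sanitize_profile_name(name)}.json"
    # Create with mode 0600 so a new file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(profile, handle, indent=2)
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)
    return path
```

Sanitizing the name before building the path also prevents a profile called `../something` from being written outside the profiles directory.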

## Requirements

- Python 3.10+
- SSH access to target nodes (for cluster operations)
- LLM API endpoint (Infosys AI Gateway or any OpenAI-compatible API)