# 🚀 Nocr CI/CD Pipeline Documentation
## 📋 Overview
The Nocr project uses a multi-pipeline CI/CD setup built on Drone CI running on Kubernetes. This document describes the five pipelines and how to use each one.
---
## 🎯 Pipeline Architecture
### Pipeline 1: **Feature Validation**
**Trigger:** Push to `feature/*` or `fix/*` branches
**Purpose:** Fast feedback for developers
**Duration:** ~3-5 minutes
**What it does:**
- Clones repo with submodules
- Restores all NuGet packages (shared cache)
- Builds all 4 services in Release mode
- Runs unit and integration tests with Testcontainers
**Example workflow:**
```bash
git checkout -b feature/add-new-filter
# Make changes...
git add .
git commit -m "Add new filter functionality"
git push origin feature/add-new-filter
```
Drone automatically runs tests. Check results before creating PR.
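To catch failures before pushing, the same steps can be approximated locally. The sketch below is an assumption about how the pipeline steps map to CLI commands, not a copy of the Drone configuration. It assumes the `dotnet` CLI is installed, that the working directory holds a single solution file, and that a local Docker daemon is available for the Testcontainers-based tests. The function name `validate_locally` and the `RUN` variable are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical local approximation of the Feature Validation steps.
# RUN is an assumed hook: set RUN=echo to print the commands instead of running them.
validate_locally() {
  local run="${RUN:-}"
  $run git submodule update --init --recursive &&
  $run dotnet restore &&
  $run dotnet build --configuration Release --no-restore &&
  $run dotnet test --configuration Release --no-build
}
```

Running `RUN=echo validate_locally` prints the four commands without executing them, which is a quick way to review what would run.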
---
### Pipeline 2: **Main Validation**
**Trigger:** Push to `main` branch
**Purpose:** Validate main branch after merge
**Duration:** ~3-5 minutes
**What it does:**
- Same as Feature Validation
- Ensures main branch is always in working state
**Example workflow:**
```bash
# After PR is merged to main
# Pipeline runs automatically
```
---
### Pipeline 3: **Contracts-Only Publish**
**Trigger:** Tag with commit message containing `contracts_only:<service>`
**Purpose:** Fast publish of contract packages without building images
**Duration:** ~2 minutes
**What it does:**
- Packs specified service contracts into NuGet packages
- Publishes to internal NuGet feed
- Skips Docker image builds
**Example workflow:**
```bash
# Update telegram-listener contracts
cd telegram-listener
# Make changes to Async.Api.Contracts...
git add .
git commit -m "contracts_only:telegram_listener - Add MessageEdited event"
git push origin main
# Create tag
git tag v1.2.4-contracts
git push origin v1.2.4-contracts
```
**Supported markers:**
- `contracts_only:telegram_listener`
- `contracts_only:text_matcher`
- `contracts_only:users`
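How the pipeline actually parses the marker is not shown in this document. As a rough illustration, a commit message could be checked against the supported markers like this (the function name `contracts_service` is hypothetical):

```bash
# Extract the service name from a contracts_only marker in a commit message.
# Prints the service and returns 0 for a supported marker; returns 1 otherwise.
contracts_service() {
  local message="$1" service
  # Match "contracts_only:<service>", where <service> is lowercase letters and underscores.
  if [[ "$message" =~ contracts_only:([a-z_]+) ]]; then
    service="${BASH_REMATCH[1]}"
    case "$service" in
      telegram_listener|text_matcher|users)
        printf '%s\n' "$service"
        return 0 ;;
    esac
  fi
  return 1
}
```

A check like this can be run locally before tagging to confirm that the marker is spelled the way the pipeline expects.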
---
### Pipeline 4: **Full Release**
**Trigger:** Tag on main WITHOUT `contracts_only` or `deploy_only` in commit message
**Purpose:** Complete release cycle
**Duration:** ~8-10 minutes
**What it does:**
1. **Stage 1:** Publish all contracts to NuGet (parallel)
2. **Stage 2:** Build all Docker images with Kaniko (3 parallel streams)
3. **Stage 3:** Deploy to Kubernetes (only for tags matching `v*`)
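The `v*` condition on Stage 3 amounts to a simple prefix match. A minimal sketch of that gate, assuming it were written as a shell check rather than as a Drone trigger condition (the function name `should_deploy` is hypothetical):

```bash
# Decide whether a tag should trigger the deploy stage.
# Mirrors the documented rule: only tags matching v* are deployed.
should_deploy() {
  case "$1" in
    v*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Example: in a Drone step, DRONE_TAG holds the tag name for tag events.
# if should_deploy "$DRONE_TAG"; then echo "deploying $DRONE_TAG"; fi
```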
**Example workflow:**
```bash
# Ready to release
git tag v1.3.0
git push origin v1.3.0
# Drone will:
# 1. Publish contracts
# 2. Build images (tagged with v1.3.0, commit SHA, and latest)
# 3. Deploy to k8s (if tag starts with 'v')
```
**Image tags created:**
- `hub.musk.fun/k8s/nocr/telegram_listener:v1.3.0`
- `hub.musk.fun/k8s/nocr/telegram_listener:abc1234` (commit SHA)
- `hub.musk.fun/k8s/nocr/telegram_listener:latest`
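The naming scheme above can be expressed as a small helper, which is handy when scripting against the registry. This is only an illustration of the pattern; the function name `image_refs` is hypothetical:

```bash
# Print the three image references described above for one service.
# Arguments: <service> <tag> <short-commit-sha>
image_refs() {
  local service="$1" tag="$2" sha="$3"
  local repo="hub.musk.fun/k8s/nocr/${service}"
  # printf repeats the format for each pair of arguments, giving one line per tag.
  printf '%s:%s\n' "$repo" "$tag" "$repo" "$sha" "$repo" "latest"
}
```

For example, `image_refs telegram_listener v1.3.0 abc1234` prints the three references listed above, one per line.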
---
### Pipeline 5: **Deploy-Only**
**Trigger:** Tag with commit message containing `deploy_only:`
**Purpose:** Fast deploy of already-built images
**Duration:** ~1 minute
**What it does:**
- Skips building
- Deploys specified images to Kubernetes
- Useful for rolling back or promoting existing images
**Example workflow:**
```bash
# Deploy existing images
git commit --allow-empty -m "deploy_only: Deploy v1.2.9"
git tag v1.2.9-deploy
git push origin v1.2.9-deploy
```
---
## 🛠️ Deployment Scripts
All deployment scripts are located in `_deploy/scripts/`:
### `deploy.sh`
**Purpose:** Deploy services to Kubernetes
**Usage:**
```bash
# Usage: ./deploy.sh <tag> <commit-sha>
./deploy.sh v1.3.0 abc1234
```
**Features:**
- Updates deployment manifests with new image tags
- Applies manifests to cluster
- Waits for rollouts to complete with timeout
- Runs health checks after deployment
- Shows pod status
### `rollback.sh`
**Purpose:** Rollback deployments to previous version
**Usage:**
```bash
# Rollback single service
./rollback.sh telegram-listener
# Rollback all services
./rollback.sh all
```
**Features:**
- Shows revision history
- Performs `kubectl rollout undo`
- Waits for rollback to complete
- Runs health checks after rollback
### `health-check.sh`
**Purpose:** Check health of all Nocr services
**Usage:**
```bash
./health-check.sh
```
**Checks:**
- Pod status (Running/Ready)
- Health endpoints (`/health`)
- Recent events for failed pods
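The pod status check can be approximated by filtering the output of `kubectl get pods`. This sketch assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE) and treats any pod that is not `Running` with all containers ready as unhealthy, which would also flag completed job pods. The function name `unhealthy_pods` is hypothetical and the real script may work differently.

```bash
# Read `kubectl get pods --no-headers` output on stdin and print pods that are
# not fully ready. Returns 1 if any unhealthy pod is found, 0 otherwise.
unhealthy_pods() {
  awk '
    {
      split($2, ready, "/")   # READY column looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print $1 " " $2 " " $3
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage:
#   kubectl get pods -n nocr --no-headers | unhealthy_pods
```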
---
## 📦 Optimizations
### Shared NuGet Cache
All pipelines use a shared temp volume for NuGet packages:
- First `dotnet restore` downloads packages
- Subsequent builds reuse cached packages
- **~60% faster** than individual restores per service
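One common way to get this behaviour is to point NuGet's global packages folder at the shared volume through the `NUGET_PACKAGES` environment variable. Whether the pipelines use this variable or another mechanism is not stated here, so treat the following as a sketch; the path `/tmp/nuget-cache` and the function name are hypothetical.

```bash
# Point the NuGet global packages folder at a shared directory so that later
# restores reuse packages downloaded by the first one.
# The default path /tmp/nuget-cache is a hypothetical mount point.
use_shared_nuget_cache() {
  local dir="${1:-/tmp/nuget-cache}"
  mkdir -p "$dir" || return 1
  export NUGET_PACKAGES="$dir"
}

# Usage inside a build step:
#   use_shared_nuget_cache /tmp/nuget-cache && dotnet restore
```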
### Parallel Execution
- Contract publishing: 3 services in parallel
- Docker builds: 3 parallel streams
- Independent operations never block each other
### Kaniko Caching
All Kaniko builds use:
- `--cache=true` - Layer caching enabled
- `--cache-repo=hub.musk.fun/k8s/cache/*` - Shared cache repo
- `--compressed-caching=true` - Faster cache transfer
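Put together, a Kaniko invocation with these flags might look like the output of the helper below. The per-image cache repository name (`hub.musk.fun/k8s/cache/<image>`) is an assumption standing in for the wildcard above. The Dockerfile path is passed in by the caller because the repository layout is not described here, and `kaniko_args` is a hypothetical name.

```bash
# Print Kaniko executor arguments that use the caching flags listed above.
# Arguments: <image-name> <tag> <dockerfile-path>
# The cache repository layout is an assumption for illustration.
kaniko_args() {
  local image="$1" tag="$2" dockerfile="$3"
  local args=(
    --context=.
    --dockerfile="$dockerfile"
    --destination="hub.musk.fun/k8s/nocr/${image}:${tag}"
    --cache=true
    --cache-repo="hub.musk.fun/k8s/cache/${image}"
    --compressed-caching=true
  )
  printf '%s\n' "${args[@]}"
}

# Usage inside a Kaniko container (the Dockerfile path is hypothetical):
#   /kaniko/executor $(kaniko_args telegram_listener v1.3.0 telegram-listener/Dockerfile)
```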
---
## 🧪 Testcontainers Support
The Feature Validation and Main Validation pipelines include a Docker-in-Docker service for Testcontainers:
```yaml
services:
  - name: docker
    image: docker:27-dind
    privileged: true
```
Tests can use Testcontainers to spin up real databases, message queues, etc.
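The Docker-in-Docker daemon can take a few seconds to start, so a test step may need to wait for it. The sketch below polls `docker info` until the daemon answers. It assumes `DOCKER_HOST` is already set for the step; the exact value (for example `tcp://docker:2375` versus a TLS endpoint on port 2376) depends on how the dind service is configured and is not specified in this document. The `DOCKER_CLI` override and the function name are illustrative.

```bash
# Wait until the Docker daemon answers, retrying once per second.
# Arguments: [max-attempts] (default 30)
# DOCKER_CLI can be overridden for testing; it defaults to the docker binary.
wait_for_docker() {
  local attempts="${1:-30}" docker="${DOCKER_CLI:-docker}" i
  for ((i = 1; i <= attempts; i++)); do
    if $docker info >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  echo "Docker daemon not reachable at ${DOCKER_HOST:-<unset>}" >&2
  return 1
}

# Usage before running tests:
#   wait_for_docker 30 && dotnet test --configuration Release
```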
---
## 🔒 Required Secrets
Configure these in Drone:
- `hub_username` - Docker registry username
- `hub_password` - Docker registry password
- `nuget_musk_api_key` - NuGet feed API key
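Secrets can be added through the Drone UI or with the Drone CLI's `drone secret add` command. The sketch below registers all three for one repository. It assumes the CLI is already authenticated through `DRONE_SERVER` and `DRONE_TOKEN`, and that the values are available in upper-case environment variables of the same name. That mapping, the `DRONE` override, and the function name are assumptions. Note that passing values with `--data` exposes them in the process arguments, so this is better suited to a trusted workstation than to a shared host.

```bash
# Register the three secrets for a repository using the Drone CLI.
# Values are read from HUB_USERNAME, HUB_PASSWORD and NUGET_MUSK_API_KEY
# (an assumed naming convention). Fails if any of them is empty.
add_drone_secrets() {
  local repo="$1" drone="${DRONE:-drone}" name var
  for name in hub_username hub_password nuget_musk_api_key; do
    var="$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')"
    if [ -z "${!var:-}" ]; then
      echo "missing value for $var" >&2
      return 1
    fi
    $drone secret add --repository "$repo" --name "$name" --data "${!var}" || return 1
  done
}

# Usage (the repository slug is a placeholder):
#   add_drone_secrets <owner>/<repo>
```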
---
## 📊 Pipeline Decision Tree
```
Push to feature/* or fix/* → Feature Validation (build + test)
Push to main               → Main Validation (build + test)
Tag + "contracts_only:"    → Contracts Publish
Tag + "deploy_only:"       → Deploy Only
Tag (no markers)           → Full Release (contracts → images → deploy)
```
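The same routing can be written as a small function, which may help when reasoning about which pipeline a given push or tag will trigger. This is a sketch of the rules as documented, not the Drone trigger configuration. The pipeline identifiers it prints are hypothetical, and the precedence when a message contains both markers is an assumption.

```bash
# Map an event to the pipeline described in the decision tree.
# Arguments: <event: push|tag> <branch-or-tag> <commit-message>
select_pipeline() {
  local event="$1" ref="$2" message="$3"
  if [ "$event" = "tag" ]; then
    case "$message" in
      *contracts_only:*) echo "contracts-publish" ;;
      *deploy_only:*)    echo "deploy-only" ;;
      *)                 echo "full-release" ;;
    esac
  elif [ "$event" = "push" ]; then
    case "$ref" in
      feature/*|fix/*) echo "feature-validation" ;;
      main)            echo "main-validation" ;;
      *)               echo "none" ;;
    esac
  else
    echo "none"
  fi
}
```

For example, `select_pipeline tag v1.2.9-deploy "deploy_only: Deploy v1.2.9"` prints `deploy-only`.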
---
## 🎓 Best Practices
1. **Feature Branches**
   - Always create feature branches for new work
   - Let CI validate before merging to main
2. **Contracts Changes**
   - Use `contracts_only:` for quick contract updates
   - Other services can update references immediately
3. **Release Process**
   - Tag only from the main branch
   - Use semantic versioning (v1.2.3)
   - Tags starting with `v` auto-deploy to k8s
4. **Emergency Rollback**
   ```bash
   # Quick rollback via deploy-only
   git commit --allow-empty -m "deploy_only: Rollback to v1.2.8"
   git tag v1.2.8-rollback
   git push origin v1.2.8-rollback
   # Or use rollback script directly on the cluster
   kubectl exec -it deploy-pod -- bash
   cd /flea/_deploy/scripts
   ./rollback.sh all
   ```
5. **Monitoring Deployments**
   - Watch Drone UI for pipeline progress
   - Check pod logs: `kubectl logs -f deployment/telegram-listener -n nocr`
   - Run health checks: `./_deploy/scripts/health-check.sh`
---
## 🐛 Troubleshooting
### Pipeline stuck on "Waiting for contracts"
**Cause:** Contract publish failed
**Solution:** Check that the NuGet feed is reachable and verify the `nuget_musk_api_key` secret
### Docker build fails with "unauthorized"
**Cause:** Invalid registry credentials
**Solution:** Update `hub_username` and `hub_password` secrets
### Tests fail with "Cannot connect to Docker daemon"
**Cause:** Testcontainers can't reach Docker-in-Docker service
**Solution:** Check that the `DOCKER_HOST` environment variable is set and points at the Docker-in-Docker service
### Deployment fails with "ImagePullBackOff"
**Cause:** Image not found in registry
**Solution:** Verify that the image was built and pushed successfully in the previous step
---
## 📚 Additional Resources
- [Drone CI Documentation](https://docs.drone.io/)
- [Kaniko Documentation](https://github.com/GoogleContainerTools/kaniko)
- [Testcontainers for .NET](https://dotnet.testcontainers.org/)
- [Kubernetes Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)