04 — Repository structure¶
For developers about to edit code. Maps every shipped directory to its purpose, names the runtime-only artefacts, and sets out the naming and variable-prefix rules every new file must obey.
The architectural why lives in plan
§7 (Repository),
§2.5 (Repository boundary) and
§2.6 (Ansible role contract).
Top-level tree¶
What is shipped in the repo (what you git clone):
k8s-lab/
├── Makefile # the only entry point for local workflows (§7, §10)
├── README.md # repo-level pointer to doc/
├── ansible/ # 14 roles + collection requirements + ansible.cfg
├── charts/ # 5 Helm charts — every K8s object lives here
├── clusterctl/ # reserved; runtime clusterctl.yaml is rendered by role templates
├── doc/ # this documentation set
├── plans/ # PLAN-stage1-*.md (English) — source of truth for "why"
├── PLAN-stage1-*.md # Russian originals at repo root
├── scripts/ # local-harness Python + shell helpers
├── terraform/ # exactly ONE module: workload_cluster (§16.4)
├── tests/ # Molecule + Vagrant + libvirt harness, fixtures
├── .ansible-lint # role lint config
├── .yamllint # repo-wide YAML lint config
└── .gitignore
What appears only at runtime (gitignored, never committed):
k8s-lab/
├── .artifacts/ # mgmt.kubeconfig, mgmt.auto.tfvars.json, clusters/* (mode 0600)
├── ansible/collections/ # ansible-galaxy collection install -p
├── tests/vagrant/debian13/.vagrant/ # Vagrant state
├── tests/molecule/<scenario>/.molecule/ # Molecule scenario state
└── tests/fixtures/terraform/**/.terraform/ + tfstate* # local TF state
The runtime split is deliberate: the repo carries no environment data,
no real kubeconfigs, no concrete tfvars. See plan
§2.5 for the boundary contract.
ansible/¶
ansible/
├── ansible.cfg # roles_path, collections_path, ssh_args, pipelining
├── requirements.yml # collection deps with lower-bound pins (§2.11)
├── collections/ # gitignored — populated by `make deps`
└── roles/ # 14 roles, snake_case directory names — see table below
The 14 roles in canonical-flow order (full details in
09-roles-reference.md):
| Role | One-line purpose |
|---|---|
| `base_system` | Host packages, kernel modules, sysctls, `/opt/capi-lab` root, btrfs mount contract. |
| `binary_fetch` | Download + checksum-verify pinned kubectl, clusterctl, and k3s into `/opt/capi-lab/bin`. |
| `lxd_host` | Install LXD via snap, pin channel, refresh policy, create the `br-ext6` host bridge. |
| `lxd_project` | Reconcile the `capi-lab` LXD project. |
| `lxd_storage_pools` | btrfs pool `capi-fast` on a dedicated block device. |
| `lxd_network_int_managed` | LXD-managed `capi-int` bridge (internal dual-stack plane). |
| `lxd_profiles` | CAPN unprivileged-kubeadm profiles `capi-controlplane` / `capi-worker`. |
| `lxd_bootstrap_instance` | Transient `capi-bootstrap-0` LXC + LXD proxy device for runner reach. |
| `bootstrap_k3s` | Single-node k3s server inside the bootstrap LXC. |
| `bootstrap_clusterctl` | `clusterctl init` on bootstrap k3s — CAPI/CABPK/KCP/CAPN. |
| `bootstrap_capn_secret` | CAPN identity Secret (`incus-identity`). |
| `export_artifacts` | Emit `.artifacts/mgmt.kubeconfig` + `mgmt.auto.tfvars.json`. |
| `pivot_clusterctl_move` | `clusterctl init` on mgmt-1 + `clusterctl move` from bootstrap. |
| `cleanup_bootstrap` | Destroy `capi-bootstrap-0`. |
Role contract (plan §2.6):
tasks/main.yml is dispatcher-only (include_tasks per topic);
defaults/main.yml carries the role's public vars prefixed with the
snake_case directory name; meta/main.yml declares dependencies:
(no ordering via prepare.yml or pre_tasks); each role has its
own README.md. Test harness:
tests/molecule/<role-in-kebab-case>/.
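Concretely, the §2.6 contract makes every role skeleton look roughly like this. The file names and the `meta` dependency below are illustrative, not copied from the repo:

```yaml
# ansible/roles/lxd_host/tasks/main.yml — dispatcher only (illustrative)
- name: lxd-host | main | include preflight
  ansible.builtin.include_tasks: preflight.yml

- name: lxd-host | main | include install
  ansible.builtin.include_tasks: install.yml

# ansible/roles/lxd_host/meta/main.yml — ordering via meta deps,
# never via prepare.yml or pre_tasks (illustrative dependency)
dependencies:
  - role: base_system
```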
terraform/¶
terraform/
└── modules/
└── workload_cluster/ # the ONLY TF module shipped here (§16.4)
├── main.tf # 5 helm_release blocks + null_resource helm-tests
├── locals.tf
├── variables.tf
├── outputs.tf
├── providers.tf # hashicorp/helm + hashicorp/kubernetes (read-side)
└── versions.tf
No environment-specific TF roots exist in this repo —
terraform/environments/, terraform/live/, prod/, dev/ are
absent by design. The single module is consumed by:
- `tests/fixtures/terraform/workload-clusters/lab-default/` (test consumer, the only TF root in this repo);
- private consumer repos for real sites, which wire their own tfvars, backends, and credentials.
Boundary rule: plan §2.5. Module
inputs/outputs: 10-modules-and-charts.md.
charts/¶
charts/
├── capi-cluster-class/ # ClusterClass + Kubeadm*/LXC* templates (§16.2)
├── capi-workload-cluster/ # Cluster CR instance (§16.3)
├── cni-calico/ # wrapper over projectcalico/tigera-operator + Gate B (§17.2)
├── metallb/ # subchart wrapper over upstream metallb/metallb (§17.3)
└── metallb-config/ # IPAddressPool + L2Advertisement + Gate A (§17.3)
Per chart (schemas in 10-modules-and-charts.md):
| Chart | One-line purpose |
|---|---|
| `capi-cluster-class` | Topology template — ClusterClass, KubeadmControlPlaneTemplate, KubeadmConfigTemplate, LXC*MachineTemplate. The chart-version-as-CR-name pattern lives here. |
| `capi-workload-cluster` | The Cluster CR that references a ClusterClass — one release per workload. |
| `cni-calico` | Wraps tigera-operator as a subchart, sets Installation values, ships the Gate B (CNI viability) helm-test Pod. |
| `metallb` | Thin wrapper over upstream metallb/metallb (CRDs + controller + speaker). |
| `metallb-config` | IPAddressPool + L2Advertisement bound to eth1, plus the Gate A (external L2) helm-test Pod. |
Helm-first delivery is a closed contract — plan
§2.9. No raw YAML lives outside
charts/; Ansible never creates Kubernetes objects via
kubernetes.core.k8s state=present; Terraform never uses
kubernetes_manifest.
tests/¶
tests/
├── molecule/
│ ├── Makefile # scenario pattern rules (`<scenario>-delegated-*`)
│ ├── shared/
│ │ ├── converge.yml # universal — reads MOLECULE_SCENARIO_NAME
│ │ ├── verify.yml
│ │ ├── inventory/group_vars/k8slab_host.yml # the ONE substrate group_vars file (§9.5)
│ │ └── tasks/ # prepare.yml, prepare-btrfs-pool.yml,
│ │ # prepare-clean-disk.yml, ext6-ra-source.yml (in-VM radvd),
│ │ # wait-services.yml
│ ├── base-system/ # one scenario per role — kebab-case
│ ├── binary-fetch/ lxd-host/ lxd-project/ lxd-storage-pools/
│ ├── lxd-network-int-managed/ lxd-profiles/ lxd-bootstrap-instance/
│ ├── bootstrap-k3s/ bootstrap-clusterctl/ bootstrap-capn-secret/
│ ├── export-artifacts/ cleanup-bootstrap/
│ └── e2e-local/ # full canonical flow (§3.1) in one scenario
│
├── vagrant/debian13/
│ ├── Vagrantfile # libvirt provider; the shared local VM
│ ├── inventory.py # ad-hoc inventory generator for the running VM
│ ├── Makefile # `up`, `destroy`, `ssh`
│ └── libvirt-networks/ # mgmt-nat.xml, probe-ext6.xml (latter dormant)
│
└── fixtures/terraform/workload-clusters/lab-default/ # the only TF root in this repo (§16.5)
└── main.tf, variables.tf, providers.tf, outputs.tf
Three contracts make this layout work:
- Scenario name == role directory name (snake-to-kebab). `tests/molecule/base-system/` targets `ansible/roles/base_system/`; `shared/converge.yml` reads `MOLECULE_SCENARIO_NAME` to pick the role. Plan §9.5.
- One substrate group_vars file. `shared/inventory/group_vars/k8slab_host.yml` holds every prod-like substrate value — uplink, storage pool spec, wait budgets, the LXD `proxy` device. A scenario adds `<scenario>/host_vars/k8slab-host.yml` only for a true override. Plan §9.5.1.
- No implementation in fixtures. `tests/fixtures/` only invokes reusable roles/modules/charts with synthetic inputs. Plan §2.7.
Workflow: 12-testing.md. Entry points: top-level
Makefile.
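The scenario-to-role mapping can be pictured as a minimal sketch of what `shared/converge.yml` does — a reconstruction of the pattern, not a verbatim copy of the playbook:

```yaml
# Sketch: derive the role name from the Molecule scenario name.
- name: Converge
  hosts: all
  tasks:
    - name: converge | include role mapped from scenario name
      ansible.builtin.include_role:
        # kebab-case scenario → snake_case role: base-system → base_system
        name: "{{ lookup('env', 'MOLECULE_SCENARIO_NAME') | replace('-', '_') }}"
```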
scripts/¶
scripts/
├── _harness.py # shared helpers: REPO_ROOT, vagrant ssh-config, run_make
├── molecule_run.py # Molecule wrapper — brings up shared VM, exports K8SLAB_HOST_*
├── render_kubeconfig.py # rewrite kubeconfig server URL into .artifacts/clusters/<name>.kubeconfig
├── export_bootstrap_facts.py # emit .auto.tfvars.json from bootstrap cluster facts (§11.1)
└── wait_for_cluster.sh # poll a kubeconfig until kube-apiserver returns Ready
- `_harness.py` — shared helpers (`REPO_ROOT`, `VAGRANT_DIR`, `PLATFORM_HOST_NAME = "k8slab-host"`, wrappers around `vagrant ssh-config` and `make`). Never invoked directly.
- `molecule_run.py` — invoked by `tests/molecule/Makefile` pattern rules; brings up the shared VM, exports `K8SLAB_HOST_*`, then `exec`s `molecule <action> -s <scenario>`.
- `render_kubeconfig.py` — rewrites the kubeconfig server URL into `.artifacts/clusters/<cluster>.kubeconfig` (mode 0600).
- `export_bootstrap_facts.py` — emits `.auto.tfvars.json` from bootstrap facts. Plan §11.1.
- `wait_for_cluster.sh` — `kubectl get --raw=/readyz` poll with a deadline.
Local-harness only; consumer repos do not need these.
clusterctl/¶
bootstrap_clusterctl renders the pinned runtime clusterctl.yaml
from ansible/roles/bootstrap_clusterctl/templates/clusterctl.yaml.j2
into /opt/capi-lab/etc/bootstrap_clusterctl/clusterctl.yaml.
Versions track the §8 / §8a verified-version log in the plan.
.artifacts/¶
Runtime-only directory — gitignored except .gitkeep. Plan
§11.1.
.artifacts/ # mode 0700
├── .gitkeep # tracked
├── mgmt.kubeconfig # mode 0600 — active management cluster kubeconfig
├── mgmt.auto.tfvars.json # mode 0600 — handoff bundle for §16.5 TF fixture
├── harness-vm-id # current Vagrant VM id (cascade-clean tracking)
└── clusters/<name>.kubeconfig # mode 0600 — per-workload debug kubeconfigs
Contract:
- File mode `0600`, directory mode `0700`, owner = runner user.
- `mgmt.kubeconfig` is the same file through the whole canonical flow (§3.1): it first points at bootstrap k3s, then `export_artifacts` rewrites it in place after pivot to point at mgmt-1.
- `mgmt.auto.tfvars.json` is the JSON handoff from Ansible to Terraform (consumed via `-var-file=`).
- `clusters/<name>.kubeconfig` is produced by `make workload-kubeconfig` from the `kubeconfig` TF output.
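A hedged sketch of how a role task can honour the mode contract when writing an artifact — the task name follows the role naming rules, but the variables (`k8s_lab_artifacts_dir`, the private fact) are hypothetical placeholders:

```yaml
- name: export-artifacts | kubeconfig | write mgmt.kubeconfig
  ansible.builtin.copy:
    content: "{{ _export_artifacts_kubeconfig_content }}"   # hypothetical private fact
    dest: "{{ k8s_lab_artifacts_dir }}/mgmt.kubeconfig"     # hypothetical global
    mode: "0600"   # per the .artifacts/ contract
```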
Cleanup: make clean-mgmt-bundle, make clean-workload-kubeconfig,
make clean-local.
What is NOT in this repo¶
Plan §2.5 is enforced by
omission. You will not find:
| Not here | Lives in |
|---|---|
| `inventories/prod/`, `inventories/local/` with real hosts | A private consumer repo. |
| `host_vars/<real-fqdn>.yml` with real IPs / ULA prefixes | A private consumer repo. |
| Plaintext or vault secrets, real LXD trust certificates | A private secrets store mounted into the consumer repo at runtime. |
| `terraform/environments/<site>/` root modules | A private consumer repo, importing `terraform/modules/workload_cluster/`. |
| `*.tfvars` / `*.tfvars.json` for real sites | A private consumer repo (gitignored here by design). |
| `make deploy TARGET=...`, `make destroy TARGET=...` | The consumer repo's own Makefile. |
| `*.deploy.yml` / `*.production.yml` playbooks | A private consumer repo. |
Consumer-repo pattern: 07-deployment-guide.md.
Naming conventions¶
Two casings are in play — snake_case and kebab-case — and mixing them silently is forbidden: a target like `base_system-delegated-test` (snake + kebab in one name) is rejected at review.
Ansible roles¶
- Role directory in `ansible/roles/` is snake_case: `base_system/`, `lxd_storage_pools/`, `bootstrap_clusterctl/`.
- Role display-name in task / handler names is kebab-case: `base-system`, `lxd-storage-pools`.
- Task names follow `<role> | <section> | <action>` with the role part in kebab-case: `base-system | preflight | assert opt root present`.
- Handler names follow `<role> | handlers | <action>`: `lxd-host | handlers | restart snap.lxd.daemon`.
The asymmetry is deliberate: Ansible variable names must match the
directory (snake_case-only); display names are kebab-case for parity
with Make targets and scenario directories. Plan
§2.6.3.
Molecule scenarios + Make targets¶
- Scenario directory under `tests/molecule/<name>/` is kebab-case: `tests/molecule/base-system/`, `tests/molecule/lxd-storage-pools/`.
- Scenario name == role directory name with `_` → `-`: `base_system` (role) → `base-system` (scenario).
- Make targets are kebab-case end-to-end: `make -C tests/molecule base-system-delegated-test`, `make -C tests/molecule e2e-local-vagrant-converge`.
The reverse mapping (scenario → role) is read at runtime from
MOLECULE_SCENARIO_NAME by tests/molecule/shared/converge.yml.
Plan §2.6.3,
§9.5.2.
Helm charts¶
- Chart directory in `charts/` is kebab-case: `charts/capi-cluster-class/`, `charts/cni-calico/`.
- Chart name in `Chart.yaml` matches the directory exactly.
- The chart-version-as-CR-name pattern (plan §2.9, §12.10) appends `Chart.Version` with `.` → `-`: `capi-cluster-class-0-3-0`.
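In a chart template the pattern amounts to something like the following expression — a sketch of the naming rule, not the actual template shipped in `charts/capi-cluster-class/`:

```yaml
# Sketch: CR name derived from chart name + version, dots replaced by dashes.
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  # capi-cluster-class + 0.3.0 → capi-cluster-class-0-3-0
  name: "{{ .Chart.Name }}-{{ .Chart.Version | replace "." "-" }}"
```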
Variable prefix rules¶
Ansible variable hygiene is a hard contract. Three categories; every variable must fall into exactly one.
Project globals: k8s_lab_*¶
Every project-wide variable carries the k8s_lab_ prefix. These are
the stable inter-role interface — anything more than one role reads
is a global by definition.
Examples (full list in
08-configuration-reference.md;
schema in plan §8):
k8s_lab_opt_root: "/opt/capi-lab"
k8s_lab_project_name: "capi-lab"
k8s_lab_uplink_interface: "eth0"
k8s_lab_external_bridge_name: "br-ext6"
k8s_lab_internal_network_name: "capi-int"
k8s_lab_external_ipv6_prefix: "..."
k8s_lab_metallb_vip_range_v6: "..."
k8s_lab_lxd_host_address: "..."
The _section_ fragment (storage, internal, external,
images) is part of the flat variable name, not a YAML
namespace: k8s_lab_storage_pool_name is one identifier, not
k8s_lab.storage.pool_name.
Naked globals like opt_root, enabled, api_publish_port are
banned — plan §2.6.5. They
collide silently with vars inherited from wider inventory, make
grep-by-name useless, and hide ownership.
Role public vars: <role_name>_*¶
Every variable a role exposes in defaults/main.yml carries the
role's snake_case directory name as a prefix:
# ansible/roles/base_system/defaults/main.yml
base_system_enabled: true
base_system_btrfs_pool_required: true
base_system_extra_kernel_modules: [wireguard]
# ansible/roles/lxd_host/defaults/main.yml
lxd_host_snap_channel: "6/stable"
lxd_host_snap_refresh_mode: "hold"
Only the role itself reads these. A role MUST NOT read variables
with another role's <other_role>_* prefix — that creates
coupling invisible to either role's contract. Cross-role
communication goes through k8s_lab_* globals or set_fact (next
section). Plan §2.6.2,
§2.6.5. Booleans are
affirmatively named: <role>_enabled, <role>_flow_control_*.
do_* / with_* / run_* are banned.
Role private vars: _<role_name>_*¶
Internal facts, derived values, registers, and helper vars carry a leading underscore plus the role prefix:
# ansible/roles/lxd_storage_pools/tasks/pools.yml
- ansible.builtin.uri: ...
register: _lxd_storage_pools_pools_query_register
- ansible.builtin.set_fact:
_lxd_storage_pools_pools_existing: "{{ ... }}"
Patterns: facts _<role>_<section>_<fact>; registers
_<role>_<section>_<purpose>_register. The leading underscore says
"internal — do not depend on this from outside the role". Faceless
register names like result, out, tmp are forbidden. Plan
§2.6.2,
§2.6.3.
Cross-role communication¶
A role needs a value computed by another. Exactly two supported
mechanisms (plan §2.6.5):
- Project global with the `k8s_lab_` prefix. For stable values in the substrate contract (paths, network names, addresses). Documented in plan §8 and 08-configuration-reference.md; sourced from inventory `group_vars` (production) or from `tests/molecule/shared/inventory/group_vars/k8slab_host.yml` (test harness).
- Runtime fact via `set_fact`, named `_<role>_<section>_<fact>`, for values that only exist after a role has run. The producing role must be in the consumer's `meta/main.yml` `dependencies:`.
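Put together, mechanism 2 looks roughly like this — the fact name and the role pairing are chosen for illustration only:

```yaml
# Producer — e.g. ansible/roles/lxd_storage_pools/tasks/pools.yml (illustrative)
- ansible.builtin.set_fact:
    _lxd_storage_pools_pools_pool_name: "capi-fast"

# Consumer — declares the producer as a meta dependency, then may read the fact
# ansible/roles/lxd_profiles/meta/main.yml (illustrative)
dependencies:
  - role: lxd_storage_pools
```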
Anything else — naked globals, reading another role's
<other_role>_* defaults, arranging order via pre_tasks /
import_role to substitute meta-deps — is forbidden by plan
§2.6.5. Violations create hidden
dependencies that pass one Molecule scenario and break another.
The plan says why; this chapter says how it looks once you follow it. Mirror the patterns above when adding new code, and grep the existing layout for the closest analogue.