Warning
This article was automatically translated by OpenAI (gpt-4o). It may be edited eventually, but please be aware that it may contain incorrect information at this time.
Many components deployed on the Tanzu Application Platform have metrics endpoints for Prometheus and are annotated with prometheus.io/scrape: true
to be targets for Prometheus Service Discovery.
While it is convenient that metrics are automatically collected just by introducing Prometheus or an agent compatible with Prometheus Service Discovery (such as DataDog, Wavefront, Grafana Agent), it can lead to the unintended collection of a large number of metrics, significantly increasing the number of metrics that are not being monitored.
Especially when using SaaS, the amount of metrics per unit time often correlates with the billing amount, potentially leading to a situation where you are "paying for metrics you are not using."
In this article, I will introduce a method to turn off the prometheus.io/scrape: true
setting and temporarily exclude components from being scraped. You can exclude unnecessary metrics for the time being and re-enable them when needed.
To exclude components from being scraped, you need to modify the manifest (annotations), so we will use overlays.
Below, I will introduce the configuration method for each package, but you can collectively change the tap-values.yaml
and update TAP for each profile. Also, I have not confirmed whether all packages are covered. I will only introduce the ones I have excluded.
Contour
Target Profiles: run, build, view, iterate, full
This component emits the most metrics. Since Envoy has a Grafana dashboard, you might consider utilizing it.
Create the following Secret:
cat <<EOF > contour-disable-scrape.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: contour-disable-scrape
namespace: tap-install
type: Opaque
stringData:
contour-disable-scrape.yml: |
#@ load("@ytt:overlay", "overlay")
#@ for kind, name in [['DaemonSet', 'envoy'], ['Deployment', 'envoy'], ['Deployment', 'contour']]:
#@overlay/match by=overlay.subset({"kind": kind, "metadata": {"name": name}}), expects="0+"
---
spec:
template:
metadata:
annotations:
prometheus.io/scrape: "false"
#@ end
---
EOF
kubectl apply -f contour-disable-scrape.yaml
Add the following settings to tap-values.yaml
under package_overlays
and update TAP:
package_overlays:
- name: contour
secrets:
- name: contour-disable-scrape
# ...
# ...
Cloud Native Runtimes
Target Profiles: run, iterate, full
This component emits the second most metrics. Since Knative Serving has a Grafana dashboard, you might consider utilizing it.
Create the following Secret:
cat <<EOF > cnrs-disable-scrape.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: cnrs-disable-scrape
namespace: tap-install
type: Opaque
stringData:
cnrs-disable-scrape.yml: |
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"Deployment","metadata":{"namespace": "knative-serving"}}),expects="1+"
---
spec:
template:
#@overlay/match-child-defaults missing_ok=True
metadata:
annotations:
prometheus.io/scrape: 'false'
wavefront.com/scrape: 'false'
---
EOF
kubectl apply -f cnrs-disable-scrape.yaml
Add the following settings to tap-values.yaml
under package_overlays
and update TAP:
package_overlays:
- name: cnrs
secrets:
- name: cnrs-disable-scrape
# ...
# ...
cert-manager
Target Profiles: run, build, view, iterate, full
Since it is included in all profiles, it accumulates.
Create the following Secret:
cat <<EOF > cert-manager-disable-scrape.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: cert-manager-disable-scrape
namespace: tap-install
type: Opaque
stringData:
cert-manager-disable-scrape.yml: |
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"Deployment","metadata":{"namespace": "cert-manager"}}),expects="1+"
---
spec:
template:
#@overlay/match-child-defaults missing_ok=True
metadata:
annotations:
prometheus.io/scrape: 'false'
---
EOF
kubectl apply -f cnrs-disable-scrape.yaml
Add the following settings to tap-values.yaml
under package_overlays
and update TAP:
package_overlays:
- name: cert-manager
secrets:
- name: cert-manager-disable-scrape
# ...
# ...
Flux
Target Profiles: run, build, iterate, full
Create the following Secret:
cat <<EOF > fluxcd-source-controller-disable-scrape.yaml
---
apiVersion: v1
kind: Secret
metadata:
name: fluxcd-source-controller-disable-scrape
namespace: tap-install
type: Opaque
stringData:
fluxcd-source-controller-disable-scrape.yml: |
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"Deployment","metadata":{"namespace": "flux-system"}}),expects="1+"
---
spec:
template:
#@overlay/match-child-defaults missing_ok=True
metadata:
annotations:
prometheus.io/scrape: 'false'
---
EOF
kubectl apply -f cnrs-disable-scrape.yaml
Add the following settings to tap-values.yaml
under package_overlays
and update TAP:
package_overlays:
- name: fluxcd-source-controller
secrets:
- name: fluxcd-source-controller-disable-scrape
# ...
# ...
If there are other components that are targets for scraping, you can handle them using the same method.
Ideally, increasing the number of metrics collected would make the platform more "observable" and better, but it is a trade-off with billing. The above settings go against the trend of enhancing "observability," so re-enable metrics collection as needed to make the platform more "observable."