Greenplum on Kubernetes 1.0 has been released, so I'm trying it out. This time I install it on MicroK8s running on a private network.

Downloading Tanzu Greenplum on Kubernetes 1.0

Tanzu Greenplum on Kubernetes 1.0 can be downloaded from the following link (this probably requires a registered license):

https://support.broadcom.com/group/ecx/productfiles?subFamily=VMware%20Tanzu%20Greenplum%20on%20Kubernetes&displayGroup=VMware%20Tanzu%20Greenplum%20on%20Kubernetes&release=1.0.0&os=&servicePk=&language=EN

Click the "Terms and Conditions" link, then check "I agree ...".

Download the following files.

  • gp-command-center-7.5.0.tar.gz
  • gp-instance-7.5.2.tar.gz
  • gp-operator-1.0.0.tgz
  • gp-operator-image-v1.0.0.tar.gz

Relocating Tanzu Greenplum on Kubernetes 1.0

Once the download finishes, load the following three images into the local Docker daemon.

docker load -i gp-operator-image-v1.0.0.tar.gz
docker load -i gp-instance-7.5.2.tar.gz
docker load -i gp-command-center-7.5.0.tar.gz
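
A pre-flight check along these lines could be run before the docker load commands above, to catch an incomplete download early. This is only a sketch: missing_archives is a hypothetical helper name, and the file list is simply the four files named in the download step.

```shell
# missing_archives [DIR]
# Prints the name of every expected archive that is NOT present in DIR
# (default: current directory). Prints nothing when all files are there.
# Hypothetical helper; the file names come from the download list above.
missing_archives() (
  dir=${1:-.}
  for f in \
    gp-operator-image-v1.0.0.tar.gz \
    gp-instance-7.5.2.tar.gz \
    gp-command-center-7.5.0.tar.gz \
    gp-operator-1.0.0.tgz
  do
    [ -f "$dir/$f" ] || echo "$f"
  done
)
```

Running missing_archives in the download directory should print nothing if everything is in place.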

This time I push the images to a self-hosted container registry on the private network and install from there. Be careful: pushing to a registry that is not on a private network may violate the EULA.

REGISTRY_HOSTNAME=my-private-registry.example.com
REGISTRY_USERNAME=your-username
REGISTRY_PASSWORD=your-password
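# Pass the password on stdin so that it does not show up in the process list
printf '%s' "${REGISTRY_PASSWORD}" | docker login "${REGISTRY_HOSTNAME}" -u "${REGISTRY_USERNAME}" --password-stdin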

Tag and push the images with the following commands.

DESTINATION_REPOSITORY=${REGISTRY_HOSTNAME}

docker tag tds-greenplum-docker-prod-local.usw1.packages.broadcom.com/greenplum/gp-operator/gp-operator:v1.0.0 ${DESTINATION_REPOSITORY}/greenplum/gp-operator/greenplum-operator:v1.0.0
docker push ${DESTINATION_REPOSITORY}/greenplum/gp-operator/greenplum-operator:v1.0.0

docker tag tds-greenplum-docker-prod-local.usw1.packages.broadcom.com/greenplum/gp-operator/gp-instance:7.5.2 ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-instance:7.5.2
docker push ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-instance:7.5.2

docker tag tds-greenplum-docker-prod-local.usw1.packages.broadcom.com/greenplum/gp-operator/gp-command-center:7.5.0 ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-command-center:7.5.0
docker push ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-command-center:7.5.0
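
Because the source and destination names are long and easy to mistype, it can help to print the commands first and review them. The following is a minimal dry-run sketch under that assumption: relocation_commands is a hypothetical helper, and the fallback registry hostname is only the placeholder used in this article.

```shell
# Dry-run sketch: print the docker tag/push commands instead of running them.
# SOURCE_REGISTRY matches the image names loaded from the tarballs above.
SOURCE_REGISTRY=tds-greenplum-docker-prod-local.usw1.packages.broadcom.com
DESTINATION_REPOSITORY=${DESTINATION_REPOSITORY:-my-private-registry.example.com}

# relocation_commands SRC_PATH DST_PATH
# Prints one "docker tag" line and one "docker push" line for a single image.
relocation_commands() (
  src="${SOURCE_REGISTRY}/$1"
  dst="${DESTINATION_REPOSITORY}/$2"
  printf 'docker tag %s %s\n' "$src" "$dst"
  printf 'docker push %s\n' "$dst"
)

# Same three images as above; note that the operator image is renamed on the way.
relocation_commands greenplum/gp-operator/gp-operator:v1.0.0 greenplum/gp-operator/greenplum-operator:v1.0.0
relocation_commands greenplum/gp-operator/gp-instance:7.5.2 greenplum/gp-operator/gp-instance:7.5.2
relocation_commands greenplum/gp-operator/gp-command-center:7.5.0 greenplum/gp-operator/gp-command-center:7.5.0
```

If the printed commands look right, the same output can be piped to sh to actually run them.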

Although it is not required, I also push the Helm chart to the same container registry.

helm push ./gp-operator-1.0.0.tgz oci://${DESTINATION_REPOSITORY}/greenplum/gp-operator

Installing the Greenplum Operator

Check the default values of the Helm chart.

$ helm show values oci://${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-operator --version 1.0.0
Pulled: my-private-registry.example.com/greenplum/gp-operator/gp-operator:1.0.0
Digest: sha256:af0787eef38e853458c0b211c14a0ad41b01a71eb925dd9f980ece60c719b433
controllerManager:
  operator:
    args:
    - --leader-elect
    - --health-probe-bind-address=:8081
    - --metrics-bind-address=:8443
    - --webhook-cert-path=/tmp/k8s-webhook-server/serving-certs
    containerSecurityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    env:
      certManagerClusterIssuerName: ""
      certManagerNamespace: cert-manager
    image:
      repository: ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-operator
      tag: v1.0.0
    imagePullPolicy: Always
    resources:
      limits:
        cpu: 200m
        memory: 500Mi
      requests:
        cpu: 200m
        memory: 500Mi
  podSecurityContext:
    runAsNonRoot: true
  replicas: 1
  serviceAccount:
    annotations: {}
imagePullSecrets: []
kubernetesClusterDomain: cluster.local
metricsService:
  ports:
  - name: https
    port: 8443
    protocol: TCP
    targetPort: 8443
  type: ClusterIP
webhookService:
  ports:
  - port: 443
    protocol: TCP
    targetPort: 9443
  type: ClusterIP

Create a values file that switches the image to the relocated one.

cat <<EOF > gp-operator-values.yaml
---
controllerManager:
  operator:
    image:
      repository: ${DESTINATION_REPOSITORY}/greenplum/gp-operator/greenplum-operator
      tag: v1.0.0
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
imagePullSecrets:
- name: gp-operator-registry-secret
---
EOF
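
The heredoc above uses an unquoted EOF, so the shell expands ${DESTINATION_REPOSITORY} at the moment the file is written. If the variable happens to be unset (for example in a new terminal), the repository silently becomes a path starting with "/". A guard like the following, run before any of the cat <<EOF commands, would catch that. It is a sketch only; require_vars is a hypothetical helper, not part of any tool used here.

```shell
# require_vars NAME...
# Returns 1 and reports on stderr if any of the named variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage before generating manifests (left commented out on purpose):
# require_vars DESTINATION_REPOSITORY REGISTRY_HOSTNAME || exit 1
```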

Create the Secret referenced by imagePullSecrets.

kubectl create namespace gpdb
kubectl create secret docker-registry gp-operator-registry-secret \
    --docker-server=${REGISTRY_HOSTNAME} \
    --docker-username=${REGISTRY_USERNAME} \
    --docker-password="${REGISTRY_PASSWORD}" \
    -n gpdb
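
For reference, a docker-registry Secret stores a small JSON document under the .dockerconfigjson key, where the auth field is the base64 encoding of "username:password". The sketch below rebuilds roughly that shape by hand, purely to illustrate what the kubectl command above creates. Both function names are hypothetical, and the sketch assumes the values contain no characters that need JSON escaping.

```shell
# docker_auth USERNAME PASSWORD
# Prints base64("USERNAME:PASSWORD"), the value stored in the "auth" field.
docker_auth() {
  # tr removes the line wrapping that some base64 implementations add.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# dockerconfigjson SERVER USERNAME PASSWORD
# Prints a JSON document in the shape used for the .dockerconfigjson key.
dockerconfigjson() {
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$1" "$2" "$3" "$(docker_auth "$2" "$3")"
}
```

To compare with the real thing, the Secret created above can be decoded with kubectl get secret gp-operator-registry-secret -n gpdb -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d.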

Check the rendered templates of the Helm chart.

helm template gp-operator oci://${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-operator -n gpdb --version 1.0.0 -f gp-operator-values.yaml

Install the Greenplum Operator.

helm upgrade --install \
  -n gpdb \
  gp-operator \
  oci://${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-operator \
  -f gp-operator-values.yaml \
  --version 1.0.0 \
  --wait

Confirm that the GP Operator Pod is running.

$ kubectl get pod -n gpdb 
NAME                                              READY   STATUS    RESTARTS   AGE
gp-operator-controller-manager-86985f94f7-czlkc   1/1     Running   0          35s

Confirm that the GP Operator CRDs have been created.

$ kubectl api-resources --api-group=greenplum.data.tanzu.vmware.com
NAME                       SHORTNAMES   APIVERSION                           NAMESPACED   KIND
greenplumbackuplocations                greenplum.data.tanzu.vmware.com/v1   true         GreenplumBackupLocation
greenplumbackups           gpbackup     greenplum.data.tanzu.vmware.com/v1   true         GreenplumBackup
greenplumclusters          gp           greenplum.data.tanzu.vmware.com/v1   true         GreenplumCluster
greenplumcommandcenters    gpcc         greenplum.data.tanzu.vmware.com/v1   true         GreenplumCommandCenter
greenplumrestores          gprestore    greenplum.data.tanzu.vmware.com/v1   true         GreenplumRestore
greenplumversions                       greenplum.data.tanzu.vmware.com/v1   false        GreenplumVersion

Creating a Greenplum Cluster

Once the GP Operator is installed, create a Greenplum Cluster. First, create a GreenplumVersion.

cat <<EOF > greenplumversion-7.5.2.yaml
---
apiVersion: greenplum.data.tanzu.vmware.com/v1
kind: GreenplumVersion
metadata:
  name: greenplumversion-7.5.2
spec:
  dbVersion: 7.5.2
  image: ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-instance:7.5.2
  operatorVersion: 1.0.0
  extensions:
  - name: postgis
    version: 3.3.2
  - name: greenplum_backup_restore
    version: 1.31.0
  gpcc:
    version: 7.5.0
    image: ${DESTINATION_REPOSITORY}/greenplum/gp-operator/gp-command-center:7.5.0
---
EOF

kubectl apply -f greenplumversion-7.5.2.yaml

Create the Namespace for the Greenplum Cluster, along with a Secret for pulling the images.

kubectl create namespace demo
kubectl create secret docker-registry gp-operator-registry-secret \
    --docker-server=${REGISTRY_HOSTNAME} \
    --docker-username=${REGISTRY_USERNAME} \
    --docker-password="${REGISTRY_PASSWORD}" \
    -n demo

Create the Greenplum Cluster manifest. This time the cluster has one Coordinator node and two Segment nodes.

cat <<EOF > gp-demo.yaml
---
apiVersion: greenplum.data.tanzu.vmware.com/v1
kind: GreenplumCluster
metadata:
  name: gp-demo
  namespace: demo
spec:
  version: greenplumversion-7.5.2
  imagePullSecrets:
  - gp-operator-registry-secret
  coordinator:
    storageClassName: microk8s-hostpath
    service:
      type: LoadBalancer
    storage: 1Gi
  global:
    gucSettings:
    - key: wal_level
      value: logical
  segments:
    count: 2
    storageClassName: microk8s-hostpath
    storage: 10Gi
---
EOF

kubectl apply -f gp-demo.yaml

After a while, the following resources are created.

$ kubectl get gp,sts,pod,svc,pvc -n demo -owide 
NAME                                                       STATUS    AGE
greenplumcluster.greenplum.data.tanzu.vmware.com/gp-demo   Running   7m58s

NAME                                   READY   AGE     CONTAINERS   IMAGES
statefulset.apps/gp-demo-coordinator   1/1     7m6s    instance     my-private-registry.example.com/greenplum/gp-operator/gp-instance:7.5.2
statefulset.apps/gp-demo-segment       2/2     6m56s   instance     my-private-registry.example.com/greenplum/gp-operator/gp-instance:7.5.2

NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
pod/gp-demo-coordinator-0   1/1     Running   0          7m6s    10.1.173.177   cherry   <none>           <none>
pod/gp-demo-segment-0       1/1     Running   0          6m56s   10.1.173.178   cherry   <none>           <none>
pod/gp-demo-segment-1       1/1     Running   0          6m43s   10.1.42.171    banana   <none>           <none>

NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE     SELECTOR
service/gp-demo-headless-svc   ClusterIP      None             <none>           <none>           7m7s    cluster-name=gp-demo,cluster-namespace=demo
service/gp-demo-svc            LoadBalancer   10.152.183.135   192.168.11.241   5432:30917/TCP   3m45s   cluster-name=gp-demo,cluster-namespace=demo,type=coordinator

NAME                                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        VOLUMEATTRIBUTESCLASS   AGE     VOLUMEMODE
persistentvolumeclaim/state-gp-demo-coordinator-0   Bound    pvc-790a5953-33e1-4f7e-875f-7fba41386016   1Gi        RWO            microk8s-hostpath   <unset>                 7m6s    Filesystem
persistentvolumeclaim/state-gp-demo-segment-0       Bound    pvc-2c114171-13b9-4dd6-b6c7-037018f52843   10Gi       RWO            microk8s-hostpath   <unset>                 6m56s   Filesystem
persistentvolumeclaim/state-gp-demo-segment-1       Bound    pvc-1a458596-9e4d-408d-a200-f16fb643c292   10Gi       RWO            microk8s-hostpath   <unset>                 6m43s   Filesystem
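
Instead of re-running kubectl get by hand until everything is up, a small polling loop can do the waiting. The following is a generic sketch: wait_until is a hypothetical helper that retries any command until it succeeds or a timeout passes. Retrying is useful here because a check can fail right away while the operator has not created the Pods yet.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds. Returns 0 on success,
# or 1 once roughly TIMEOUT_SECONDS have been spent sleeping.
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example usage (commented out; the Pod name is taken from the output above):
# wait_until 600 10 kubectl wait --for=condition=Ready pod/gp-demo-coordinator-0 -n demo --timeout=5s
```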

Verifying the cluster

Connect to the LoadBalancer's EXTERNAL-IP on port 5432. The password for the gpadmin user is stored in a Secret.

$ kubectl get secret -n demo gp-demo-creds -ojson | jq '.data | map_values(@base64d)'
{
  "gpadmin": "8mOJ2tYC9e4AIu"
}

Connect with the following command.

$ psql postgresql://gpadmin:8mOJ2tYC9e4AIu@192.168.11.241:5432/postgres
psql (17.0, server 12.22)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off, ALPN: none)
Type "help" for help.

postgres=# 
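
The generated password above happens to be alphanumeric, so it can be pasted into the connection URI as-is. If a password ever contains characters such as @, : or /, it has to be percent-encoded inside a URI. The sketch below shows one way to do that in plain shell. urlencode and pg_uri are hypothetical helpers, and the encoder assumes ASCII input.

```shell
# urlencode STRING
# Percent-encodes every character except unreserved ones (A-Z a-z 0-9 . ~ _ -).
# Assumes ASCII input.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# pg_uri USER PASSWORD HOST PORT DATABASE
# Prints a postgresql:// connection URI with the password percent-encoded.
pg_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$(urlencode "$2")" "$3" "$4" "$5"
}

# Example usage (commented out; secret name and key are the ones shown above):
# PASSWORD=$(kubectl get secret -n demo gp-demo-creds -o jsonpath='{.data.gpadmin}' | base64 -d)
# psql "$(pg_uri gpadmin "$PASSWORD" 192.168.11.241 5432 postgres)"
```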

The initial state looks like this.

postgres=# \l
                                               List of databases
   Name    |  Owner  | Encoding | Locale Provider | Collate | Ctype | Locale | ICU Rules |  Access privileges  
-----------+---------+----------+-----------------+---------+-------+--------+-----------+---------------------
 postgres  | gpadmin | UTF8     | libc            | C       | C     |        |           | 
 template0 | gpadmin | UTF8     | libc            | C       | C     |        |           | =c/gpadmin         +
           |         |          |                 |         |       |        |           | gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | libc            | C       | C     |        |           | =c/gpadmin         +
           |         |          |                 |         |       |        |           | gpadmin=CTc/gpadmin
(3 rows)

The available extensions are listed below. They include pgvector for vector search and h3, the geospatial indexing system used at Uber.

postgres=# SELECT * FROM pg_available_extensions ORDER BY name;
             name             |    default_version    | installed_version |                                                       comment                                                       
------------------------------+-----------------------+-------------------+---------------------------------------------------------------------------------------------------------------------
 address_standardizer         | 3.3.2                 |                   | Used to parse an address into constituent elements. Generally used to support geocoding address normalization step.
 address_standardizer_data_us | 3.3.2                 |                   | Address Standardizer US dataset example
 advanced_password_check      | 1.4                   |                   | Advanced Password Check
 anon                         | 2.1.0                 |                   | Anonymization & Data Masking for PostgreSQL
 btree_gin                    | 1.3                   |                   | support for indexing common datatypes in GIN
 citext                       | 1.6                   |                   | data type for case-insensitive character strings
 dataflow                     | 1.0                   |                   | Extension which provides extra formatters and types for dataflow
 dblink                       | 1.2                   |                   | connect to other PostgreSQL databases from within a database
 diskquota                    | 2.3                   |                   | Disk Quota Main Program
 file_fdw                     | 1.0                   |                   | foreign-data wrapper for flat file access
 fuzzystrmatch                | 1.1                   |                   | determine similarities and distance between strings
 gp_distribution_policy       | 1.0                   |                   | check distribution policy in a GPDB cluster
 gp_exttable_fdw              | 1.0                   | 1.0               | External Table Foreign Data Wrapper for Greenplum
 gp_internal_tools            | 1.0.0                 |                   | Different internal tools for Greenplum
 gp_legacy_string_agg         | 1.0.0                 |                   | Legacy one-argument string_agg implementation for Greenplum
 gp_sparse_vector             | 1.0.1                 |                   | SParse vector implementation for GreenPlum
 gp_toolkit                   | 1.18                  | 1.18              | various GPDB administrative views/functions
 gp_wlm                       | 0.1                   |                   | Greenplum Workload Manager Extension
 gpss                         | 1.0                   |                   | Extension which implements kinds of gpss formaters and protocol buffer
 greenplum_fdw                | 1.1                   |                   | foreign-data wrapper for remote greenplum servers
 h3                           | 4.1.3                 |                   | H3 bindings for PostgreSQL
 h3_postgis                   | 4.1.3                 |                   | H3 PostGIS integration
 hll                          | 2.16                  |                   | type for storing hyperloglog data
 hstore                       | 1.6                   |                   | data type for storing sets of (key, value) pairs
 ip4r                         | 2.4                   |                   | 
 isn                          | 1.2                   |                   | data types for international product numbering standards
 ltree                        | 1.1                   |                   | data type for hierarchical tree-like structures
 metrics_collector            | 1.0                   |                   | Greenplum Metrics Collector Extension
 orafce                       | 4.9.1                 |                   | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
 orafce_ext                   | 1.0                   |                   | 
 pageinspect                  | 1.9                   |                   | inspect the contents of database pages at a low level
 pg_buffercache               | 1.4.1                 |                   | examine the shared buffer cache
 pg_cron                      | 1.6                   |                   | Job scheduler for PostgreSQL
 pg_hint_plan                 | 1.3.9                 |                   | 
 pg_trgm                      | 1.4                   |                   | text similarity measurement and index searching based on trigrams
 pgaudit                      | 7.0                   |                   | provides auditing functionality
 pgcrypto                     | 1.3                   |                   | cryptographic functions
 pgml                         | 2.8.5+greenplum.2.0.0 |                   | pgml:  Created by the PostgresML team
 pgrouting                    | 3.6.2                 |                   | pgRouting Extension
 plperl                       | 1.0                   |                   | PL/Perl procedural language
 plperlu                      | 1.0                   |                   | PL/PerlU untrusted procedural language
 plpgsql                      | 1.0                   | 1.0               | PL/pgSQL procedural language
 plpython3u                   | 1.0                   |                   | PL/Python3U untrusted procedural language
 pointcloud                   | 1.2.5                 |                   | data type for lidar point clouds
 pointcloud_postgis           | 1.2.5                 |                   | integration for pointcloud LIDAR data and PostGIS geometry data
 postgis                      | 3.3.2                 |                   | PostGIS geometry and geography spatial types and functions
 postgis_raster               | 3.3.2                 |                   | PostGIS raster types and functions
 postgis_tiger_geocoder       | 3.3.2                 |                   | PostGIS tiger geocoder and reverse geocoder
 postgres_fdw                 | 1.0                   |                   | foreign-data wrapper for remote PostgreSQL servers
 sslinfo                      | 1.2                   |                   | information about SSL certificates
 tablefunc                    | 1.0                   |                   | functions that manipulate whole tables, including crosstab
 timestamp9                   | 1.3.0                 |                   | timestamp nanosecond resolution
 uuid-ossp                    | 1.1                   |                   | generate universally unique identifiers (UUIDs)
 vector                       | 0.7.0                 |                   | vector data type and ivfflat and hnsw access methods
(54 rows)

Run the following SQL to load some test data.

CREATE TABLE IF NOT EXISTS organization
(
    organization_id   BIGINT PRIMARY KEY,
    organization_name VARCHAR(255) NOT NULL
);
INSERT INTO organization(organization_id, organization_name) VALUES(1, 'foo');
INSERT INTO organization(organization_id, organization_name) VALUES(2, 'bar');

Check the data.

select organization_id,organization_name,gp_segment_id from organization;

By default, rows are distributed across segments by the primary key. The gp_segment_id column shows which segment each row is stored on.

organization_id | organization_name | gp_segment_id 
-----------------+-------------------+---------------
               2 | bar               |             0
               1 | foo               |             1
(2 rows)
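
With only two rows it is hard to see how evenly the data spreads. As a rough sketch, the script below writes a SQL file that declares the distribution key explicitly with DISTRIBUTED BY, loads 1000 rows, and counts the rows per segment. The table name organization_explicit and the file name distribution-check.sql are made up for this check.

```shell
# Write the statements to a file; the quoted 'EOF' stops the shell from
# expanding anything inside the SQL.
cat <<'EOF' > distribution-check.sql
-- organization_explicit is a hypothetical table used only for this check.
DROP TABLE IF EXISTS organization_explicit;
CREATE TABLE organization_explicit
(
    organization_id   BIGINT PRIMARY KEY,
    organization_name VARCHAR(255) NOT NULL
) DISTRIBUTED BY (organization_id);

INSERT INTO organization_explicit
SELECT i, 'org-' || i FROM generate_series(1, 1000) AS i;

-- Row count per segment; a roughly even split suggests no skew.
SELECT gp_segment_id, count(*)
FROM organization_explicit
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
EOF
```

The file can then be run with psql using the same connection URI as before plus -f distribution-check.sql.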

Let's also try vector search with pgvector.

CREATE EXTENSION vector;
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

Query the rows, ordered by distance from [3,1,2].

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

The vector search worked.

 id | embedding 
----+-----------
  1 | [1,2,3]
  2 | [4,5,6]
(2 rows)
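
The <-> operator in pgvector is the Euclidean (L2) distance, so the ordering above can be double-checked by hand. The following sketch recomputes both distances with awk. l2_distance is a hypothetical helper that takes two comma-separated vectors of the same length.

```shell
# l2_distance A B
# Prints the Euclidean distance between two comma-separated vectors,
# rounded to three decimal places.
l2_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, ",")
    split(b, y, ",")
    sum = 0
    for (i = 1; i <= n; i++) sum += (x[i] - y[i]) ^ 2
    printf "%.3f\n", sqrt(sum)
  }'
}

l2_distance 1,2,3 3,1,2   # row 1: sqrt(6),  prints 2.449
l2_distance 4,5,6 3,1,2   # row 2: sqrt(33), prints 5.745
```

Row 1 is closer to the query vector than row 2, which matches the order of the result.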

Creating Greenplum Command Center

Next, create a Greenplum Command Center.

cat <<EOF > gpcc-demo.yaml
apiVersion: greenplum.data.tanzu.vmware.com/v1
kind: GreenplumCommandCenter
metadata:
  name: gpcc-demo
  namespace: demo
spec:
  storageClassName: microk8s-hostpath
  greenplumClusterName: gp-demo
  storage: 2Gi
  service:
    type: LoadBalancer
EOF

kubectl apply -f gpcc-demo.yaml

After a while, the following resources are created.

$ kubectl get gp,sts,pod,svc,secret,pvc -n demo -owide
NAME                                                       STATUS    AGE
greenplumcluster.greenplum.data.tanzu.vmware.com/gp-demo   Running   10m

NAME                                   READY   AGE     CONTAINERS       IMAGES
statefulset.apps/gp-demo-coordinator   1/1     10m     instance         my-private-registry.example.com/greenplum/gp-operator/gp-instance:7.5.2
statefulset.apps/gp-demo-segment       2/2     9m55s   instance         my-private-registry.example.com/greenplum/gp-operator/gp-instance:7.5.2
statefulset.apps/gpcc-demo-cc-app      1/1     4m26s   command-center   my-private-registry.example.com/greenplum/gp-operator/gp-command-center:7.5.0

NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
pod/gp-demo-coordinator-0   1/1     Running   0          10m     10.1.42.175    banana   <none>           <none>
pod/gp-demo-segment-0       1/1     Running   0          9m55s   10.1.173.185   cherry   <none>           <none>
pod/gp-demo-segment-1       1/1     Running   0          9m48s   10.1.42.177    banana   <none>           <none>
pod/gpcc-demo-cc-app-0      1/1     Running   0          4m26s   10.1.42.179    banana   <none>           <none>

NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)                         AGE     SELECTOR
service/gp-demo-headless-svc   ClusterIP      None             <none>           <none>                          10m     cluster-name=gp-demo,cluster-namespace=demo
service/gp-demo-svc            LoadBalancer   10.152.183.204   192.168.11.241   5432:32002/TCP                  8m14s   cluster-name=gp-demo,cluster-namespace=demo,type=coordinator
service/gpcc-demo-cc-svc       LoadBalancer   10.152.183.75    192.168.11.242   8443:32144/TCP,8080:30307/TCP   3m36s   cluster-name=gp-demo,cluster-namespace=demo,type=command-center

NAME                                 TYPE                             DATA   AGE
secret/gemfire-registry-secret       kubernetes.io/dockerconfigjson   1      78d
secret/gp-demo-client-cert-secret    kubernetes.io/tls                3      25h
secret/gp-demo-creds                 Opaque                           1      8m58s
secret/gp-demo-server-cert-secret    kubernetes.io/tls                3      25h
secret/gp-demo-ssh-key               Opaque                           2      10m
secret/gp-operator-registry-secret   kubernetes.io/dockerconfigjson   1      26h
secret/gpcc-demo-cc-creds            Opaque                           1      3m36s

NAME                                                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        VOLUMEATTRIBUTESCLASS   AGE     VOLUMEMODE
persistentvolumeclaim/command-center-gpcc-demo-cc-app-0   Bound    pvc-5be921dd-d668-489c-80ca-8248db0cd4be   2Gi        RWO            microk8s-hostpath   <unset>                 4m26s   Filesystem
persistentvolumeclaim/state-gp-demo-coordinator-0         Bound    pvc-6bc2b6b0-4b13-47c0-ad1f-6c1ca47aada0   1Gi        RWO            microk8s-hostpath   <unset>                 10m     Filesystem
persistentvolumeclaim/state-gp-demo-segment-0             Bound    pvc-7a634b8d-5dd6-4f8d-a81d-bf2ec41c4029   10Gi       RWO            microk8s-hostpath   <unset>                 9m55s   Filesystem
persistentvolumeclaim/state-gp-demo-segment-1             Bound    pvc-436ed5cb-8449-468c-8bb6-f16a64317b81   10Gi       RWO            microk8s-hostpath   <unset>                 9m48s   Filesystem

Connect to the LoadBalancer's EXTERNAL-IP on port 8080.

The password for the gpmon user is stored in a Secret.

$ kubectl get secret -n demo gpcc-demo-cc-creds  -ojson | jq '.data | map_values(@base64d)' 
{
  "gpmon": "6PXN0s7xdfyLp9"
}

It's nice to get monitoring with such a simple setup.

Deleting the resources

kubectl delete -f gpcc-demo.yaml
kubectl delete -f gp-demo.yaml
helm uninstall -n gpdb gp-operator --wait

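I tried out Greenplum on Kubernetes 1.0. Since this is the initial version, there is probably still a lot of room for improvement, but it's good that Greenplum has become easier to try. I'm looking forward to future releases.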
