Notes on installing VMware Greenplum® 7 on Rocky Linux on OrbStack Linux Machines

🗃 {Middleware/RDBMS/Greenplum}
🏷 OrbStack 🏷 Greenplum 🏷 Tanzu 🏷 PostgreSQL 🏷 Rocky 
🗓 Updated at 2023-10-02T07:43:17Z  🗓 Created at 2023-09-20T14:06:45Z

⚠️ The content of this article is not supported by VMware. Deal with any problems that arise from following it at your own risk, and do not contact VMware support about them.

Let's install VMware Greenplum® on OrbStack Linux Machines.
A Linux Machine can be used much like Windows Subsystem for Linux (WSL): it is lighter weight than a Virtual Machine and longer lived than a Container.

I will build a configuration with one Coordinator (gp-coordinator) and two Segment hosts (gp-segment1, gp-segment2). This article installs VMware Greenplum® 7.0.0.

I basically followed the installation guide.


Creating the Linux Machines

For the Linux distro, I use Rocky Linux 8, which is a supported platform.

Create three Linux Machines with the following commands. Since I tested on an Apple Silicon Mac, I explicitly specified the amd64 arch.

orb create -a amd64 rocky:8 gp-coordinator
orb create -a amd64 rocky:8 gp-segment1
orb create -a amd64 rocky:8 gp-segment2

Open three terminals and access each machine's shell with the following commands.

orb shell -m gp-coordinator
orb shell -m gp-segment1
orb shell -m gp-segment2

Initial OS setup

Run the following commands on each host.

sudo yum -y install epel-release
sudo yum -y config-manager --set-enabled powertools
sudo yum -y install apr \
apr-util \
bash \
bzip2 \
curl \
lsof \
bind-utils \
krb5-libs \
libcgroup-tools \
libcurl \
libevent \
libxml2 \
libyaml \
zlib \
openldap \
openssh-clients \
openssh-server \
openssl \
openssl-libs \
sshpass \
perl \
python39 \
readline \
rsync \
R \
sed \
tar \
zip \
apr \
apr-util \
libyaml \
libevent \
java-11-openjdk-devel

echo 2 | sudo update-alternatives --config java

cat <<EOF | sudo tee -a /etc/sysctl.d/99-sysctl.conf > /dev/null
net.ipv4.ip_local_reserved_ports=65330
EOF
sudo sysctl --system

cat <<EOF | sudo tee -a /etc/security/limits.conf > /dev/null
* soft nofile 65536
* hard nofile 65536
EOF
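
Optionally, you can verify that the kernel parameter and the file descriptor limit are in effect. The nofile limit only applies to new login sessions, so re-open the orb shell first:

sudo sysctl net.ipv4.ip_local_reserved_ports   # should print 65330
ulimit -n                                      # should print 65536 in a new session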

sudo systemctl start sshd
sudo systemctl enable sshd

cat <<EOF | sudo tee /etc/hosts > /dev/null
$(dig +short gp-coordinator.orb.local) gp-coordinator
$(dig +short gp-segment1.orb.local) gp-segment1
$(dig +short gp-segment2.orb.local) gp-segment2
127.0.0.1 localhost
EOF
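
To make sure the entries resolve as intended, an optional quick check from any of the hosts:

getent hosts gp-coordinator gp-segment1 gp-segment2   # should show the addresses written above
ping -c 1 gp-segment1                                 # and the hosts should be reachable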

Downloading VMware Greenplum®

Download the VMware Greenplum® rpm. Here I use the pivnet CLI; an API token can be obtained from your Tanzu Network profile.

Because files downloaded onto the Mac are shared with every Linux Machine, the rpm only needs to be downloaded once; the following commands can be run on gp-coordinator alone.

cd /Users/${USER}/Downloads

curl -sL https://github.com/pivotal-cf/pivnet-cli/releases/download/v3.0.1/pivnet-linux-amd64-3.0.1 > pivnet
chmod +x pivnet
sudo mv pivnet /usr/local/bin/
pivnet login --api-token=*********************-r
pivnet download-product-files --product-slug='vmware-greenplum' --release-version='7.0.0' --glob='*.rpm'
pivnet download-product-files --product-slug='vmware-greenplum' --release-version='7.0.0' --glob='*.gppkg'

Installing VMware Greenplum®

Create the gpadmin user and install the downloaded rpm.

Run the following on each of gp-coordinator, gp-segment1, and gp-segment2, from the directory the rpm was downloaded to (/Users/${USER}/Downloads).

sudo groupadd gpadmin
sudo useradd -m gpadmin -g gpadmin
echo gpadmin:Greenplum123 | sudo chpasswd
echo 'gpadmin ALL=(ALL) NOPASSWD: ALL' | sudo EDITOR='tee -a' visudo

sudo yum -y install ./greenplum-db-7.0.0-el8-x86_64.rpm
sudo chown -R gpadmin:gpadmin /usr/local/greenplum*
sudo chgrp -R gpadmin /usr/local/greenplum*
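
Optionally, confirm that the software landed under /usr/local and that gpadmin now owns it:

ls -ld /usr/local/greenplum*   # directories should be owned by gpadmin:gpadmin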

Have .bashrc load the script that sets up the environment variables for Greenplum.

cat <<EOF | sudo su - gpadmin bash -c 'tee -a /home/gpadmin/.bashrc'
source /usr/local/greenplum-db/greenplum_path.sh
EOF
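
As a quick check, the Greenplum environment should now be visible to the gpadmin user (assuming the default .bash_profile sources .bashrc, as it does on a stock Rocky image):

sudo su - gpadmin -c 'echo $GPHOME'   # should print the Greenplum install directory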

SSH setup

Passwordless ssh must work between all nodes.

First, generate a key pair on each node with the following command.

sudo su - gpadmin bash -c 'ssh-keygen -m PEM -t rsa -b 4096 -q -N "" -f /home/gpadmin/.ssh/id_rsa'

The following steps are performed only on gp-coordinator, working as the gpadmin user.

Append gp-coordinator's public key to /home/gpadmin/.ssh/authorized_keys on each host.

sudo su - gpadmin

SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-coordinator
SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-segment1
SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-segment2

Next, exchange keys with the following commands so that each host's key is registered in every host's /home/gpadmin/.ssh/known_hosts.

cat <<EOF > hostfile_exkeys
gp-coordinator
gp-segment1
gp-segment2
EOF

gpssh-exkeys -f hostfile_exkeys
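
To confirm that passwordless ssh now works in every direction, you can optionally run a trivial command on all hosts via gpssh:

gpssh -f hostfile_exkeys -e 'hostname'   # should print each hostname without prompting for a password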

Setting up VMware Greenplum®

The following steps are performed only on gp-coordinator, still as the gpadmin user.

Create the data directories for each host with the following commands.

sudo mkdir -p /data/coordinator
sudo chown gpadmin:gpadmin /data/coordinator

cat <<EOF > hostfile_gpssh_segonly
gp-segment1
gp-segment2
EOF

gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/mirror'
gpssh -f hostfile_gpssh_segonly -e 'sudo chown -R gpadmin /data/*'

Initialize VMware Greenplum® with the following commands.

mkdir -p gpconfigs
cat <<EOF > gpconfigs/hostfile_gpinitsystem
gp-segment1
gp-segment2
EOF

cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
sed -i.bak \
  -e 's|/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary|/data/primary|' \
  -e 's|/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror|/data/mirror|' \
  -e 's/#MIRROR_PORT_BASE/MIRROR_PORT_BASE/' \
  -e 's/#declare -a MIRROR_DATA_DIRECTORY/declare -a MIRROR_DATA_DIRECTORY/' \
  -e 's/COORDINATOR_HOSTNAME=cdw/COORDINATOR_HOSTNAME=gp-coordinator/' \
  gpconfigs/gpinitsystem_config
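
For reference, after these edits the relevant settings in gpconfigs/gpinitsystem_config should end up roughly as follows. This is only an excerpt based on the values that gpinitsystem reports below; the exact template contents can vary between releases:

SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/data/primary)
COORDINATOR_HOSTNAME=gp-coordinator
COORDINATOR_DIRECTORY=/data/coordinator
COORDINATOR_PORT=5432
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror)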

gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem

Logs like the following are printed, and you are asked to confirm.

20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-COORDINATOR_MAX_CONNECT not set, will set to default value 250
20231002:16:20:48:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking configuration parameters, Completed
20231002:16:20:48:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
..
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Configuring build for standard array
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20231002:16:20:53:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building group mirror array type , please wait...
..
20231002:16:20:56:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking Coordinator host
20231002:16:20:56:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking new segment hosts, please wait...
....
20231002:16:21:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking new segment hosts, Completed
20231002:16:21:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator Configuration
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator hostname       = gp-coordinator
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator port           = 5432
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator instance dir   = /data/coordinator/gpseg-1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator LOCALE         = 
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator Database       = 
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator connections    = 250
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator buffers        = 128000kB
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Segment connections        = 750
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Segment buffers            = 128000kB
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Encoding                   = UNICODE
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Postgres param file        = Off
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db-7.0.0/bin/initdb
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db-7.0.0/lib
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-HEAP_CHECKSUM is           = on
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Ulimit check               = Passed
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [1]      = ::1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [2]      = 198.19.249.237
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [3]      = fd07:b51a:cc66:0:9874:8aff:fe3b:956d
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [4]      = fe80::9874:8aff:fe3b:956d
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Standby Coordinator             = Not Configured
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number of primary segments = 1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total Database segments    = 2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Trusted shell              = ssh
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number segment hosts       = 2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirror port base           = 7000
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number of mirror segments  = 1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirroring config           = ON
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirroring type             = Group
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:----------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:----------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment1 	6000 	gp-segment1 	/data/primary/gpseg0 	2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment2 	6000 	gp-segment2 	/data/primary/gpseg1 	3
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment2 	7000 	gp-segment2 	/data/mirror/gpseg0 	4
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment1 	7000 	gp-segment1 	/data/mirror/gpseg1 	5

Continue with Greenplum creation Yy|Nn (default=N):
> 

Entering y produces logs like the following.

20231002:16:21:13:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building the Coordinator instance database, please wait...
20231002:16:21:16:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Starting the Coordinator in admin mode
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.........................
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Parallel process exit status
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as completed           = 2
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as killed              = 0
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as failed              = 0
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Removing back out file
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-No errors generated from parallel processes
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/coordinator/gpseg-1
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Gathering information and validating the environment...
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 7.0.0 build commit:0a7a3566873325aca1789ae6f818c80f17a9402d'
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Commencing Coordinator instance shutdown with mode='smart'
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Coordinator segment instance directory=/data/coordinator/gpseg-1
20231002:16:21:48:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Stopping coordinator segment and waiting for user connections to finish ...
server shutting down
20231002:16:21:49:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Attempting forceful termination of any leftover coordinator process
20231002:16:21:49:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Terminating processes for segment /data/coordinator/gpseg-1
20231002:16:21:54:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/coordinator/gpseg-1
20231002:16:21:54:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Gathering information and validating the environment...
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 7.0.0 build commit:0a7a3566873325aca1789ae6f818c80f17a9402d'
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Greenplum Catalog Version: '302307241'
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Setting new coordinator era
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Coordinator Started...
20231002:16:21:56:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Shutting down coordinator
20231002:16:22:03:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
.
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Process results...
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances 
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting Coordinator instance gp-coordinator directory /data/coordinator/gpseg-1 
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=ab03d3b04e6ea60f_231002162155 $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=dispatch " start
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Command pg_ctl reports Coordinator gp-coordinator instance active
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Connecting to db template1 on host localhost
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-No standby coordinator configured.  skipping...
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Database successfully started
20231002:16:22:05:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20231002:16:22:05:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Creating core GPDB extensions
20231002:16:22:06:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Importing system collations
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing parallel build of mirror segment instances
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
..........
20231002:16:22:20:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Parallel process exit status
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as completed           = 2
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as killed              = 0
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as failed              = 0
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Log file scan check passed
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Database instance successfully created
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To complete the environment configuration, please 
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-2. Add "export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1"
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-   or, use -d /data/coordinator/gpseg-1 option for the Greenplum scripts
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-   Example gpstate -d /data/coordinator/gpseg-1
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20231002.log
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To initialize a Standby Coordinator Segment for this Greenplum instance
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Review options for gpinitstandby
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-The Coordinator /data/coordinator/gpseg-1/pg_hba.conf post gpinitsystem
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-new array must be explicitly added to this file
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-7.0.0/docs directory
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------

Since "Greenplum Database instance successfully created" was printed, the setup appears to have succeeded.

Add the following environment variables to .bashrc.

cat <<EOF | tee -a /home/gpadmin/.bashrc > /dev/null
export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpadmin
export LD_PRELOAD=/lib64/libz.so.1
EOF

source /home/gpadmin/.bashrc
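
At this point the cluster should be up. Now that COORDINATOR_DATA_DIRECTORY is set, an optional way to check its status is gpstate:

gpstate      # cluster summary; gpstate -s shows per-segment detail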

Creating a database

Creating a database is basically the same as with PostgreSQL. The following steps are performed only on gp-coordinator, still as the gpadmin user.

createdb test

Accessing VMware Greenplum®

Accessing VMware Greenplum® is also basically the same as with PostgreSQL. The following steps are performed only on gp-coordinator, still as the gpadmin user.

$ psql -d test
psql (12.12)
Type "help" for help.

Run the following SQL.

CREATE TABLE IF NOT EXISTS organization
(
    organization_id   BIGINT PRIMARY KEY,
    organization_name VARCHAR(255) NOT NULL
);
INSERT INTO organization(organization_id, organization_name) VALUES(1, 'foo');
INSERT INTO organization(organization_id, organization_name) VALUES(2, 'bar');

By default, data appears to be distributed by the primary key. You can check which segment each row is placed on via the gp_segment_id column.

test=# select organization_id,organization_name,gp_segment_id from organization;
 organization_id | organization_name | gp_segment_id 
-----------------+-------------------+---------------
               2 | bar               |             0
               1 | foo               |             1
(2 rows)
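
Incidentally, if you want to control the distribution explicitly rather than relying on the default, Greenplum's CREATE TABLE accepts a DISTRIBUTED BY clause (as well as DISTRIBUTED RANDOMLY / DISTRIBUTED REPLICATED). A minimal sketch, using a hypothetical table name just for illustration:

CREATE TABLE IF NOT EXISTS organization_copy
(
    organization_id   BIGINT,
    organization_name VARCHAR(255) NOT NULL
) DISTRIBUTED BY (organization_name); -- rows are hashed on organization_name across segments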

That is as far as I have verified for now.

Enabling extensions

https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/install_guide-install_modules.html

Let's enable a few extensions.

uuid-ossp

https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-modules-uuid-ossp.html

test=# CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION
test=# select uuid_generate_v4();
           uuid_generate_v4           
--------------------------------------
 230d1dd3-aca8-4140-b2fe-83b32cacf954
(1 row)

It could be enabled without any problems.

pgvector

https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-modules-pgvector-pgvector.html

The following error occurred.

test=# CREATE EXTENSION vector;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Looking at the logs in the /data/coordinator/gpseg-1/log directory, messages like the following had been written.

2023-10-02 10:25:01.069654 JST,"gpadmin","test",p41446,th-108324736,"[local]",,2023-10-02 10:22:11 JST,0,con29,cmd10,seg-1,,,,sx1,"LOG","00000","statement: CREATE EXTENSION vector;",,,,,,,0,,"postgres.c",1729,
2023-10-02 10:25:01.450907 JST,,,p41446,th0,,,2023-10-02 10:22:11 JST,0,con29,cmd74,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGILL",,,,,,,0,,,,"1    0x7fffffbbbcf0 libpthread.so.0 <symbol not found> + 0xffbbbcf0
2    0x755779 postgres GetIndexAmRoutineByAmId (discriminator 5)
3    0x8d1b4c postgres DefineOpClass (discriminator 5)
4    0xb60f84 postgres <symbol not found> (utility.c:2169)
5    0xb5fe70 postgres standard_ProcessUtility (utility.c:1124)
6    0x7fffec69a3a4 metrics_collector.so AdaptorProcessUtility + 0x84
7    0x7fffec69a741 metrics_collector.so EmitUtilityQueryInfo + 0x396
8    0xb60c1c postgres ProcessUtility (utility.c:397)
9    0x8b9ec2 postgres execute_sql_string (discriminator 5)
10   0x8ba366 postgres <symbol not found> (extension.c:1017)
11   0x8bae35 postgres <symbol not found> (extension.c:1674)
12   0x8bbd2a postgres CreateExtension (extension.c:1888)
13   0xb619b3 postgres <symbol not found> (utility.c:1842)
14   0xb5fe70 postgres standard_ProcessUtility (utility.c:1124)
15   0x7fffec69a3a4 metrics_collector.so AdaptorProcessUtility + 0x84
16   0x7fffec69a741 metrics_collector.so EmitUtilityQueryInfo + 0x396
17   0xb60c1c postgres ProcessUtility (utility.c:397)
18   0xb5d3af postgres <symbol not found> (palloc.h:172)
19   0xb5d9c7 postgres <symbol not found> (pquery.c:1557)
20   0xb5ecb0 postgres PortalRun (pquery.c:1015)
21   0xb59a8a postgres <symbol not found> (postgres.c:1929)

When I tried the same steps on a Rocky VM created on vSphere, the extension could be created without any problems, so this is probably an issue on the OrbStack x86 emulation side.

postgresml

https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-modules-postgresml.html

The gppkg file was already downloaded along with the rpm using the pivnet command.

cd <Path to gppkg file>
gppkg install -a DataSciencePython3.9-2.0.0-gp7-el8_x86_64.gppkg 
createdb gpadmin
gpconfig -c shared_preload_libraries -v 'pgml' 

Restart Greenplum, then point pgml.venv at the Data Science Python environment.

gpstop -r -a
gpconfig -c pgml.venv -v '$GPHOME/ext/DataSciencePython3.9'

If the following error appears, you have not yet run gpstop -r.

gpconfig:gp-coordinator:gpadmin-[CRITICAL]:-not a valid GUC: pgml.venv
not a valid GUC: pgml.venv

Reload the configuration with gpstop -u, then create the extension in psql.

gpstop -u

test=# CREATE EXTENSION pgml;
INFO:  Python version: 3.9.16 (main, Jul  3 2023, 20:07:32) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
INFO:  Scikit-learn 1.1.2, XGBoost 1.6.2, LightGBM 4.0.0, NumPy 1.22.1
CREATE EXTENSION

If the following error appears, you have not yet run gpstop -u.

test=# CREATE EXTENSION IF NOT EXISTS pgml;
INFO:  Python version: 3.9.16 (main, Jul  3 2023, 20:07:32)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
ERROR:  The xgboost package is missing. Install it with `sudo pip3 install xgboost`
ModuleNotFoundError: No module named 'xgboost' (api.rs:36)

Let's try PostgresML's transform function.

SELECT pgml.transform(
    'question-answering',
    inputs => ARRAY[
        '{
            "question": "What does the customer want?",
            "context": "Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee."}'
    ]
) AS answer;

Once again, an error like the following occurred.

server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Looking at the logs in the /data/coordinator/gpseg-1/log directory, messages like the following had been written.

2023-10-02 16:38:16.387253 JST,"gpadmin","test",p42608,th-108324736,"[local]",,2023-10-02 16:38:05 JST,0,con11,cmd10,seg-1,,,,sx1,"LOG","00000","statement: SELECT pgml.transform(
    'question-answering',
    inputs => ARRAY[
        '{
            ""question"": ""What does the customer want?"",
            ""context"": ""Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.""}'
    ]
) AS answer;",,,,,,,0,,"postgres.c",1729,
2023-10-02 16:38:19.586454 JST,,,p42608,th0,,,2023-10-02 16:38:05 JST,0,con11,cmd10,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGILL",,,,,,,0,,,,"1    0x7fffff9a1cf0 libpthread.so.0 <symbol not found> + 0xff9a1cf0
"
2023-10-02 16:38:19.586554 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","server process (PID 42608) was terminated by signal 4: Illegal instruction","Failed process was running: SELECT pgml.transform(
    'question-answering',
    inputs => ARRAY[
        '{
            ""question"": ""What does the customer want?"",
            ""context"": ""Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.""}'
    ]
) AS answer;",,,,,,0,,"postmaster.c",4224,
2023-10-02 16:38:19.586580 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","terminating any other active server processes",,,,,,,0,,"postmaster.c",3945,
2023-10-02 16:38:19.586585 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: received signal 3",,,,,,,0,,"ic_proxy_main.c",485,
2023-10-02 16:38:19.586589 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: server closing",,,,,,,0,,"ic_proxy_main.c",585,
2023-10-02 16:38:19.586593 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: server closed with code 1",,,,,,,0,,"ic_proxy_main.c",601,
2023-10-02 16:38:19.586599 JST,"gpadmin","test",p42650,th-108324736,"[local]",,2023-10-02 16:38:19 JST,0,,,seg-1,,,,,"WARNING","57P02","terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,0,,"postgres.c",3506,
2023-10-02 16:38:19.586608 JST,,,p42483,th-108324736,,,,0,con1,,seg-1,,,,,"WARNING","57P02","terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,0,,"postgres.c",3506,
2023-10-02 16:38:19.586626 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","background worker ""ic proxy process"" (PID 42490) exited with exit code 1",,,,,,,0,,"postmaster.c",4202,
2023-10-02 16:38:19.586635 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","all server processes terminated; reinitializing",,,,,,,0,,"postmaster.c",4488,
2023-10-02 16:38:19.618380 JST,,,p42478,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
addr2line: Relink `/lib64/libbfd-2.30-119.el8.so' with `/lib64/libz.so.1' for IFUNC symbol `crc32_z'
",,,,,,,,"SysLoggerMain","syslogger.c",719,
2023-10-02 16:38:19.693479 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"FATAL","XX000","Can't attach, lock is not in an empty state: PgLwLockInner",,,,,,,0,,"lwlock.rs",82,"Stack trace:
1    0xcdd4bc postgres errstart (elog.c:494)
2    0x7fffec2bc667 pgml.so <symbol not found> + 0xec2bc667
3    0x7fffec2181ab pgml.so <symbol not found> + 0xec2181ab
4    0xb2a924 postgres CreateSharedMemoryAndSemaphores (ipci.c:405)
5    0xab9810 postgres <symbol not found> (discriminator 5)
6    0xaba145 postgres <symbol not found> (postmaster.c:3728)
7    0x7fffff9a1cf0 libpthread.so.0 <symbol not found> + 0xff9a1cf0
8    0x7ffffc8f263b libc.so.6 __select + 0x1b
9    0x10dae69 postgres pg_usleep (pgsleep.c:61)
10   0xabbb6b postgres <symbol not found> (postmaster.c:1951)
11   0xabd317 postgres PostmasterMain (postmaster.c:1638)
12   0x6e43d4 postgres main (main.c:173)
13   0x7ffffc806d85 libc.so.6 __libc_start_main + 0xe5
14   0x6effde postgres _start + 0x2e

As with pgvector, the same steps worked without problems on a Rocky VM created on vSphere, so this too is probably an issue on the OrbStack x86 emulation side.

Unfortunately, it seems pgvector and postgresml cannot be verified on OrbStack on Apple Silicon.
They did work on Rocky Linux on vSphere; those steps are covered in a separate note.

