VMware Greenplum® を OrbStack の Linux Machine 上にインストールしてみます。
Linux MachineはWindows Subsystem for Linux (WSL)のように使える、Virtual Machineより軽量でContainerより寿命の長い仕組みです。
Coordinator 1台 (gp-coordinator
) と Segment 2台 (gp-segment1
, gp-segment2
) の構成を作ります。本記事ではVMware Greenplum® 7.0.0をインストールします。
基本的には インストールガイド にしたがって作業しました。
目次
- Linux Machineの作成
- OSの初期セットアップ
- VMware Greenplum® のダウンロード
- VMware Greenplum®のインストール
- SSHの設定
- VMware Greenplum®のセットアップ
- データベースの作成
- VMware Greenplum®へのアクセス
- Extensionの有効化
Linux Machineの作成
LinuxのDistroには サポートされている Rocky Linux 8を使用します。
次のコマンドで3つのLinux Machineを作成します。検証した環境はApple SiliconのMacだったので amd64
archを明示しました。
orb create -a amd64 rocky:8 gp-coordinator
orb create -a amd64 rocky:8 gp-segment1
orb create -a amd64 rocky:8 gp-segment2
Terminalを3つ開き、次のコマンドでそれぞれのシェルにアクセスします。
orb shell -m gp-coordinator
orb shell -m gp-segment1
orb shell -m gp-segment2
OSの初期セットアップ
各ホスト上で次のコマンドを実行します
sudo yum -y install epel-release
sudo yum -y config-manager --set-enabled powertools
sudo yum -y install apr \
apr-util \
bash \
bzip2 \
curl \
lsof \
bind-utils \
krb5-libs \
libcgroup-tools \
libcurl \
libevent \
libxml2 \
libyaml \
zlib \
openldap \
openssh-clients \
openssh-server \
openssl \
openssl-libs \
sshpass \
perl \
python39 \
readline \
rsync \
R \
sed \
tar \
zip \
apr \
apr-util \
libyaml \
libevent \
java-11-openjdk-devel
echo 2 | sudo update-alternatives --config java
cat <<EOF | sudo tee -a /etc/sysctl.d/99-sysctl.conf > /dev/null
net.ipv4.ip_local_reserved_ports=65330
EOF
sudo sysctl --system
cat <<EOF | sudo tee -a /etc/security/limits.conf > /dev/null
* soft nofile 65536
* hard nofile 65536
EOF
sudo systemctl start sshd
sudo systemctl enable sshd
cat <<EOF | sudo tee /etc/hosts > /dev/null
$(dig +short gp-coordinator.orb.local) gp-coordinator
$(dig +short gp-segment1.orb.local) gp-segment1
$(dig +short gp-segment2.orb.local) gp-segment2
127.0.0.1 localhost
EOF
VMware Greenplum® のダウンロード
VMware Greenplum® のrpmをダウンロードします。ここでは pivnet
CLIを使用します。API Tokenは こちら から取得できます。
rpmファイルはMac上にダウンロードすることで、各ホストに共有されるので、以下のコマンドは gp-coordinator
上で行えば良いです。
cd /Users/${USER}/Downloads
curl -sL https://github.com/pivotal-cf/pivnet-cli/releases/download/v3.0.1/pivnet-linux-amd64-3.0.1 > pivnet
chmod +x pivnet
sudo mv pivnet /usr/local/bin/
pivnet login --api-token=*********************-r
pivnet download-product-files --product-slug='vmware-greenplum' --release-version='7.0.0' --glob='*.rpm'
pivnet download-product-files --product-slug='vmware-greenplum' --release-version='7.0.0' --glob='*.gppkg'
VMware Greenplum®のインストール
gpadmin
ユーザーを作ってrpmをダウンロードしたインストールします。
gp-coordinator
,gp-segment1
, gp-segment2
それぞれで以下を実行します。
sudo groupadd gpadmin
sudo useradd -m gpadmin -g gpadmin
echo gpadmin:Greenplum123 | sudo chpasswd
echo 'gpadmin ALL=(ALL) NOPASSWD: ALL' | sudo EDITOR='tee -a' visudo
sudo yum -y install ./greenplum-db-7.0.0-el8-x86_64.rpm
sudo chown -R gpadmin:gpadmin /usr/local/greenplum*
sudo chgrp -R gpadmin /usr/local/greenplum*
Greenplum用の環境変数などが設定されたスクリプトを .bashrc
で読み込むようにします。
cat <<EOF | sudo su - gpadmin bash -c 'tee -a /home/gpadmin/.bashrc'
source /usr/local/greenplum-db/greenplum_path.sh
EOF
SSHの設定
各ノード間をパスワードレスでssh通信できるようにする必要があります。
まずは各ノードで次のコマンドを実行して鍵を生成します。
sudo su - gpadmin bash -c 'ssh-keygen -m PEM -t rsa -b 4096 -q -N "" -f /home/gpadmin/.ssh/id_rsa'
以下の作業は gp-coordinator
上でのみで行います。gpadmin
ユーザーで作業します。
gp-coordinator
の公開鍵を各ホストの /home/gpadmin/.ssh/authorized_keys
に追記します。
sudo su - gpadmin
SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-coordinator
SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-segment1
SSHPASS=Greenplum123 sshpass -e ssh-copy-id -o StrictHostKeyChecking=no gp-segment2
次のコマンドで各ホストの公開鍵を各ホストの /home/gpadmin/.ssh/known_hosts
に追記します。
cat <<EOF > hostfile_exkeys
gp-coordinator
gp-segment1
gp-segment2
EOF
gpssh-exkeys -f hostfile_exkeys
VMware Greenplum®のセットアップ
以下の作業は gp-coordinator
上でのみで行います。引き続き gpadmin
ユーザーで作業します。
次のコマンドで各ホスト用のディレクトリの作成をします。
sudo mkdir -p /data/coordinator
sudo chown gpadmin:gpadmin /data/coordinator
cat <<EOF > hostfile_gpssh_segonly
gp-segment1
gp-segment2
EOF
gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/mirror'
gpssh -f hostfile_gpssh_segonly -e 'sudo chown -R gpadmin /data/*'
次のコマンドでVMware Greenplum®を初期化します。
mkdir -p gpconfigs
cat <<EOF > gpconfigs/hostfile_gpinitsystem
gp-segment1
gp-segment2
EOF
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
sed -i.bak \
-e 's|/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary|/data/primary|' \
-e 's|/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror|/data/mirror|' \
-e 's/#MIRROR_PORT_BASE/MIRROR_PORT_BASE/' \
-e 's/#declare -a MIRROR_DATA_DIRECTORY/declare -a MIRROR_DATA_DIRECTORY/' \
-e 's/COORDINATOR_HOSTNAME=cdw/COORDINATOR_HOSTNAME=gp-coordinator/' \
gpconfigs/gpinitsystem_config
gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
次のようなログが出力され、確認が求められます。
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20231002:16:20:47:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-COORDINATOR_MAX_CONNECT not set, will set to default value 250
20231002:16:20:48:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking configuration parameters, Completed
20231002:16:20:48:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
..
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Configuring build for standard array
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20231002:16:20:49:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20231002:16:20:53:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building group mirror array type , please wait...
..
20231002:16:20:56:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking Coordinator host
20231002:16:20:56:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking new segment hosts, please wait...
....
20231002:16:21:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Checking new segment hosts, Completed
20231002:16:21:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator Configuration
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator hostname = gp-coordinator
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator port = 5432
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator instance dir = /data/coordinator/gpseg-1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator LOCALE =
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum segment prefix = gpseg
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator Database =
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator connections = 250
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator buffers = 128000kB
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Segment connections = 750
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Segment buffers = 128000kB
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Encoding = UNICODE
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Postgres param file = Off
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Initdb to be used = /usr/local/greenplum-db-7.0.0/bin/initdb
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-GP_LIBRARY_PATH is = /usr/local/greenplum-db-7.0.0/lib
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-HEAP_CHECKSUM is = on
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-HBA_HOSTNAMES is = 0
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Ulimit check = Passed
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Array host connect type = Single hostname per node
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [1] = ::1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [2] = 198.19.249.237
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [3] = fd07:b51a:cc66:0:9874:8aff:fe3b:956d
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Coordinator IP address [4] = fe80::9874:8aff:fe3b:956d
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Standby Coordinator = Not Configured
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number of primary segments = 1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total Database segments = 2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Trusted shell = ssh
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number segment hosts = 2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirror port base = 7000
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Number of mirror segments = 1
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirroring config = ON
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Mirroring type = Group
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:----------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:----------------------------------------
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment1 6000 gp-segment1 /data/primary/gpseg0 2
20231002:16:21:11:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment2 6000 gp-segment2 /data/primary/gpseg1 3
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:---------------------------------------
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment2 7000 gp-segment2 /data/mirror/gpseg0 4
20231002:16:21:12:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-gp-segment1 7000 gp-segment1 /data/mirror/gpseg1 5
Continue with Greenplum creation Yy|Nn (default=N):
>
y
を入力すると、次のようなログが出力されます。
20231002:16:21:13:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Building the Coordinator instance database, please wait...
20231002:16:21:16:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Starting the Coordinator in admin mode
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Spawning parallel processes batch [1], please wait...
..
20231002:16:21:19:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.........................
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Parallel process exit status
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as completed = 2
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as killed = 0
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as failed = 0
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Removing back out file
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-No errors generated from parallel processes
20231002:16:21:46:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/coordinator/gpseg-1
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Gathering information and validating the environment...
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 7.0.0 build commit:0a7a3566873325aca1789ae6f818c80f17a9402d'
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Commencing Coordinator instance shutdown with mode='smart'
20231002:16:21:46:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Coordinator segment instance directory=/data/coordinator/gpseg-1
20231002:16:21:48:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Stopping coordinator segment and waiting for user connections to finish ...
server shutting down
20231002:16:21:49:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Attempting forceful termination of any leftover coordinator process
20231002:16:21:49:039387 gpstop:gp-coordinator:gpadmin-[INFO]:-Terminating processes for segment /data/coordinator/gpseg-1
20231002:16:21:54:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/coordinator/gpseg-1
20231002:16:21:54:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Gathering information and validating the environment...
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 7.0.0 build commit:0a7a3566873325aca1789ae6f818c80f17a9402d'
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Greenplum Catalog Version: '302307241'
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Setting new coordinator era
20231002:16:21:55:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Coordinator Started...
20231002:16:21:56:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Shutting down coordinator
20231002:16:22:03:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
.
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Process results...
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:- Successful segment starts = 2
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:- Failed segment starts = 0
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-----------------------------------------------------
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Starting Coordinator instance gp-coordinator directory /data/coordinator/gpseg-1
20231002:16:22:04:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=ab03d3b04e6ea60f_231002162155 $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=dispatch " start
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Command pg_ctl reports Coordinator gp-coordinator instance active
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Connecting to db template1 on host localhost
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-No standby coordinator configured. skipping...
20231002:16:22:05:039769 gpstart:gp-coordinator:gpadmin-[INFO]:-Database successfully started
20231002:16:22:05:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20231002:16:22:05:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Creating core GPDB extensions
20231002:16:22:06:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Importing system collations
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Commencing parallel build of mirror segment instances
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Spawning parallel processes batch [1], please wait...
..
20231002:16:22:10:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
..........
20231002:16:22:20:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Parallel process exit status
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as completed = 2
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as killed = 0
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Total processes marked as failed = 0
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Log file scan check passed
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Greenplum Database instance successfully created
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------
20231002:16:22:21:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To complete the environment configuration, please
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-2. Add "export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1"
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:- to access the Greenplum scripts for this instance:
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:- or, use -d /data/coordinator/gpseg-1 option for the Greenplum scripts
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:- Example gpstate -d /data/coordinator/gpseg-1
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20231002.log
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-To initialize a Standby Coordinator Segment for this Greenplum instance
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Review options for gpinitstandby
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-The Coordinator /data/coordinator/gpseg-1/pg_hba.conf post gpinitsystem
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-new array must be explicitly added to this file
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-7.0.0/docs directory
20231002:16:22:22:036255 gpinitsystem:gp-coordinator:gpadmin-[INFO]:-------------------------------------------------------
Greenplum Database instance successfully created
と出力されたのでセットアップが成功したように見えます。
.barshrc
に次の環境変数を追加します。
cat <<EOF | tee -a /home/gpadmin/.bashrc > /dev/null
export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpadmin
export LD_PRELOAD=/lib64/libz.so.1 ps
EOF
source /home/gpadmin/.bashrc
データベースの作成
データベースの作成はPostgreSQLと基本的に同じです。以下の作業は gp-coordinator
上でのみで行います。引き続き gpadmin
ユーザーで作業します。
createdb test
VMware Greenplum®へのアクセス
VMware Greenplum®へのアクセスもPostgreSQLと基本的に同じです。以下の作業は gp-coordinator
上でのみで行います。引き続き gpadmin
ユーザーで作業します。
$ psql -d test
psql (12.12)
Type "help" for help.
次のSQLを実行します。
CREATE TABLE IF NOT EXISTS organization
(
organization_id BIGINT PRIMARY KEY,
organization_name VARCHAR(255) NOT NULL
);
INSERT INTO organization(organization_id, organization_name) VALUES(1, 'foo');
INSERT INTO organization(organization_id, organization_name) VALUES(2, 'bar');
デフォルではPrimary Keyでデータが分散されるようです。gp_segment_id
というカラムでデータがどのセグメントに配置されているか確認することができます。
test=# select organization_id,organization_name,gp_segment_id from organization;
organization_id | organization_name | gp_segment_id
-----------------+-------------------+---------------
2 | bar | 0
1 | foo | 1
(2 rows)
とりあえず、ここまで確認できました。
Extensionの有効化
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/install_guide-install_modules.html
いくつかのExtensionを有効化します。
uuid-ossp
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-modules-uuid-ossp.html
test=# CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION
test=# select uuid_generate_v4();
uuid_generate_v4
--------------------------------------
230d1dd3-aca8-4140-b2fe-83b32cacf954
(1 row)
問題なく有効化できました。
pgvector
以下のエラーが発生しました。
test=# CREATE EXTENSION vector;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
/data/coordinator/gpseg-1/log
ディレクトリ上のログをみると、次のようなメッセージが出力されていました。
2023-10-02 10:25:01.069654 JST,"gpadmin","test",p41446,th-108324736,"[local]",,2023-10-02 10:22:11 JST,0,con29,cmd10,seg-1,,,,sx1,"LOG","00000","statement: CREATE EXTENSION vector;",,,,,,,0,,"postgres.c",1729,
2023-10-02 10:25:01.450907 JST,,,p41446,th0,,,2023-10-02 10:22:11 JST,0,con29,cmd74,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGILL",,,,,,,0,,,,"1 0x7fffffbbbcf0 libpthread.so.0 <symbol not found> + 0xffbbbcf0
2 0x755779 postgres GetIndexAmRoutineByAmId (discriminator 5)
3 0x8d1b4c postgres DefineOpClass (discriminator 5)
4 0xb60f84 postgres <symbol not found> (utility.c:2169)
5 0xb5fe70 postgres standard_ProcessUtility (utility.c:1124)
6 0x7fffec69a3a4 metrics_collector.so AdaptorProcessUtility + 0x84
7 0x7fffec69a741 metrics_collector.so EmitUtilityQueryInfo + 0x396
8 0xb60c1c postgres ProcessUtility (utility.c:397)
9 0x8b9ec2 postgres execute_sql_string (discriminator 5)
10 0x8ba366 postgres <symbol not found> (extension.c:1017)
11 0x8bae35 postgres <symbol not found> (extension.c:1674)
12 0x8bbd2a postgres CreateExtension (extension.c:1888)
13 0xb619b3 postgres <symbol not found> (utility.c:1842)
14 0xb5fe70 postgres standard_ProcessUtility (utility.c:1124)
15 0x7fffec69a3a4 metrics_collector.so AdaptorProcessUtility + 0x84
16 0x7fffec69a741 metrics_collector.so EmitUtilityQueryInfo + 0x396
17 0xb60c1c postgres ProcessUtility (utility.c:397)
18 0xb5d3af postgres <symbol not found> (palloc.h:172)
19 0xb5d9c7 postgres <symbol not found> (pquery.c:1557)
20 0xb5ecb0 postgres PortalRun (pquery.c:1015)
21 0xb59a8a postgres <symbol not found> (postgres.c:1929)
vSphere上で作成したRocky VM上で同じ手順を試すと問題なくExtensionが作成できたので、 おそらく、OrbStackのx86 emulation側の問題だと思われます。
postgresml
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-modules-postgresml.html
gppkgファイルはpivnet
コマンドでrpmとともにダウンロード済みです。
cd <Path to gppkg file>
gppkg install -a DataSciencePython3.9-2.0.0-gp7-el8_x86_64.gppkg
createdb gpadmin
gpconfig -c shared_preload_libraries -v 'pgml'
Greenplumを再起動します。
gpstop -r -a
gpconfig -c pgml.venv -v '$GPHOME/ext/DataSciencePython3.9'
次のエラーが出る場合は、
gpstop -r
が漏れています。gpconfig:gp-coordinator:gpadmin-[CRITICAL]:-not a valid GUC: pgml.venv not a valid GUC: pgml.venv
設定をリロードします。
gpstop -u
test=# CREATE EXTENSION pgml;
INFO: Python version: 3.9.16 (main, Jul 3 2023, 20:07:32)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
INFO: Scikit-learn 1.1.2, XGBoost 1.6.2, LightGBM 4.0.0, NumPy 1.22.1
CREATE EXTENSION
次のエラーが出る場合は、
gpstop -u
が漏れています。test=# CREATE EXTENSION IF NOT EXISTS pgml; INFO: Python version: 3.9.16 (main, Jul 3 2023, 20:07:32) [GCC 8.5.0 20210514 (Red Hat 8.5.0-18)] ERROR: The xgboost package is missing. Install it with `sudo pip3 install xgboost` ModuleNotFoundError: No module named 'xgboost' (api.rs:36)
PostgresMLのtransformを試します。
SELECT pgml.transform(
'question-answering',
inputs => ARRAY[
'{
"question": "What does the customer want?",
"context": "Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee."}'
]
) AS answer;
またもや次のようなエラーが出ました。
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
/data/coordinator/gpseg-1/log
ディレクトリ上のログをみると、次のようなメッセージが出力されていました。
2023-10-02 16:38:16.387253 JST,"gpadmin","test",p42608,th-108324736,"[local]",,2023-10-02 16:38:05 JST,0,con11,cmd10,seg-1,,,,sx1,"LOG","00000","statement: SELECT pgml.transform(
'question-answering',
inputs => ARRAY[
'{
""question"": ""What does the customer want?"",
""context"": ""Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.""}'
]
) AS answer;",,,,,,,0,,"postgres.c",1729,
2023-10-02 16:38:19.586454 JST,,,p42608,th0,,,2023-10-02 16:38:05 JST,0,con11,cmd10,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGILL",,,,,,,0,,,,"1 0x7fffff9a1cf0 libpthread.so.0 <symbol not found> + 0xff9a1cf0
"
2023-10-02 16:38:19.586554 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","server process (PID 42608) was terminated by signal 4: Illegal instruction","Failed process was running: SELECT pgml.transform(
'question-answering',
inputs => ARRAY[
'{
""question"": ""What does the customer want?"",
""context"": ""Dear Amazon, last week I ordered an Optimus Prime action figure from your online store in Germany. Unfortunately, when I opened the package, I discovered to my horror that I had been sent an action figure of Megatron instead! As a lifelong enemy of the Decepticons, I hope you can understand my dilemma. To resolve the issue, I demand an exchange of Megatron for the Optimus Prime figure I ordered. Enclosed are copies of my records concerning this purchase. I expect to hear from you soon. Sincerely, Bumblebee.""}'
]
) AS answer;",,,,,,0,,"postmaster.c",4224,
2023-10-02 16:38:19.586580 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","terminating any other active server processes",,,,,,,0,,"postmaster.c",3945,
2023-10-02 16:38:19.586585 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: received signal 3",,,,,,,0,,"ic_proxy_main.c",485,
2023-10-02 16:38:19.586589 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: server closing",,,,,,,0,,"ic_proxy_main.c",585,
2023-10-02 16:38:19.586593 JST,,,p42490,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","ic-proxy: server closed with code 1",,,,,,,0,,"ic_proxy_main.c",601,
2023-10-02 16:38:19.586599 JST,"gpadmin","test",p42650,th-108324736,"[local]",,2023-10-02 16:38:19 JST,0,,,seg-1,,,,,"WARNING","57P02","terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,0,,"postgres.c",3506,
2023-10-02 16:38:19.586608 JST,,,p42483,th-108324736,,,,0,con1,,seg-1,,,,,"WARNING","57P02","terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,0,,"postgres.c",3506,
2023-10-02 16:38:19.586626 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","background worker ""ic proxy process"" (PID 42490) exited with exit code 1",,,,,,,0,,"postmaster.c",4202,
2023-10-02 16:38:19.586635 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","all server processes terminated; reinitializing",,,,,,,0,,"postmaster.c",4488,
2023-10-02 16:38:19.618380 JST,,,p42478,th-108324736,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
addr2line: Relink `/lib64/libbfd-2.30-119.el8.so' with `/lib64/libz.so.1' for IFUNC symbol `crc32_z'
",,,,,,,,"SysLoggerMain","syslogger.c",719,
2023-10-02 16:38:19.693479 JST,,,p42477,th-108324736,,,,0,,,seg-1,,,,,"FATAL","XX000","Can't attach, lock is not in an empty state: PgLwLockInner",,,,,,,0,,"lwlock.rs",82,"Stack trace:
1 0xcdd4bc postgres errstart (elog.c:494)
2 0x7fffec2bc667 pgml.so <symbol not found> + 0xec2bc667
3 0x7fffec2181ab pgml.so <symbol not found> + 0xec2181ab
4 0xb2a924 postgres CreateSharedMemoryAndSemaphores (ipci.c:405)
5 0xab9810 postgres <symbol not found> (discriminator 5)
6 0xaba145 postgres <symbol not found> (postmaster.c:3728)
7 0x7fffff9a1cf0 libpthread.so.0 <symbol not found> + 0xff9a1cf0
8 0x7ffffc8f263b libc.so.6 __select + 0x1b
9 0x10dae69 postgres pg_usleep (pgsleep.c:61)
10 0xabbb6b postgres <symbol not found> (postmaster.c:1951)
11 0xabd317 postgres PostmasterMain (postmaster.c:1638)
12 0x6e43d4 postgres main (main.c:173)
13 0x7ffffc806d85 libc.so.6 __libc_start_main + 0xe5
14 0x6effde postgres _start + 0x2e
pgvector同様、vSphere上で作成したRocky VM上で同じ手順を試すと問題なくExtensionが作成できたので、
おそらく、OrbStackのx86 emulation側の問題だと思われます。
残念。OrbStack on Apple Silicon上ではpgvector、postgresmlは検証できないようでした。
vSphere上のRocky Linux上では動作しました。手順はこちらです。