Paxata Pipeline/Spark Cluster sizing
Where to size pipeline executor and Spark Cluster
1. Pipeline configuration properties are under /usr/local/paxata/pipeline/config/paxata.properties
a) px.total.cache.capacity: upper limit of the pipeline executor cache size on disk
b) px.executor.memory: amount of memory per executor, used within either a YARN container or a standalone Spark worker
2. Pipeline configuration properties are under /usr/local/paxata/pipeline/config/spark.properties
a) yarn.num.executors: number of YARN Node Managers used as pipeline executors
b) yarn.executor.cores: number of executor cores used within each YARN container
3. Spark Standalone Sizing is defined in $SPARK_HOME/conf/spark-env.sh
a) SPARK_WORKER_CORES: number of cores for each standalone Spark worker
b) SPARK_WORKER_MEMORY: amount of memory for each standalone Spark worker
4. Spark on YARN Container sizing is defined in yarn-site.xml
a) yarn.nodemanager.resource.cpu-vcores: number of cores available to YARN containers on each Node Manager
b) yarn.nodemanager.resource.memory-mb: amount of memory (in MB) available to YARN containers on each Node Manager
How to size pipeline executor and Spark Cluster properly
For Spark on YARN, if the system has 8 available cores, the pipeline executor cores (2b) can be set as high as 7, because the Application Master (AM) container, which can live on any YARN Node Manager, requires 1 core. Setting the executor cores to 8 means that one of the YARN Node Managers must be used for the AM container only, wasting the other 7 cores on that Node Manager. For standalone Spark, the number of executor cores is not configurable and is always the same as the Spark worker cores (3a).
For Spark on YARN, YARN spins up a container to launch each executor, which takes additional system memory. If your available system memory is 64g, the maximum YARN container memory (4b) can be set to 64g * 0.8 ≈ 51g, and the pipeline executor memory (1b) can then be set as high as 51g * 0.65 ≈ 33g, because sufficient off-heap memory must be left for the executor to operate. For standalone Spark, the executor memory (1b) can be set to the same value as the Spark worker memory (3b).
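The memory arithmetic above can be sketched in Python as a quick sanity check. The function name and the rounding down to whole gigabytes are assumptions made for illustration; only the 0.8 and 0.65 factors come from the text.

```python
def yarn_memory_plan(system_memory_gb):
    """Return (YARN container memory, pipeline executor memory) in whole GB."""
    # 80% of system memory goes to the YARN container (4b).
    # Integer math rounds down, so 64 GB gives 51 GB.
    container_gb = system_memory_gb * 80 // 100
    # 65% of the container goes to the executor (1b); the rest is
    # left as off-heap headroom for the executor.
    executor_gb = container_gb * 65 // 100
    return container_gb, executor_gb


print(yarn_memory_plan(64))  # (51, 33)
```

The two values map to yarn.nodemanager.resource.memory-mb (after converting to MB) and px.executor.memory respectively.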
For each executor core, we recommend a minimum of 5g of pipeline executor memory. If the pipeline executor memory (1b) is currently set to 33g, the pipeline executor cores (2b) can be set no higher than 6 (33g / 5g = 6.6, rounded down).
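For Spark on YARN, the two limits on executor cores (one core reserved for the AM container, and at least 5g of executor memory per core) can be combined by taking the smaller of the two. The following is a minimal sketch; the function and parameter names are hypothetical and are not Paxata settings.

```python
def max_executor_cores(available_cores, executor_memory_gb, gb_per_core=5):
    """Upper bound for yarn.executor.cores (2b) on Spark on YARN."""
    # One core is left for the Application Master container.
    limit_by_cpu = available_cores - 1
    # Each executor core should have at least gb_per_core of executor memory.
    limit_by_memory = executor_memory_gb // gb_per_core
    # The stricter limit wins; never return a negative count.
    return max(0, min(limit_by_cpu, limit_by_memory))


print(max_executor_cores(8, 33))  # 6
```

With 8 cores and 33g the memory rule is the binding one; with more executor memory, the CPU rule caps the result at 7.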
Core-to-Disk IOPS ratio
For each executor core, Paxata requires 1500 random write IOPS on the disk used for the pipeline cache. If the pipeline executor cores (2b) is set to 6, the minimum random read/write IOPS on the cache disk (1a) is 6 * 1500 = 9000. To check for I/O wait, run the "top" command on your Spark worker during a project run and check the CPU wa %. Anything above 0% means that disk write IOPS is not sufficiently high. Here is the definition of wa %:
wa - io wait cpu time (or) % CPU time spent in wait (on disk)
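As a rough illustration, the IOPS requirement and the wa check can be scripted. This sketch assumes the CPU summary line has the usual top layout (a number followed by "wa"); the sample line is made up for the example and is not a real measurement, and the function names are hypothetical.

```python
import re

IOPS_PER_CORE = 1500  # random write IOPS per executor core, from the text


def required_cache_iops(executor_cores):
    """Minimum cache-disk IOPS for the given number of executor cores."""
    return executor_cores * IOPS_PER_CORE


def io_wait_percent(top_cpu_line):
    """Extract the wa value from a top CPU summary line, or None if absent."""
    # Accepts both "2.5 wa" and the older "2.5%wa" layouts.
    match = re.search(r"(\d+(?:\.\d+)?)\s*%?\s*wa\b", top_cpu_line)
    return float(match.group(1)) if match else None


# Illustrative line only; copy the real one from top on your Spark worker.
sample = "%Cpu(s):  3.1 us,  1.0 sy,  0.0 ni, 93.4 id,  2.5 wa,  0.0 hi,  0.0 si,  0.0 st"

print(required_cache_iops(6))   # 9000
print(io_wait_percent(sample))  # 2.5
```

Any non-zero value returned by io_wait_percent during a project run would, by the rule above, suggest that the cache disk is too slow.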
Random write IOPS can be measured with the fio utility:
sudo fio --directory=/path/to/worker/cache --name fio_test_file --direct=1 --rw=randwrite --bs=16k --size=1G --numjobs=16 --time_based --runtime=180 --group_reporting --norandommap
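To compare the fio result with the per-core requirement automatically, the report can be parsed in Python. This sketch assumes fio was additionally run with --output-format=json and that the report contains a "jobs" list whose entries carry a "write" object with an "iops" field; verify that layout against your fio version. The sample report is trimmed and made up for the example, and the function names are hypothetical.

```python
import json


def fio_write_iops(fio_json_text):
    """Sum the write IOPS over all job entries in a fio JSON report."""
    report = json.loads(fio_json_text)
    return sum(job["write"]["iops"] for job in report["jobs"])


def cache_disk_is_fast_enough(measured_iops, executor_cores, iops_per_core=1500):
    """True if the measured IOPS meets the 1500-per-core requirement."""
    return measured_iops >= executor_cores * iops_per_core


# Made-up, heavily trimmed report; a real one has many more fields.
sample_report = '{"jobs": [{"jobname": "fio_test_file", "write": {"iops": 9500.0}}]}'

measured = fio_write_iops(sample_report)
print(cache_disk_is_fast_enough(measured, 6))  # True
```

With --group_reporting the report normally contains a single aggregated job entry, so the sum is simply that entry's value.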
Paxata requires 10Gbit Ethernet between Paxata hosts. Network throughput between workers can be measured with the iperf3 utility.
Worker1 running on 10.0.2.176, starting a server on port 58921:
sudo iperf3 -s -p 58921
Worker2 connecting to Worker1 on port 58921:
sudo iperf3 -c 10.0.2.176 -i 1 -t 60 -V -p 58921
Cloudera YARN Tuning Spreadsheet: