After deploying
https://github.com/smith1511/hpc/tree/master/slurm-on-centos7.1-hpc2
on ds1v2 and ds2v2 machines, Slurm is not accessible. This may be linked to the following issues seen in the azuredeploy.sh log:
- munge
- missing mpicc
- issues with mounting /dev/sde
I wonder whether the scripts take into account that the master and the nodes deploy asynchronously. One never knows which one finishes first.
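To illustrate the ordering concern, here is a minimal sketch of how a node-side script could wait for the master before continuing. This is not taken from the repository's scripts: `retry_until` is a hypothetical helper, and the hostname `master` in the usage note is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of azuredeploy.sh): run a command repeatedly
# until it succeeds or the attempt budget is used up.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    # Every attempt failed.
    return 1
}
```

A node could then call, for example, `retry_until 30 10 getent hosts master` before steps that depend on the master (assuming the master is resolvable under that name), and abort with a clear message if it returns non-zero.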
2017/03/29 08:47:31 ---errout---
2017/03/29 08:47:31 tor (or optimal
2017/03/29 08:47:31 I/O) size boundary is recommended, or performance may be impacted.
2017/03/29 08:47:31 + createdPartitions=' /dev/sdc1'
2017/03/29 08:47:31 + for disk in sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm sdn sdo sdp sdq sdr
2017/03/29 08:47:31 + fdisk -l /dev/sdd
2017/03/29 08:47:31 + fdisk /dev/sdd
2017/03/29 08:47:31 Device does not contain a recognized partition table
2017/03/29 08:47:31 Building a new DOS disklabel with disk identifier 0xb7f32f7b.
2017/03/29 08:47:31
2017/03/29 08:47:31 The device presents a logical sector size that is smaller than
2017/03/29 08:47:31 the physical sector size. Aligning to a physical sector (or optimal
2017/03/29 08:47:31 I/O) size boundary is recommended, or performance may be impacted.
2017/03/29 08:47:31 + createdPartitions=' /dev/sdc1 /dev/sdd1'
2017/03/29 08:47:31 + for disk in sdc sdd sde sdf sdg sdh sdi sdj sdk sdl sdm sdn sdo sdp sdq sdr
2017/03/29 08:47:31 + fdisk -l /dev/sde
2017/03/29 08:47:31 fdisk: cannot open /dev/sde: No such file or directory
2017/03/29 08:47:31 + break
2017/03/29 08:47:31 + '[' -n ' /dev/sdc1 /dev/sdd1' ']'
2017/03/29 08:47:31 ++ echo /dev/sdc1 /dev/sdd1
2017/03/29 08:47:31 ++ wc -w
2017/03/29 08:47:31 + devices=2
2017/03/29 08:47:31 + mdadm --create /dev/md10 --level 0 --raid-devices 2 /dev/sdc1 /dev/sdd1
2017/03/29 08:47:31 mdadm: Defaulting to version 1.2 metadata
2017/03/29 08:47:31 mdadm: array /dev/md10 started.
2017/03/29 08:47:31 + '[' ext4 == xfs ']'
2017/03/29 08:47:31 + mkfs -t ext4 /dev/md10
2017/03/29 08:47:31 mke2fs 1.42.9 (28-Dec-2013)
2017/03/29 08:47:31 + echo '/dev/md10 /data/beegfs/meta ext4 defaults,nofail 0 2'
2017/03/29 08:47:31 + mount /dev/md10
2017/03/29 08:47:31 mount: /etc/fstab: parse error: ignore entry at line 10
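The last log line points at a malformed entry in /etc/fstab. One way to narrow this down is to list entries with an unexpected number of fields. The sketch below is only a rough diagnostic: `check_fstab` is a hypothetical helper, and the 4-to-6 field range is an assumption based on the six documented fstab fields, of which the last two are optional.

```shell
#!/bin/sh
# Hypothetical diagnostic (not part of azuredeploy.sh): print fstab entries
# whose field count looks wrong, prefixed with their line number.
# Returns 1 if any suspicious entry is found, 0 otherwise.
check_fstab() {
    awk '
        # Skip blank lines and comments.
        /^[ \t]*(#|$)/ { next }
        # Assumption: a valid entry has between 4 and 6 fields.
        NF < 4 || NF > 6 { printf "%d: %s\n", NR, $0; bad = 1 }
        END { exit bad }
    ' "$1"
}
```

Running `check_fstab /etc/fstab` on the affected machine should show whether line 10 is the entry written by the `echo` above or an earlier, damaged line.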