MPICH安裝設定
MPI是用來進行平行運算的,簡單來說就是使用電腦多顆CPU核心或是多台電腦串連起來進行運算,這裡我們使用的是MPICH進行平行運算。
下載:
透過MPICH官方網站下載:https://www.mpich.org/downloads/
版本為mipch-3.2(stable release)
解壓縮:
$ tar zxvf mpich-3.2.tar.gz
設定安裝路徑:
切換到剛剛解開的資料夾
$ cd mpich-3.2
設定路徑
$ ./configure --prefix=/usr/local/mpich #--prefix=/usr/local/mpich表示安裝目錄為usr/local/mpich,這裡可以自行設定到想要的目錄
當看到Configuration completed時,表示設定完成
編譯和安裝
兩個步驟需要一些時間等待
$ make
$ make install
複製mpich-3.2資料夾內的範例到/usr/local/mpich/
目錄下
$ cp -r examples/ /usr/local/mpich
將mpich安裝目錄複製到lammps-02, lammps-03兩台主機相同目錄(若為單機執行此步驟可略過)
[webber@lammps-01 local]$ scp -r mpich lammps-02:/usr/local/
[webber@lammps-01 local]$ scp -r mpich lammps-03:/usr/local/
環境變數設定
$ vim ~/.bashrc
加入下列文字
export PATH=/usr/local/mpich/bin/:${PATH} #設定/mpich/bin檔案路徑
export LD_LIBRARY_PATH=/usr/local/mpich/lib:${LD_LIBRARY_PATH} #設定/mpich/lib檔案路徑
檢查環境變數是否正確
[webber@lammps-01 ~]$ which mpicc
/usr/local/mpich/bin/mpicc
[webber@lammps-01 ~]$ which mpirun
/usr/local/mpich/bin/mpirun
[webber@lammps-01 ~]$ which mpiexec
/usr/local/mpich/bin/mpiexec
同樣的環境變數設定分別在lammps-02, lammps-03上執行(若為單機執行僅須在lammps-01上直行即可)
單機測試在lammps-01執行
先切到mpich目錄
[webber@lammps-01 ~]$ cd /usr/local/mpich/
使用mpirun指令進行運算,cpi是用來計算圓周率得值,計算結果有時間以及誤差
[webber@lammps-01 mpich]$ mpirun -np 10 ./examples/cpi
Process 7 of 10 is on lammps-01
Process 4 of 10 is on lammps-01
Process 6 of 10 is on lammps-01
Process 5 of 10 is on lammps-01
Process 3 of 10 is on lammps-01
Process 8 of 10 is on lammps-01
Process 9 of 10 is on lammps-01
Process 1 of 10 is on lammps-01
Process 2 of 10 is on lammps-01
Process 0 of 10 is on lammps-01
pi is approximately 3.1415926544231256, Error is 0.0000000008333325
wall clock time = 0.000064
可以發現這10個process都是在lammps-01上執行
多機測試在lammps-01, lammps-02, lammps-03執行
在mpich目錄下新增一個machinefile檔案
[webber@lammps-01 mpich]$ vim manchinefile
編輯manchinefile檔案
每台主機執行process數量設定依照主機上CPU 總共核心數去設定,也可以自行定義想要在該台主機執行之process
lammps-01:24 #在lammps-01上執行24個process
lammps-02:12 #在lammps-02上執行12個process
lammps-03:24 #在lammps-02上執行24個process
使用mpirun進行測試
[webber@lammps-01 mpich]$ mpiexec -n 60 -f manchinefile ./examples/cpi
Process 43 of 60 is on lammps-03
Process 52 of 60 is on lammps-03
Process 44 of 60 is on lammps-03
Process 37 of 60 is on lammps-03
Process 58 of 60 is on lammps-03
Process 49 of 60 is on lammps-03
Process 46 of 60 is on lammps-03
Process 57 of 60 is on lammps-03
Process 48 of 60 is on lammps-03
Process 47 of 60 is on lammps-03
Process 51 of 60 is on lammps-03
Process 53 of 60 is on lammps-03
Process 54 of 60 is on lammps-03
Process 41 of 60 is on lammps-03
Process 40 of 60 is on lammps-03
Process 39 of 60 is on lammps-03
Process 55 of 60 is on lammps-03
Process 38 of 60 is on lammps-03
Process 42 of 60 is on lammps-03
Process 36 of 60 is on lammps-03
Process 56 of 60 is on lammps-03
Process 50 of 60 is on lammps-03
Process 45 of 60 is on lammps-03
Process 24 of 60 is on lammps-02
Process 26 of 60 is on lammps-02
Process 28 of 60 is on lammps-02
Process 29 of 60 is on lammps-02
Process 25 of 60 is on lammps-02
Process 27 of 60 is on lammps-02
Process 33 of 60 is on lammps-02
Process 31 of 60 is on lammps-02
Process 35 of 60 is on lammps-02
Process 30 of 60 is on lammps-02
Process 34 of 60 is on lammps-02
Process 59 of 60 is on lammps-03
Process 32 of 60 is on lammps-02
Process 12 of 60 is on lammps-01
Process 14 of 60 is on lammps-01
Process 17 of 60 is on lammps-01
Process 11 of 60 is on lammps-01
Process 15 of 60 is on lammps-01
Process 2 of 60 is on lammps-01
Process 7 of 60 is on lammps-01
Process 18 of 60 is on lammps-01
Process 23 of 60 is on lammps-01
Process 0 of 60 is on lammps-01
Process 9 of 60 is on lammps-01
Process 10 of 60 is on lammps-01
Process 21 of 60 is on lammps-01
Process 6 of 60 is on lammps-01
Process 19 of 60 is on lammps-01
Process 13 of 60 is on lammps-01
Process 1 of 60 is on lammps-01
Process 4 of 60 is on lammps-01
Process 5 of 60 is on lammps-01
Process 8 of 60 is on lammps-01
Process 3 of 60 is on lammps-01
Process 16 of 60 is on lammps-01
Process 20 of 60 is on lammps-01
Process 22 of 60 is on lammps-01
pi is approximately 3.1415926544231265, Error is 0.0000000008333334
wall clock time = 0.039752
可以看出60個process分布在lammps-01, lammps-02, lammps-03這三台主機上運行