AWP-ODC-FDQ

From SCECpedia

AWP-ODC-FDQ is a version of the wave propagation code AWP-ODC that contains frequency-dependent Q (FDQ) physics modules. Currently we have a GPU version of this code.

PBS Script

maechlin@h2ologin3:~/fdq_awpodc> more fdq_bw.pbs

 #!/bin/bash
 ###
 ### PBS script for submitting FDQ on Blue Waters
 ###
 ### Set the number of nodes
 ### Set the number of PEs per node
 #PBS -l nodes=4:ppn=1:xk
 ###
 ### Set the wallclock time
 ###
 #PBS -l walltime=01:30:00
 ###
 ### Set the job name
 ###
 #PBS -N chino_hills_gpu
 ###
 ### Set the job stdout and stderr
 ###
 #PBS -e $PBS_JOBID.err
 #PBS -o $PBS_JOBID.out
 ###
 ### Set the Queue
 ###
 #PBS -q normal 
 ###
 ### Set the Allocation
 ###
 #PBS -A jmz
 ###
 ### Set the Email (Beginning, End, Abort)
 ###
 #PBS -m bea
 #PBS -M maechlin@usc.edu
 
 ### 
 ### Load specific modules
 ###
 module swap PrgEnv-cray PrgEnv-gnu
 module load cudatoolkit
 module unload darshan
 
 cd $PBS_O_WORKDIR
 
 now=`date`
 fname="O.$now.tmp"
 echo "STARTING $now" >> "$fname"
 aprun -n 4 -N 1 ./pmcl3d --NX 224 --NY 224 -Z 1024 -x 2 -y 2 \
 --TMAX 20.0 --DH 200.0 --DT 0.01 \
 --MEDIASTART 0 --READ_STEP 91 \
 --NTISKP 10 --WRITE_STEP 10 \
 --FAC 1.0 --Q0 150.0 --EX 0.6 --FP 2.5 \
 --INSRC FAULTPOW \
 --NSKPX 2 --NSKPY 2 \
 >> "$fname" 
 echo "ENDING `date`" >> "$fname"

Current Result

 maechlin@h2ologin1:~/fdq_awpodc> more *.err
 _pmiu_daemon(SIGCHLD): [NID 23046] [c7-0c0s3n0] [Thu Jul 2 14:10:29 2015] PE RANK 0 exit signal Segmentation fault
 [NID 23046] 2015-07-02 14:10:29 Apid 25377581: initiated application termination
 maechlin@h2ologin1:~/fdq_awpodc> more *.out


 Begin Torque Prologue on nid27634 at Thu Jul 2 14:10:24 CDT 2015
 Job Id: 1950564.nid11293
 Username: maechlin
 Group: PRAC_jmz
 Job name: chino_hills_gpu
 Requested resources: neednodes=4:ppn=1:xk,nodes=4:ppn=1:xk,walltime=01:30:00
 Queue: normal
 Account: jmz
 End Torque Prologue: 0.024 elapsed


 maechlin@h2ologin1:~/fdq_awpodc> more *.tmp
 STARTING Thu Jul 2 14:10:26 CDT 2015
 rank=0) RS=91, RSG=91, NST=91, IF=1
 0 = (0,0)) NX,NY,NZ=224,224,1024 nxt,nyt,nzt=112,112,1024 rec_N=(112,112,1) rec_nxt,=56,56,1 NBGX,SKP,END=(1:2:223),(1:2:223),(1:1:1) rec_nbg,ed=(0,110),(0,110),(0,0) disp=0
 rank=1) RS=91, RSG=91, NST=91, IF=1
 1 = (0,1)) NX,NY,NZ=224,224,1024 nxt,nyt,nzt=112,112,1024 rec_N=(112,112,1) rec_nxt,=56,56,1 NBGX,SKP,END=(1:2:223),(1:2:223),(1:1:1) rec_nbg,ed=(0,110),(0,110),(0,0) disp=25088
 rank=3) RS=91, RSG=91, NST=91, IF=1
 3 = (1,1)) NX,NY,NZ=224,224,1024 nxt,nyt,nzt=112,112,1024 rec_N=(112,112,1) rec_nxt,=56,56,1 NBGX,SKP,END=(1:2:223),(1:2:223),(1:1:1) rec_nbg,ed=(0,110),(0,110),(0,0) disp=25312
 rank=2) RS=91, RSG=91, NST=91, IF=1
 2 = (1,0)) NX,NY,NZ=224,224,1024 nxt,nyt,nzt=112,112,1024 rec_N=(112,112,1) rec_nxt,=56,56,1 NBGX,SKP,END=(1:2:223),(1:2:223),(1:1:1) rec_nbg,ed=(0,110),(0,110),(0,0) disp=224
 filetype size (supposedly=rec_nxt*nyt*nzt*WS*4=125440) =125440
 rank=0, x_rank_L=-1, x_rank_R=2, y_rank_F=-1, y_rank_B=1
 Before inisource
 After inisource. Time elapsed (seconds): 0.003165
 rank=0, source rank, npsrc=1
 rank=1, x_rank_L=-1, x_rank_R=3, y_rank_F=0, y_rank_B=-1
 rank=2, x_rank_L=0, x_rank_R=-1, y_rank_F=-1, y_rank_B=3
 rank=3, x_rank_L=1, x_rank_R=-1, y_rank_F=2, y_rank_B=-1
 Before inimesh
 tau: 5.420455e-03,1.111455e+00; 3.223022e+00,1.321751e-01; 9.346188e+00,4.558041e-02; 1.571835e-02,3.832840e-01
 After inimesh. Time elapsed (seconds): 0.045935
 Application 25377581 exit codes: 139
 Application 25377581 exit signals: Killed
 Application 25377581 resources: utime ~0s, stime ~1s, Rss ~414872, inblocks ~789, outblocks ~1277
 ENDING Thu Jul 2 14:10:30 CDT 2015
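
The subgrid and output sizes printed in this log follow from the arguments on the aprun line in the PBS script. The sketch below reproduces them with shell arithmetic. The formulas are inferred from the logged values and from the size hint the code prints (rec_nxt*nyt*nzt*WS*4), not taken from the pmcl3d source. The variable names and the rank_disp helper are illustrative, and WS is assumed to be the WRITE_STEP argument.

```shell
#!/bin/bash
# Values taken from the aprun line in the PBS script.
NX=224; NY=224; NZ=1024      # global grid
PX=2;   PY=2                 # processes in x and y
NSKPX=2; NSKPY=2             # output decimation in x and y
WRITE_STEP=10                # assumed to be "WS" in the logged size formula

# Per-rank subgrid (log: nxt,nyt,nzt=112,112,1024).
nxt=$(( NX / PX )); nyt=$(( NY / PY )); nzt=$NZ

# Decimated output grid: global (log: rec_N=(112,112,1)) and
# per rank (log: rec_nxt,=56,56,1). One z layer, as logged.
rec_NX=$(( NX / NSKPX )); rec_NY=$(( NY / NSKPY ))
rec_nxt=$(( nxt / NSKPX )); rec_nyt=$(( nyt / NSKPY )); rec_nzt=1

# Bytes of 4-byte floats per rank per write (log: filetype size =125440).
filetype_size=$(( rec_nxt * rec_nyt * rec_nzt * WRITE_STEP * 4 ))

# Byte offset of a rank's block within one output layer, inferred from
# the logged disp values. usage: rank_disp <x_coord> <y_coord>
rank_disp() {
  echo $(( ( $1 * rec_nxt + $2 * rec_nyt * rec_NX ) * 4 ))
}

echo "nxt,nyt,nzt=$nxt,$nyt,$nzt rec_N=($rec_NX,$rec_NY,$rec_nzt) filetype_size=$filetype_size"
echo "disp: (0,0)=$(rank_disp 0 0) (0,1)=$(rank_disp 0 1) (1,0)=$(rank_disp 1 0) (1,1)=$(rank_disp 1 1)"
```

All of these values match the log, so the decomposition and output setup appear to complete as expected before the segmentation fault, which occurs after inimesh.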


Changing Default Blue Waters Environment

The default software modules on Blue Waters use the Cray programming environment. Switch to the GNU environment and load the CUDA toolkit:

 module unload PrgEnv-cray   
 module load PrgEnv-gnu 
 module load cudatoolkit 
 module unload darshan 

Makefile

 CC 	= cc
 CFLAGS	= -O3 -Wall
 GFLAGS	= nvcc -O4 -Xptxas -dlcm=ca -maxrregcount=255 -use_fast_math --ptxas-options=-v -arch=sm_35
 INCDIR  = -I/opt/nvidia/cudatoolkit/5.5.20-1.0402.7700.8.1/include
 OBJECTS	= command.o pmcl3d.o grid.o source.o mesh.o cerjan.o swap.o kernel.o io.o
 LIB	= -lm -ldl -L/opt/nvidia/cudatoolkit/5.5.20-1.0402.7700.8.1/lib64 -lcudart -lmpich
 
 pmcl3d:	$(OBJECTS)
 	$(CC) $(CFLAGS) $(INCDIR) -o pmcl3d $(OBJECTS) $(LIB)
 
 pmcl3d.o:	pmcl3d.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o pmcl3d.o pmcl3d.c
 
 command.o:	command.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o command.o command.c
 
 io.o:	io.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o io.o io.c
 
 grid.o:	grid.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o grid.o grid.c
 
 source.o:	source.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o source.o source.c
 
 mesh.o:	mesh.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o mesh.o mesh.c
 
 cerjan.o:	cerjan.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o cerjan.o cerjan.c
 
 swap.o:	swap.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o swap.o swap.c
 
 kernel.o:	kernel.cu
 	$(GFLAGS) $(INCDIR) -c -o kernel.o kernel.cu
 
 clean:
 	rm *.o

Location of Code

Kyle has a GPU version of the code on Titan at: /lustre/atlas1/geo112/proj-shared/withers/chino_hills_gpu

The main parameter added for running this code is the exponent of the Q(f) power law; Q is constant below 1 Hz.

Notes about FDQ Version

Note that there are some structural changes from the CPU code. The GPU code no longer uses the input parameter file; all parameters are specified in the run script (if they differ from the default values defined in the code). Also see the requirements below.

The following are required:

 - NX, NY, NZ > 2 and even integers
 - PX, PY >= 1, PZ=1, and divide NX, NY, NZ respectively
 - BLOCK_SIZE_Y=2 in pmcl3d_cons.h
 - BLOCK_SIZE_Z divides NZ
 - BLOCK_SIZE_Y * BLOCK_SIZE_Z <= 1024 and a power of 2
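
These requirements can be checked in the run script before calling aprun. The following is a minimal sketch, not part of the AWP-ODC distribution: check_fdq_params is a hypothetical helper, and the BLOCK_SIZE values must be copied by hand from pmcl3d_cons.h because they are compile-time constants. PZ is not checked because it is fixed at 1.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the required constraints hold;
# otherwise prints each violation to stderr and returns 1.
# usage: check_fdq_params NX NY NZ PX PY BLOCK_SIZE_Y BLOCK_SIZE_Z
check_fdq_params() {
  local nx=$1 ny=$2 nz=$3 px=$4 py=$5 bsy=$6 bsz=$7
  local rc=0 n threads

  # NX, NY, NZ > 2 and even.
  for n in "$nx" "$ny" "$nz"; do
    if (( n <= 2 || n % 2 != 0 )); then
      echo "NX, NY, NZ must be even and > 2 (got $n)" >&2; rc=1
    fi
  done

  # PX, PY >= 1 and divide NX, NY.
  if (( px < 1 || py < 1 )); then
    echo "PX and PY must be >= 1" >&2; rc=1
  elif (( nx % px != 0 || ny % py != 0 )); then
    echo "PX must divide NX and PY must divide NY" >&2; rc=1
  fi

  # BLOCK_SIZE_Y is fixed at 2.
  if (( bsy != 2 )); then
    echo "BLOCK_SIZE_Y must be 2" >&2; rc=1
  fi

  # BLOCK_SIZE_Z divides NZ (guard against zero before the modulo).
  if (( bsz < 1 )); then
    echo "BLOCK_SIZE_Z must be >= 1" >&2; rc=1
  elif (( nz % bsz != 0 )); then
    echo "BLOCK_SIZE_Z must divide NZ" >&2; rc=1
  fi

  # The product is a power of 2 and at most 1024.
  threads=$(( bsy * bsz ))
  if (( threads < 1 || threads > 1024 || (threads & (threads - 1)) != 0 )); then
    echo "BLOCK_SIZE_Y * BLOCK_SIZE_Z must be a power of 2 and <= 1024" >&2; rc=1
  fi

  return $rc
}

# Grid and decomposition from the PBS script; BLOCK_SIZE_Z=256 is an assumed value.
check_fdq_params 224 224 1024 2 2 2 256 && echo "parameters OK"
```

Placing the call before aprun and exiting on failure avoids spending queue time on a run that cannot start correctly.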

The following are suggestions:

 - NX and NY are of the same order, i.e. 1/2*NY <= NX <= 2*NY
 - NZ >= 256 and a power of 2
 - BLOCK_SIZE_Z divides NZ
 - BLOCK_SIZE_Y * BLOCK_SIZE_Z = 512

The critical parameters for Q(f) are:

 --FAC 1.0 --Q0 150.0 --EX 0.8 --FP 0.5 \

I believe FAC and Q0 should not be changed.

EX is the exponent in the power law Q(f) = Q0 * f^EX. Tom and his student have found EX ~ 0.6-0.8 for Southern California.
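
To get a feel for the effect of the exponent, the sketch below evaluates the power law at a few frequencies. It assumes f is in Hz and that Q is held at Q0 below 1 Hz, as noted under Location of Code. q_of_f is a hypothetical helper, not part of pmcl3d, and it does not model how the code uses FAC or FP.

```shell
#!/bin/bash
# Hypothetical helper: Q(f) = Q0 for f < 1 Hz, Q0 * f^EX otherwise.
# usage: q_of_f <Q0> <EX> <f_in_Hz>
q_of_f() {
  awk -v q0="$1" -v ex="$2" -v f="$3" \
    'BEGIN { q = (f < 1.0) ? q0 : q0 * f^ex; printf "%.1f\n", q }'
}

# Q0 and EX as in the PBS script; the frequencies are illustrative.
for f in 0.5 1 2 5; do
  echo "f=${f} Hz  Q=$(q_of_f 150.0 0.6 "$f")"
done
```

With this form Q is continuous at 1 Hz, since f^EX = 1 there for any EX.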

Define Q0 in mesh.c (tmpsq, tmppq), usually as a ratio of Vs.

FP is a reference frequency - we usually use FP=0.5-1.0 for the LA area.
