Scripting CCG Executables

Program Class

class pygeostat.programs.programs.Program(program=None, parstr=None, parfile='temp', getpar=None, nogetarg=False, defaultdict={}, scriptnotifier=None)

Base class containing routines for running GSLIB programs

Get Parameter File

Program.getparfile(quiet=True)

Get the parfile from this program by copying to the clipboard or printing the parfile. This replaces the need to pre-execute a program to get the parfile, but relies on CCG programs being properly configured to generate the correct parfile upon first execution.

This function requires pyperclip. This is a dependency of pygeostat, but can be installed with:

> pip install pyperclip

Parameters: quiet (bool) – copy the parameter file to the clipboard with the block quotes; this is the default (quiet=True)

Run Program

Program.run(program=None, parstr=None, parfile=None, nogetarg=None, filehandle=None, logfile=None, testfilename=None, pardict=None, quiet=False, liveoutput=None, chdirpath=None)

Runs a GSLIB-style program using the subprocess module and prints the output. On an error, the output is printed and an exception is raised.

The only required parameters are program and parstr; all other parameters are optional.

Parameters:
  • program – name of GSLIB/CCG program to run, taken from self if None
  • parstr (str) – parameters, taken from self if None
  • parfile (str) – name of parameter file to create
  • nogetarg – uses a pipe + communicate for the call instead of arguments
  • filehandle – handle for file to write program output to
  • logfile (str) – filename for a log file to write program output to. If file already exists it will overwrite the file. If filehandle is passed then logfile will be ignored.
  • testfilename (list) – name(s) to check for availability prior to executing the program
  • quiet – if True, don't print the message that lets the user know the program is being called
  • liveoutput (bool) – live update the ipython notebook or calling script with the output of the called program
  • chdirpath (str) – Some programs use files that prevent running multiple instances of the same program from the same directory. This can be used to change the path the program is called from to prevent such conflicts (for example, with kt3d_lva)
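
The two calling conventions that nogetarg switches between can be sketched with the standard library alone. This is not pygeostat's implementation: run_gslib_style is a hypothetical helper, and short Python one-liners stand in for a GSLIB executable so that the sketch runs anywhere. It writes the parameter file, then either passes its name as a command-line argument or pipes the name to the program's standard input, and raises if the program exits with an error.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_gslib_style(command, parstr, parfile, nogetarg=False):
    """Write parstr to parfile, call command, return its output (hypothetical helper)."""
    Path(parfile).write_text(parstr)
    if nogetarg:
        # The program prompts for the parameter file name: send it through a pipe.
        result = subprocess.run(command, input=str(parfile) + "\n",
                                capture_output=True, text=True)
    else:
        # The program accepts the parameter file name as a command-line argument.
        result = subprocess.run(command + [str(parfile)],
                                capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stdout)
        raise RuntimeError(f"{command[0]} failed with code {result.returncode}")
    return result.stdout


# Stand-in "programs": each reports the first line of the parameter file.
by_argument = [sys.executable, "-c",
               "import sys; print(open(sys.argv[1]).readline().strip())"]
by_prompt = [sys.executable, "-c",
             "print(open(input()).readline().strip())"]

with tempfile.TemporaryDirectory() as tmpdir:
    parfile = Path(tmpdir) / "demo.par"
    out_arg = run_gslib_style(by_argument, "Parameters for DEMO\n", parfile)
    out_pipe = run_gslib_style(by_prompt, "Parameters for DEMO\n", parfile,
                               nogetarg=True)
```

Both calls return the same text; only the way the parameter file name reaches the program differs.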

Examples

An example of setting up a few keywords and running histplt:

>>> histpltpar = '''                  Parameters for HISTPLT
...           **********************
...
... START OF PARAMETERS:
... {datafl}          -file with data
... {varcol}   0                        -   columns for variable and weight
... -1.0     1.0e21              -   trimming limits
... {outfl}                   -file for PostScript output
...  0.0      -20.0               -attribute minimum and maximum
... -1.0                         -frequency maximum (<0 for automatic)
... 20                           -number of classes
... 0                            -0=arithmetic, 1=log scaling
... 0                            -0=frequency,  1=cumulative histogram
... 0                            -   number of cum. quantiles (<0 for all)
... 3                            -number of decimal places (<0 for auto.)
... {varname}                                                    -title
... 1.5                          -positioning of stats (L to R: -1 to 1)
... -1.1e21                      -reference value for box plot
... '''
>>>
>>> histplt = gs.Program(program='histplt', parfile='histplt.par')
>>>
>>> histplt.run(parstr=histpltpar.format(datafl=datafl.flname,
...                                      varcol=datafl.gscol('Bitumen'),
...                                      varname='Bitumen',
...                                      outfl='histplt_bitumen.ps'))

Code author: Jared Deutsch - 2014-02-13

Write Parameter File

Program.writepar(parstr=None, parfile=None, pardict=None)

Writes out the parameter file without running the program, which can be helpful for checking the parameters before a run

Parameters:
  • parstr (str) – This is the parameter file string initiated with the program.
  • parfile (str) – file name or path to save the parameter file to.
  • pardict (dict) – Dictionary for the variables in the parameter file.
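
As a rough illustration of what writing a parameter file involves, the sketch below fills the {} placeholders of a parameter string from a dictionary and saves the result without running anything. write_parfile is a hypothetical stand-in, not the body of Program.writepar, and the file name and keys are made up for the example.

```python
import tempfile
from pathlib import Path


def write_parfile(parstr, parfile, pardict=None):
    """Fill {} placeholders from pardict (if given) and write the text to parfile."""
    text = parstr.format(**pardict) if pardict else parstr
    Path(parfile).write_text(text)
    return text


demo_parstr = """Parameters for DEMO
START OF PARAMETERS:
{datafl}     -file with data
{varcol}  0  -columns for variable and weight
"""

with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / "demo.par"
    write_parfile(demo_parstr, target, {"datafl": "data.dat", "varcol": 3})
    written = target.read_text()  # inspect the file before any program is run
```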

Run Programs in Parallel

pygeostat.programs.program_utils.runparallel(gslibprogram, kwargslist, nprocess=None, mute=False, reportprogress=False)

Run a set of gslibprogram calls in parallel

Parameters:
  • gslibprogram (Program) – an initialized Program for the GSLIB/CCG program to run
  • kwargslist (list of dictionaries) – list of keyword arguments, each of which will be used to call gslibprogram.run(**kwargs)
  • nprocess (int) – number of threads to spawn. Drawn from gsParams['config.nprocess'] if None.
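
The calling pattern can be pictured as mapping a list of keyword dictionaries onto a single callable. The sketch below does this with a thread pool from the standard library. run_all and fake_run are hypothetical names, and nothing here is taken from the pygeostat source.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_run(parstr=None, parfile=None):
    """Stand-in for a program's run method: just report what it was asked to do."""
    return f"{parfile}: {len(parstr)} characters"


def run_all(run, kwargslist, nprocess=None):
    """Call run(**kwargs) for every dictionary in kwargslist, nprocess at a time."""
    with ThreadPoolExecutor(max_workers=nprocess) as pool:
        futures = [pool.submit(run, **kwargs) for kwargs in kwargslist]
        # Collect in submission order; result() re-raises any exception from a call.
        return [future.result() for future in futures]


calls = [{"parstr": "a" * n, "parfile": f"demo_{n}.par"} for n in (1, 2, 3)]
reports = run_all(fake_run, calls, nprocess=2)
```

Threads are a reasonable choice in a sketch like this because each call would spend its time waiting on an external executable rather than running Python code.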

Examples

Setting up the calling parameters. This example is based off the example used in gs.Program()

>>> callpars = []
>>> # For each variable we want to run in parallel, assemble a dictionary of parameters and
>>> # append to callpars
>>> for variable in ['Bitumen', 'Fines', 'Chlorides']:
...     # Establish the parameter file for this variable
...     mypars = {'datafl': datafl.flname,
...               'varcol': datafl.gscol(variable),
...               'varname': variable,
...               'outfl': 'histplt_' + variable + '.ps'}
...     # Assemble the arguments for the GSLIB call and add the arguments to the list of calls
...     callpars.append({'parstr': histpltpar.format(**mypars),
...                      'parfile': 'histplt_' + variable + '.par',
...                      'testfilename': datafl.flname})

Now run in parallel

>>> histplt = gs.Program(program='histplt', parfile='histplt.par')
>>> gs.runparallel(histplt, callpars)

Code author: Jared Deutsch - 2014-02-13

Misc Program Utilities

pygeostat.programs.program_utils.parallel_function(function, arglist=None, kwarglist=None, nprocess=None, returnvals=False, reportprogress=False)

Quickly parallelize a function with a set of arguments or keyword arguments. If the function returns something (as opposed to writing values out to files), set returnvals=True to get the dictionary that can be used to collect the results.

Parameters:
  • function (func) – a callable function DEFINED IN A .py FILE. The function must be imported from a module since it has to be pickled to be parallelized. Defining the function in the Jupyter notebook does not seem to work.
  • arglist (list of tuples) – a list of tuple arguments to pass to the function, see examples
  • kwarglist (list) – a list of keyword dictionaries to pass to the function, i.e. [{'arg1': value, 'arg2': value}, {'arg1': value, 'arg2': value}, etc.]
  • nprocess (int) – the number of parallel processes to run. Drawn from gsParams['config.nprocess'] if None.
  • returnvals (bool) – if the function returns something, collect it in a dictionary. Use the .get() method of the parallel result to collect the required data.
Returns:

optionally return a dictionary of parallel processing results

Return type:

res (dict)
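
The restriction on where the function is defined follows from how Python pickles functions: by reference to an importable module and name, not by value. A small standard-library check makes this visible (math.sqrt is used here only as a convenient importable function; this is an illustration, not part of pygeostat):

```python
import math
import pickle

# An importable function pickles to a reference: its module and its name.
payload = pickle.dumps(math.sqrt)
restored = pickle.loads(payload)

# A function with no importable name cannot be pickled at all.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False
```

Because only the reference travels to the worker process, the worker must be able to import the same module and find the same name there.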

Usage:

For a function that takes a single argument, setup arglist in this way:

>>> arglist = []
>>> arglist.append((arg,))

OR:

>>> arglist = [(arg,) for arg in range(nparallel)]

Alternatively the argument tuple may take several arguments:

>>> arglist = [(a1, b1, c1), (a2, b2, c2), (a3, b3, c3)]

Detailed usage:

>>> arglist = []
>>> for sr in series:
...     arglist.append(('keyout.out', rbfpath + 'keyout%s.out' % sr, griddef,
...                     griddefs[sr], [3]))
>>> gs.parallel_function(rm.changegrid, arglist=arglist)

OR:

>>> kwarglist = []
>>> for sr in series:
...     kwarglist.append({'infl': 'keyout.out',
...                       'outfl': rbfpath + 'keyout%s.out' % sr,
...                       'ingrid': griddef,
...                       'outgrid': griddefs[sr],
...                       'avmethods': [3]
...                       })
>>> gs.parallel_function(rm.changegrid, kwarglist=kwarglist)

Code author: Ryan Martin - 2017-02-23

pygeostat.programs.program_utils.rseed(prng=None)

Returns an ACORNI (GSLIB-suitable) random number seed

pygeostat.programs.program_utils.rseed_list(nseeds, seed=None)

Returns a list of ACORNI (GSLIB-suitable) random number seeds. An initial seed can be passed, ensuring that the same list of seeds is returned when a script is rerun.

Parameters:
  • nseeds (int) – Number of seeds to return
  • seed (int) – Initialization seed
Returns:

List of random number seeds

Return type:

seeds (list)
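
Reproducibility of this kind usually comes from seeding a generator before drawing from it. The sketch below assumes, purely for illustration, that a suitable seed is an odd positive integer below 2**30. The exact constraints pygeostat applies are not stated here, and seed_list is a hypothetical name.

```python
import random


def seed_list(nseeds, seed=None):
    """Return nseeds odd integers below 2**30; the same seed gives the same list."""
    prng = random.Random(seed)
    # 2 * k + 1 is always odd; the range keeps every value below 2**30 (assumed limit).
    return [2 * prng.randrange(2**29) + 1 for _ in range(nseeds)]


first = seed_list(5, seed=1234)
second = seed_list(5, seed=1234)
```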

pygeostat.programs.program_utils.parstr_kwargs(parstr, fmt='pars')

Print a formatted list of the kwargs found in the parfile. Tested and working for {}-style string formatting, as in the gamsim_ave parfile in the example below. This is mostly a convenience so that you do not have to write out the kwargs you just entered into the parfile. It can also help if you defined the parfile elsewhere and cannot remember which kwargs you set up.

Parameters:
  • parstr (str) – the par string with the {} formatted parameters
  • fmt (str) – the output format, permissible arguments are pars or dict

Examples

Get the formatted parfile

>>> parstr = '''          Parameters for GAMSIM_AVE
...             *************************
... START OF PARAMETERS:
... {lithfl}        -file with lithology information
... {lithcol}   {lithcode}                        -   lithology column (0=not used), code
... {datafl}       -file with data
... ....
... '''
>>> parstr_kwargs(parstr)
lithfl=, lithcol=, lithcode=, datafl=, ...

Code author: Ryan Martin - 2017-04-18
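
The field names in a {}-style string can be recovered with the standard library's string.Formatter. The following is a sketch of the idea only, not the pygeostat implementation: list_kwargs is a hypothetical name, and only an output resembling the pars format shown above is reproduced.

```python
from string import Formatter


def list_kwargs(parstr):
    """Return the unique {} field names in parstr, in order of first appearance."""
    names = []
    for _literal, field, _spec, _conversion in Formatter().parse(parstr):
        if field and field not in names:
            names.append(field)
    return names


demo = """{lithfl}        -file with lithology information
{lithcol}   {lithcode}    -lithology column (0=not used), code
{datafl}       -file with data
"""
fields = list_kwargs(demo)
as_pars = ", ".join(name + "=" for name in fields)
```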

pygeostat.programs.program_utils.dedent_parstr(indented_parstr)

Remove leading indents on each line from a parameter file. This is automatically called by Program.run() so that parfiles may be tabbed to permit better structuring of python code.
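
One simple way to get this effect is to strip the leading whitespace from every line. That is an assumption about the behaviour, not a copy of dedent_parstr, and strip_indents is a hypothetical name:

```python
def strip_indents(indented_parstr):
    """Remove leading spaces and tabs from each line, keeping the line breaks."""
    return "\n".join(line.lstrip(" \t") for line in indented_parstr.split("\n"))


def build_parstr():
    # The parameter string is indented to match the surrounding Python code.
    return """Parameters for DEMO
    START OF PARAMETERS:
    {datafl}       -file with data
    """


cleaned = strip_indents(build_parstr())
```

Note that this differs from textwrap.dedent, which removes only the whitespace common to all lines and would leave unevenly indented lines partly indented.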

Examples

An un-tabbed parstr:

>>> parstr = '''          Parameters for GAMSIM_AVE
...             *************************
... START OF PARAMETERS:
... {lithfl}        -file with lithology information
... {lithcol}   {lithcode}                        -   lithology column (0=not used), code
... {datafl}       -file with data
... ....
... '''

A tabbed parstr:

>>> parstr = '''          Parameters for GAMSIM_AVE
...                 *************************
...     START OF PARAMETERS:
...     {lithfl}        -file with lithology information
...     {lithcol}   {lithcode}                        -   lithology column (0=not used), code
...     {datafl}       -file with data
...     ....
... '''

pygeostat.programs.programs.ScriptCrash(errormsg, notifier)

Print an error that the script has crashed using the ScriptNotifier Class

Code author: Tyler Acorn

CCG Program Wrappers

Functions that wrap CCG programs for use within python. All functions are stored within the namespace pygeostat.wrappers to help distinguish between wrappers and programs that have been converted into Fortran subroutines for python (e.g., gs.varcalc).

pygeostat.programs.wrappers.histsmth(datafile, variable=None, wtvar=None, pardict=None, printdefault=False, verbose=True)

Wrapper for the GSLIB histsmth program, which must be found on the path and is assumed to be consistent with version 3000 found on the KB

Parameters:
  • datafile (DataFile) – the data file corresponding to the data being smoothed
  • variable (int or str) – the column or string of the variable
  • wtvar (int or str) – the column or string of the weight variable
  • pardict (dict) – dictionary of parameters to pass to histsmth. The defaults found in the original parfile are passed if they are not contained in this pardict
  • printdefault (bool) – if printdefault == True, the modifiable parameters for this function are printed to the terminal
  • verbose (bool) – print the output from the gslib program
Returns:

smoothedhist – an array of data corresponding to the smoothed histogram

Return type:

np.ndarray

Notes

# default dictionary of parameters
histpars = {'ltrim': -998, 'utrim': 1.0e21, 'nsmooth': 300, 'mintol': 10e-10,
            'minsmooth': None, 'maxsmooth': None, 'scaling': 0, 'maxiter': 2500,
            'seed': rseed(), 'goals': (0, 0, 1, 1), 'weighting': (1., 1., 2., 2.),
            'window': 1, 'tmean': -999.0, 'tvar': -999.0, 'nquant_data': 100,
            'nquant_user': 0, 'cdf_user': 0.5, 'z_user': 1.66}

Most parameters in histpars are self-explanatory except for the following:

  • maxsmooth, minsmooth: the max and min of the smoothed distribution, taken from the data if None
  • scaling: 0 arithmetic, 1 logarithmic
  • goals: a 4-long tuple with 1 or 0 for each of (mean, var, smth, quantiles), respectively, as targets for the smoothing
  • weighting: a 4-long tuple of weights corresponding to the goals (mean, var, smth, quantiles)
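
The way user-supplied values override the defaults can be sketched with a plain dictionary merge. merge_pars is hypothetical, only a few of the default keys are reproduced, and a fixed number stands in for rseed(). Rejecting unknown keys is a choice made for this sketch, not documented behaviour.

```python
def merge_pars(defaults, pardict=None):
    """Return defaults updated with pardict, rejecting keys that have no default."""
    pardict = pardict or {}
    unknown = set(pardict) - set(defaults)
    if unknown:
        raise KeyError(f"unrecognised parameters: {sorted(unknown)}")
    return {**defaults, **pardict}


demo_defaults = {"ltrim": -998, "utrim": 1.0e21, "nsmooth": 300,
                 "scaling": 0, "seed": 69069}  # 69069 is a placeholder seed
merged = merge_pars(demo_defaults, {"nsmooth": 500, "scaling": 1})
```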

Code author: Ryan Martin - 2017-04-14

CCG Program Wrapping Utils

Misc helper functions for wrapping GSLIB programs, e.g. tempfile creation and cleaning

pygeostat.programs.wrapperutils.cleantemp(filelist)

Cleans the files found in filelist

Parameters: filelist (list) – A list of string filenames to delete

Code author: Ryan Martin - 25-04-2018
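
A cleanup helper of this kind can be as small as the sketch below, which deletes every file in a list and ignores names that are already gone. remove_files is a hypothetical name, and skipping missing files is an assumption rather than documented behaviour of cleantemp.

```python
import os
import tempfile


def remove_files(filelist):
    """Delete each file in filelist, skipping names that no longer exist."""
    for name in filelist:
        try:
            os.remove(name)
        except FileNotFoundError:
            pass  # already gone, nothing to clean


with tempfile.TemporaryDirectory() as tmpdir:
    existing = os.path.join(tmpdir, "tmp_data.out")
    missing = os.path.join(tmpdir, "never_created.out")
    open(existing, "w").close()
    remove_files([existing, missing])
    leftover = os.listdir(tmpdir)
```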

pygeostat.programs.wrapperutils.temp_gslib_file(fltype='gslib')

Generate unique temporary file names with different extensions as indicated by the fltype parameter.

Parameters: fltype (str) – one of “gslib”, “par”, “gsb”, “h5”

Code author: Ryan Martin - 25-04-2018
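
Unique temporary names can be generated in several ways. The sketch below uses a random UUID and, as an assumption made only for illustration, uses the fltype string itself as the file extension and the system temporary directory as the location. temp_name is a hypothetical name, and the extensions pygeostat actually assigns are not stated here.

```python
import os
import tempfile
import uuid


def temp_name(fltype="gslib"):
    """Return a unique path in the system temp directory ending in .<fltype>."""
    allowed = ("gslib", "par", "gsb", "h5")
    if fltype not in allowed:
        raise ValueError(f"fltype must be one of {allowed}")
    return os.path.join(tempfile.gettempdir(), f"tmp_{uuid.uuid4().hex}.{fltype}")


names = [temp_name("par") for _ in range(3)]
```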

pygeostat.programs.wrapperutils.wrapper_runprogram(fmtpar, gslibprogram, parfile, tmpinfile, tmpoutfile)

Function to catch the stdout from a gslib program, returning (-1, fulloutput) if there was an error, otherwise the return code is (0, fulloutput)

Parameters:
  • fmtpar (str) – A formatted parfile to call the program with
  • gslibprogram (gs.Program) – An initialized gs.Program to be run
  • parfile (str) – The parfile to call the program with
  • tmpinfile (str) – The input file that contains the temporary data setup in the calling function
  • tmpoutfile (str) – The output file that might have been generated for this function
Returns:

  • errcode (int) – 0 if successful, -1 if unsuccessful
  • fulloutput (str) – the full output of the gslib program, could be printed, searched, etc.

Code author: Ryan Martin - 25-04-2018
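
The (errcode, fulloutput) convention can be illustrated with subprocess directly. In this sketch run_and_capture is a hypothetical helper and Python one-liners stand in for a GSLIB program. It is not the implementation of wrapper_runprogram, which works through a gs.Program instance.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run command and return (0, output) on success or (-1, output) on failure."""
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    errcode = 0 if result.returncode == 0 else -1
    return errcode, result.stdout


# Python one-liners stand in for a program that succeeds or stops with an error.
ok_code, ok_output = run_and_capture([sys.executable, "-c", "print('finished')"])
bad_code, bad_output = run_and_capture(
    [sys.executable, "-c", "import sys; print('bad parfile'); sys.exit(1)"])
```

Returning the output in both cases lets the calling wrapper decide whether to print it, search it for a message, or discard it.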