.. _fortran: .. public Fortran in pygeostat ==================== Various Fortran :ref:`subroutines ` have been wrapped for Python using a compiling tool :ref:`Fortran Builder ` developed to permit F2PY compilation with a wide variety of Fortran and C compiler combinations. Compiling Fortran extensions for python allows data to be passed directly from Python to Fortran. This allows users to take advantage of the speed of Fortran, the breadth of historical libraries and the seamless integration with the wide variety of flexible and powerful libraries accessible through Python. When pygeostat is imported, these compiled extensions are not automatically imported. However, full functionality of pygeostat assumes that the compiled extensions are present. If they are not found then usage of the package is still possible; various errors and warnings are thrown if functions requiring the compiled extensions are used. The simplest way to compile the extensions is to run the following commands from a python interpreter after installing pygeostat: .. code-block:: Python >>> import pygeostat as gs >>> gs.build_pygeostat_fortran() Alternatively, a compile script is provided that is command line callable. Starting a terminal in ``pygeostat/fortran/`` and using the following command will result in the extensions built in the current directory: .. code-block:: bash > python compile.py -compiler=gnu all **NOTES** 1. If pygeostat was installed from a ``.whl`` file, the fortran extensions are already present and this step is not required. 2. If installing using the setup.py, try ``python setup.py install -buildfortran`` to pre-build the fortran extensions and copy them over to the site-packages. The wheel installation is preferred over this step 3. The :ref:`Fortran Builder ` is compatible with gfortran and intel compilers and tested with the following configurations: - MinGW 4.7.1 installed with ``conda install mingw`` - MinGW 7.3.0 installed from sourceforge - Intel Fortran 12 and Visual Studio 2012 - Intel Fortran 15+ and Visual Studio 2015 4. To compile custom fortran extensions (e.g. fortran code not contained in pygeostat), consider the :func:`custom fortran builder ` .. _fortbuild: Fortran Extesion Building Functions *********************************** .. automodule :: pygeostat.fortran.compile Build Pygeostat Fortran +++++++++++++++++++++++ .. autofunction:: pygeostat.fortran.compile.build_pygeostat_fortran Pygeostat Fortran Modules +++++++++++++++++++++++++ .. exec:: from pygeostat.fortran.sourcedefs import sources sortedkeys = sorted(list(sources.keys())) for k in sortedkeys: print(k) print('-' * len(k)) files = sources[k] print(", ".join("``{}``".format(f) for f in files) + "\n") Build Custom Fortran ++++++++++++++++++++ .. autofunction:: pygeostat.fortran.compile.build_custom_fortran .. _fortsubs: Fortran Subroutines ******************* Simulated Realizations Accuracy Plot ++++++++++++++++++++++++++++++++++++ Used and wrapped within :func:`gs.accpltsim() ` Covariance Calculation Subroutines ++++++++++++++++++++++++++++++++++ cova3 ----- Used and wrapped within :func:`gs.VariogramModel.getcova() ` sqdist ------ Used and wrapped within :func:`gs.VariogramModel.setcova() ` setrot ------ Used and wrapped within :func:`gs.VariogramModel.setcova() ` Discrete Gaussian Model +++++++++++++++++++++++ Used and wrapped within :func:`gs.dgm() ` Fast GSLIB File I/O Subroutines +++++++++++++++++++++++++++++++ File Read --------- Used and wrapped within :func:`gs.read_gslib_f() ` and :class:`gs.DataFile() ` File Write ---------- Used and wrapped within :func:`gs.write_gslib_f() ` and :class:`gs.DataFile() ` Gaussian KDE Subroutines ++++++++++++++++++++++++ Not used in any functions, but can be called. Get Collocated Exhaustive Secondary Data ++++++++++++++++++++++++++++++++++++++++ Used and wrapped within :func:`gs.getcollocated() ` Histogram Correction ++++++++++++++++++++ Not used in any functions, but can be called. GSLIB-Binary I/O Subroutines ++++++++++++++++++++++++++++ File Read --------- Used and wrapped within :func:`gs.read_gsb() ` and :class:`gs.DataFile() ` File Write ---------- Used and wrapped within :func:`gs.write_gsb() ` and :class:`gs.DataFile() ` Semiparametric Bayesian Updating ++++++++++++++++++++++++++++++++ Not used in any functions, but can be called. Subsample Dataset +++++++++++++++++ .. autofunction:: pygeostat.fortran.wrappers.subsample Experimental Variogram Calculation ++++++++++++++++++++++++++++++++++ Used and wrapped within :func:`gs.varcalc() ` Variogram Modeling and Fitting ++++++++++++++++++++++++++++++ Used and wrapped within :func:`gs.Variogram.fitmodel() ` Gridded Variogram Calculation +++++++++++++++++++++++++++++ Used and wrapped within :func:`gs.varsim() ` F2PY **** .. _instF2PY: F2PY Intro, Installation, Troubleshooting Tips ++++++++++++++++++++++++++++++++++++++++++++++ F2PY compiles Fortran subroutines into .pyd files, allowing them to be called as Python functions.The module is included within numpy. Prerequisites for F2PY include (possibly not limited to, depending on your system, setup, and experience with the F2PY quirks): 1. A C compiler from one of: a) MinGW (cygwin or other) b) MSVC compilers (probably the hardest option) 2. A Fortran compiler: a) gfortran from MinGW b) Intel Fortran Compiler (see: resource/f2py_ifort_Makefile/) 3. The *right* path variables (this list may not be exhaustive, and is a common source of frustration in Windows) a) Anaconda path variables (should already be setup) b) Path to MinGW bin (/MinGW/bin) c) Path to Cygwin bin (/bin) 4. "Wrappable" Fortran code a) see `here `_ for more details Getting F2PY to work directly and properly on windows can be an issue as the right combination of C and Fortran compilers are required based on your operating system and whether your machine has 32 or 64-bit architecture. Libraries also need to be in the correct location. .. warning:: F2PY compilation with Cygwin does not presently work in Anaconda3 2.4.1 for Python 3.5 due to a change in Visual Studio Libraries (Bug Report: https://bugs.Python.org/issue25251). Recommendation is to use Anaconda3 2.3.0 in the meantime. Installation instructures are outlined below, some examples can be found :ref:`here `. It is assumed that the Anaconda distribution is being used and is already installed **Windows 64-Bit** This procedure works for Windows 7 OS, it may work for others but has not been tested. Install mingw to Anaconda by running the following in the command line or cygwin prompt: >>> conda install mingw Install cygwin with the following packages from the development category: * mingw64-x86_64-gcc-core * mingw64-x86_64-gcc-g++ .. _exF2PY: F2PY Examples +++++++++++++ Typical Fortran code allows for dynamically allocatable arrays depending on size constraints determined during run time. This can be accounted for in 2 ways with F2PY. Either write your Fortran subroutines to explicitly define *all* array sizes in the subroutine call. i.e.: .. code-block:: Fortran subroutine example(data,size) integer, intent(in) :: size real*8, intent(inout) :: data(size) ... end subroutine example If the above subroutine was contained in a file called 'example.f90' it could immediately be wrapped with F2PY using the command line F2PY functionality: >>> f2py -c example.f90 -m example There are commonly errors thrown at this stage, some of which are covered in the installation section, others can be sorted out by combing 'the forumns'... The simplest method for troubleshooting is to use various combinations of ``--fcompiler=`` or ``--compiler=`` flags to determine the working combination for your machine. So the modified call may be something like this: >>> f2py --fcompiler=gfortran --compiler=cygwin -c example.f90 -m example The above procedure assumes that your functions are passed and return variables of known size. That is, sizes of the variables are known (there are no 'allocatable' arrays here). Additionally, all types are standard, derived types cannot be passed. If the code you are developing **MUST** account for unknown sizes, or needs to work with some kind of derived type, a working methodology is to use module allocatable arrays and private module variables, respectively: .. code-block:: Fortran module example ! things we can access from Python ! real*8,allocatable,public :: data(:,:) public :: usedata ! things we cannot access from Python, but can use in Fortran ! type(customtype),private :: derived_variable contains subroutine usedata() do !something with data and/or derived_variable enddo end subroutine end module example In this case, you define the size of the allocatable arrays from Python by assigning data to the module variable after compiling the module with F2PY (more on this below): >>> import numpy as np >>> from example import example as ex >>> ex.data = np.array([[1,1,1],[0,0,0]]) Using these simple rules, it is possible to wrap *existing* Fortran codes that are too 'complex' for F2PY, so they can be used in Python. Conider the example below. The kdtree2_module (from the CCG Knowledge Base) contains complex Fortran code including pointers, allocatable arrays, derived types, recursive subroutines. It also relies on allocatable arrays and pointers in order to be used from calling code, since you need to have dynamic storage for nearest neighbor searches, and must store pointers to the kdtree once it is constructed. The following wrapping module can be compiled by F2PY, to allow access to the complex code from Python: .. code-block:: Fortran module kd use kdtree2_module implicit none !things that we can access from Python! public :: createtree,nnearest,rnearest integer,allocatable,dimension(:),public :: inds real*8,allocatable,dimension(:),public :: dis real*8,allocatable,dimension(:,:),public :: dlocs integer, public :: nd !things we can't access from Python: type(kdtree2),pointer,private :: tree type(kdtree2_result),allocatable,target,private :: datainds(:) contains subroutine createtree() real*8,allocatable,dimension(:,:) :: tlocs if(allocated(dlocs))then if(size(dlocs,1)>3)then write(*,*)'Data must be dim x nd dimensioned!' return endif if(associated(tree)) call kdtree2_destroy(tree) tree => kdtree2_create(dlocs,3,.true.) nd = size(dlocs,2) else write(*,*)'Data not defined' endif end subroutine createtree subroutine nnearest(n,pt) integer,intent(in) :: n real*8,intent(in),dimension(3) :: pt if(allocated(inds))deallocate(datainds,inds,dis) allocate(datainds(n),inds(n),dis(n)) call kdtree2_n_nearest(tree,pt,n,datainds) inds=datainds%idx dis=sqrt(datainds%dis) end subroutine nnearest subroutine rnearest(r,pt) real*8,intent(in) :: r real*8,intent(in),dimension(3) :: pt integer :: n,nf real*8 :: r2 r2 = r * r n = kdtree2_r_count(tree,pt,r2) if(allocated(inds))deallocate(datainds,inds,dis) allocate(datainds(n),inds(n),dis(n)) call kdtree2_r_nearest(tree,pt,r2,nf,n,datainds) inds=datainds%idx dis=sqrt(datainds%dis) end subroutine end module kd In order to build the above module with F2PY, we need to first pre-compile the kdtree2.f90 codes into a linkable module for F2PY. Do this with: >>> gfortran -c -O2 kdtree2.f90 -fstack-arrays The -c flag instructs gfortran to build the codes without linking to an executable, which generates a .mod file for every module in the kdtree2.f90 codes, and links all .mod modules into the .o object file that is used in the next step. The -fstack-arrays flag is required in this case since intels compilers assign arrays to the stack by default, whereas gfortran does not. This flag may not be required for all modules. Additionally it may be necessary to define the stack size explicitly here with another flag (see `website `_ for more options). **Pre-compiling the complex codes is the critical step for using complex Fortran codes in Python.** The next step is to build the wrapping module kd, contained in the kd_wrap.f90 codes (shown above), with F2PY. Building the module with F2PY generates the C-wrappers that allows the Fortran code to be called from Python. This step may be a bit of trial and error with compiler flags and whatnot, but in this case, and from a machine with both mingw, cygwin and Python 3.4 installed, the following call worked: >>> f2py --fcompiler=gfortran --compiler=cygwin --opt=-O2 --f90flags=-fstack-arrays -I. ... kdtree2.o -c kd_wrap.f90 -m kd_wrap Various errors will likely be thrown at this step, varying from 64 bit compiler issues, Python issues, distutils, paths, etc. For some troubleshooting see the F2PY install section. Once compiled into a .pyd file, it can be called within Python as a module, and any module variables can be assigned data, any module subroutines may also be called, for example: .. code-block:: Python from kd_wrap import kd import pygeostat as gs import numpy as np #import some data: dat = gs.DataFile(flname='GC_df.out',readfl=True) #in the kdtree2 module, data is assumed to be dim x nd dimensioned kd.dlocs = np.array(dat.data[[0,1,2]]).transpose() kd.createtree() # Find points around some arbitrary location pt = np.array([10,10,10]) # Do a nearest neighbor search with 10 neighbors kd.nnearest(10,pt) # Now we can access the resulting distances and indices print(kd.dis) print(kd.inds) # Do a radius search kd.rnearest(150,pt) # Now we can access the resulting distances and indices print(kd.dis) print(kd.inds) .. _dllFortran: Compiling and calling Fortran dll's ************************************* The standard Python library contains a package called ``ctypes`` which provides interfaces to foreign functions that are found in shared libraries. With the Intel compiler, a Fortran function or subroutine can be exported to dll by ensuring that the `!DEC$` is present to define the exported symbol: .. code-block:: Fortran ! Note: this only works with the Intel Compiler subroutine sqr(val) !DEC$ ATTRIBUTES DLLEXPORT, ALIAS:'sqr' :: sqr integer, intent(inout) :: val val = val ** 2 end subroutine sqr If the above subroutine is contained in the ``example.f90`` file, it is compiled with : >>> ifort /dll example.f90 which produces the shared library ``example.dll``. Inspecting this object with dumpbin: >>> dumpbin /exports example.dll 1 0 00001000 sqr The function can be imported to Python with the ``ctypes`` module and called with the following Python code: .. code-block:: Python import ctypes as ct fortlib = ct.CDLL('example.dll') val = 100 val = ct.pointer(ct.c_int(val)) # setup the pointer to the correct data structure _ = fortlib.sqr(val) # call the function print(val[0]) # index the memory address setup above Considering the :ref:`F2PY examples above`, immediately it is clear that compiling Fortran in this way is easier, and calling the Fortran from Python in this way is more difficult. The advantages are that this is a C-independent (except for Python and the ``ctypes`` library) method to call a Fortran function from Python, and the same dll called from Python here could also be called from other languages (i.e. C# for petrel). Maintaining GNU - Intel compatibility +++++++++++++++++++++++++++++++++++++ Since many of the pygeostat functions can be compiled with either GNU or Intel tools using the :ref:`FortranBuild class `, a method to maintain a compatible Python calling sequence for the dll's is required since one set of Python code should be able to call the Fortran function regardless of how it was compiled. This can be done by using the ``iso_c_binding`` module: .. code-block:: fortran subroutine sqr2(val) BIND(C, NAME='sqr2') use iso_c_binding !DEC$ ATTRIBUTES DLLEXPORT :: sqr2 integer, intent(inout) :: val val = val ** 2 end subroutine sqr2 The ``iso_c_binding`` in this case overwrites the name mangling for each compiler (gfortran and ifort), and ensures the call sequence from Python is constant. In this case the `!DEC$` statement is ignored by gfortran and the function is available in the exported dll. Notably, the `ALIAS` is dropped from the `!DEC$` export for Intel Fortran since the name in the binding overwrites this setting. More Complicated Data Types +++++++++++++++++++++++++++ The main concern for calling Fortran functions in the dll's from Python is how to setup the data structures. Consider the following function which take a 2-dimensional array of integers and computes some arbitrary value from the inputs by replacing the values in the array: .. code-block:: fortran module example use iso_c_binding implicit none contains subroutine sqr_2d_arr(nd, val) BIND(C, NAME='sqr_2d_arr') !DEC$ ATTRIBUTES DLLEXPORT :: sqr_2d_arr integer, intent(in) :: nd integer, intent(inout) :: val(nd, nd) integer :: i, j do j = 1, nd do i = 1, nd val(i, j) = (val(i, j) + val(j, i)) ** 2 enddo enddo end subroutine sqr_2d_arr end module example The module is compiled with: >>> ifort /dll example.f90 or: >>> gfortran -shared example.f90 -o example.dll and can be called from Python with: .. code-block:: Python import ctypes as ct import numpy as np # import the dll fortlib = ct.CDLL('example.dll') # setup the data N = 10 nd = ct.pointer( ct.c_int(N) ) # setup the pointer pyarr = np.arange(0, N, dtype=int) * 5 # setup the N-long for i in range(1, N): # concatenate columns until it is N x N pyarr = np.c_[pyarr, np.arange(0, N, dtype=int) * 5] # call the function by passing the ctypes pointer using the numpy function: _ = fortlib.sqr_2d_arr(nd, np.ctypeslib.as_ctypes(pyarr)) print(pyarr) Important Considerations ++++++++++++++++++++++++ In the above Python code some things are notable: 1. All data is passed by pointer reference to the Fortran dll, the data is modified in place if the variable is `intent(inout)`, or replaced if the variable is `intent(out)`. All memory is controlled on the Python side of things 2. The memory must be allocated in some form on the Python side using ``np.zeros(size, dtype=type)`` (or similar) even if the variable is `intent(out)` 3. **The types of all data initialized on the Python size must match those being called in the Fortran module** a. This is very important. For example, use ``int`` for ``integer``, ``float`` for ``real*8``, etc. Python ``float`` is double precision by default. This may not be a correct list, and is definitely not exhaustive, but it has worked with the limited testing that has been done with this method of calling Fortran b. the function ``ctypes.c_pointer()`` can be used to create the necessary references to non-array elements, be sure to use ``ctypes.c_int()`` and ``ctypes.c_double()`` (and others) as required for the needed parameter c. initializing arrays on the python end is strange. For example, say a Fortran subroutine is defined with an `integer, intent(out)` array with size `nd`, to initialize this array on the python side, you can use:: import ctypes as ct intarray = (ct.c_int * nd)() and to give the array values for input:: import ctypes as ct pyarray = np.zeros(nd, dtype=np.int) valarray = (ct.c_int * nd)(*pyarray) 4. Numpy provides some convenient methods ``np.ctypeslib.as_ctypes()`` to pass initialized arrays with the correct types to the Fortran functions (as demonstrated above) Mixed Language Debugging ++++++++++++++++++++++++ The use of dll's and ``ctypes`` library for calling the Fortran code provides some exciting opportunities to speed up code development while leveraging the plotting and data management capabilities of pygeostat with the GSLIB Fortran geostatistical library. Visual Studio 2015 Community is currently free and has recently developed a set of Python tools that will utilize Python distributions found on the machine. Combined with the Intel Compiler, projects containing Fortran code compiled to dll, and called VIA the Python libraries and tools presented above, can be debugged after it is called from Python!