Haskell/FFI
Using Haskell is fine, but in the real world there are a large number of useful libraries in other languages, especially C. To use these libraries, and let C code use Haskell functions, there is the Foreign Function Interface (FFI).
Calling C from Haskell
editMarshalling (Type Conversion)
editWhen using C functions, it is necessary to convert Haskell types to the appropriate C types.
These are available in the Foreign.C.Types
module; some examples are given in the following table.
Haskell | Foreign.C.Types | C |
---|---|---|
Double | CDouble | double |
Char | CUChar | unsigned char |
Int | CLong | long int |
The operation of converting Haskell types into C types is called marshalling (and the opposite, predictably, unmarshalling).
For basic types this is quite straightforward: for floating-point one uses realToFrac
(either way, as e.g. both Double
and CDouble
are instances of classes Real
and Fractional
), for integers fromIntegral
, and so on.
If you are using GHC previous to 6.12.x, note that the CLDouble type does not really represent a long double , but is just a synonym for CDouble : never use it, since it will lead to silent type errors if the C compiler does not also consider long double a synonym for double . Since 6.12.x CLDouble has been removed, pending proper implementation. |
Calling a pure C function
editA pure function implemented in C does not present significant trouble in Haskell.
The sin
function of the C standard library is a fine example:
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types
foreign import ccall unsafe "math.h sin"
c_sin :: CDouble -> CDouble
First, we specify a GHC extension for the FFI in the first line.
We then import the Foreign
and Foreign.C.Types
modules, the latter of which contains information about CDouble
, the representation of double-precision floating-point numbers in C.
We then specify that we are importing a foreign function, with a call to C.
A "safety level" has to be specified with the keyword safe
(the default) or unsafe
.
In general, unsafe
is more efficient, and safe
is required only for C code that could call back a Haskell function.
Since that is a very particular case, it is actually quite safe to use the unsafe
keyword in most cases.
Finally, we need to specify header and function name, separated by a space.
The Haskell function name is then given, in our case we use a standard c_sin
, but it could have been anything. Note that the function signature must be correct—GHC will not check the C header to confirm that the function actually takes a CDouble
and returns another, and writing a wrong one could have unpredictable results.
It is then possible to generate a wrapper around the function using CDouble
so that it looks exactly like any Haskell function.
haskellSin :: Double -> Double
haskellSin = realToFrac . c_sin . realToFrac
Importing C's sin
is simple because it is a pure function that takes a plain double
as input and returns another as output: things will complicate with impure functions and pointers, which are ubiquitous in more complicated C libraries.
Impure C Functions
editA classic impure C function is rand
, for the generation of pseudo-random numbers.
Suppose you do not want to use Haskell's System.Random.randomIO
, for example because you want to replicate exactly the series of pseudo-random numbers output by some C routine. Then, you could import it just like sin
before:
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types
foreign import ccall unsafe "stdlib.h rand"
c_rand :: CUInt -- Oops!
If you try this naïve implementation in GHCI, you will notice that c_rand
is returning always the same value:
> c_rand 1714636915 > c_rand 1714636915
indeed, we have told GHC that it is a pure function, and GHC sees no point in calculating twice the result of a pure function. Note that GHC did not give any error or warning message.
In order to make GHC understand this is no pure function, we have to use the IO monad:
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types
foreign import ccall unsafe "stdlib.h rand"
c_rand :: IO CUInt
foreign import ccall "stdlib.h srand"
c_srand :: CUInt -> IO ()
Here, we also imported the srand
function, to be able to seed the C pseudo-random generator.
> c_rand 1957747793 > c_rand 424238335 > c_srand 0 > c_rand 1804289383 > c_srand 0 > c_rand 1804289383
Working with C Pointers
editThe most useful C functions are often those that do complicated calculations with several parameters, and with increasing complexity the need of returning control codes arises. This means that a typical paradigm of C libraries is to give pointers of allocated memory as "targets" in which the results may be written, while the function itself returns an integer value (typically, if 0, computation was successful, otherwise there was a problem specified by the number). Another possibility is that the function will return a pointer to a structure (possibly defined in the implementation, and therefore unavailable to us).
As a pedagogical example, we consider the gsl_frexp
function of the GNU Scientific Library, a freely available library for scientific computation.
It is a simple C function with prototype:
double gsl_frexp (double x, int * e)
The function takes a double
x, and it returns its normalised fraction f and integer exponent e so that:
We interface this C function into Haskell with the following code:
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.Ptr
import Foreign.C.Types
import System.IO.Unsafe -- for unsafePerformIO
foreign import ccall unsafe "gsl/gsl_math.h gsl_frexp"
gsl_frexp :: CDouble -> Ptr CInt -> IO CDouble
The new part is Ptr
, which can be used with any instance of the Storable
class, among which all C types, but also several Haskell types.
Notice how the result of the gsl_frexp
function is in the IO
monad.
This is typical when working with pointers, be they used for input or output (as in this case); we will see shortly what would happen had we used a simple CDouble
for the function.
The frexp
function is implemented in pure Haskell code as follows:
frexp :: Double -> (Double, Int)
frexp x = unsafePerformIO $
alloca $ \expptr -> do
f <- gsl_frexp (realToFrac x) expptr
e <- peek expptr
return (realToFrac f, fromIntegral e)
We know that, memory management details aside, the function is pure: that's why the signature returns a tuple with f and e outside of the IO
monad.
Yet, f is provided inside of it: to extract it, we use the function unsafePerformIO, which extracts values from the IO
monad: obviously, it is legitimate to use it only when we know the function is pure, and we can allow GHC to optimise accordingly.
To allocate pointers, we use the alloca
function, which also takes responsibility for freeing memory.
As an argument, alloca
takes a function of type Ptr a -> IO b
, and returns the IO b
.
In practice, this translates to the following usage pattern with λ functions:
... alloca $ \pointer -> do
c_function argument pointer
result <- peek pointer
return result
The pattern can easily be nested if several pointers are required:
... alloca $ \firstPointer ->
alloca $ \secondPointer -> do
c_function argument firstPointer secondPointer
first <- peek firstPointer
second <- peek secondPointer
return (first, second)
Back to our frexp
function: in the λ function that is the argument to alloca
, the function is evaluated and the pointer is read immediately afterwards with peek
.
Here we can understand why we wanted the imported C function gsl_frexp
to return a value in the IO
monad: if GHC could decide when to calculate the quantity f, it would likely decide not to do it until it is necessary: that is at the last line when return
uses it, and after e has been read from an allocated, but yet uninitialised memory address, which will contain random data. In short, we want gsl_frexp
to return a monadic value because we want to determine the sequence of computations ourselves.
If some other function had required a pointer to provide input instead of storing output, one would have used the similar poke
function to set the pointed value, obviously before evaluating the function:
... alloca $ \inputPointer ->
alloca $ \outputPointer -> do
poke inputPointer value
c_function argument inputPointer outputPointer
result <- peek outputPointer
return result
In the final line, the results are arranged in a tuple and returned, after having been converted from C types.
To test the function, remember to link GHC to the GSL; in GHCI, do:
$ ghci frexp.hs -lgsl
(Note that most systems do not come with the GSL preinstalled, and you may have to download and install its development packages.)
Working with C Structures
editVery often data are returned by C functions in form of struct
s or pointers to these.
In some rare cases, these structures are returned directly, but more often they are returned as pointers; the return value is most often an int
that indicates the correctness of execution.
We will consider another GSL function, gsl_sf_bessel_Jn_e
.
This function provides the regular cylindrical Bessel function for a given order n, and returns the result as a gsl_sf_result
structure pointer.
The structure contains two double
s, one for the result and one for the error.
The integer error code returned by the function can be transformed in a C string by the function gsl_strerror
.
The signature of the Haskell function we are looking for is therefore:
BesselJn :: Int -> Double -> Either String (Double, Double)
where the first argument is the order of the cylindrical Bessel function, the second is the function's argument, and the returned value is either an error message or a tuple with result and margin of error.
Making a New Instance of the Storable
class
edit
In order to allocate and read a pointer to a gsl_sf_result
structure, it is necessary to make it an instance of the Storable
class.
In order to do that, it is useful to use the hsc2hs
program: we create first a Bessel.hsc file, with a mixed syntax of Haskell and C macros, which is later expanded into Haskell by the command:
$ hsc2hs Bessel.hsc
After that, we simply load the Bessel.hs file in GHC.
This is the first part of file Bessel.hsc:
{-# LANGUAGE ForeignFunctionInterface #-}
module Bessel (besselJn) where
import Foreign
import Foreign.Ptr
import Foreign.C.String
import Foreign.C.Types
#include <gsl/gsl_sf_result.h>
data GslSfResult = GslSfResult { gsl_value :: CDouble, gsl_error :: CDouble }
instance Storable GslSfResult where
sizeOf _ = (#size gsl_sf_result)
alignment _ = alignment (undefined :: CDouble)
peek ptr = do
value <- (#peek gsl_sf_result, val) ptr
error <- (#peek gsl_sf_result, err) ptr
return GslSfResult { gsl_value = value, gsl_error = error }
poke ptr (GslSfResult value error) = do
(#poke gsl_sf_result, val) ptr value
(#poke gsl_sf_result, err) ptr error
We use the #include
directive to make sure hsc2hs
knows where to find information about gsl_sf_result
.
We then define a Haskell data structure mirroring the GSL's, with two CDouble
s: this is the class we make an instance of Storable
.
Strictly, we need only sizeOf
, alignment
and peek
for this example; poke
is added for completeness.
sizeOf
is obviously fundamental to the allocation process, and is calculated byhsc2hs
with the#size
macro.alignment
is the size in bytes of the data structure alignment. In general, it should be the largestalignment
of the elements of the structure; in our case, since the two elements are the same, we simply useCDouble
's. The value of the argument toalignment
is inconsequential, what is important is the type of the argument.peek
is implemented using ado
-block and the#peek
macros, as shown.val
anderr
are the names used for the structure fields in the GSL source code.- Similarly,
poke
is implemented with the#poke
macro.
Importing the C Functions
editforeign import ccall unsafe "gsl/gsl_bessel.h gsl_sf_bessel_Jn_e"
c_besselJn :: CInt -> CDouble -> Ptr GslSfResult -> IO CInt
foreign import ccall unsafe "gsl/gsl_errno.h gsl_set_error_handler_off"
c_deactivate_gsl_error_handler :: IO ()
foreign import ccall unsafe "gsl/gsl_errno.h gsl_strerror"
c_error_string :: CInt -> IO CString
We import several functions from the GSL libraries: first, the Bessel function itself, which will do the actual work. Then, we need a particular function, gsl_set_error_handler_off
, because the default GSL error handler will simply crash the program, even if called by Haskell: we, instead, plan to deal with errors ourselves.
The last function is the GSL-wide interpreter that translates error codes in human-readable C strings.
Implementing the Bessel Function
editFinally, we can implement the Haskell version of the GSL cylindrical Bessel function of order n.
besselJn :: Int -> Double -> Either String (Double, Double)
besselJn n x = unsafePerformIO $
alloca $ \gslSfPtr -> do
c_deactivate_gsl_error_handler
status <- c_besselJn (fromIntegral n) (realToFrac x) gslSfPtr
if status == 0
then do
GslSfResult val err <- peek gslSfPtr
return $ Right (realToFrac val, realToFrac err)
else do
error <- c_error_string status
error_message <- peekCString error
return $ Left ("GSL error: "++error_message)
Again, we use unsafePerformIO
because the function is pure, even though its nuts-and-bolts implementation is not.
After allocating a pointer to a GSL result structure, we deactivate the GSL error handler to avoid crashes in case something goes wrong, and finally we can call the GSL function.
At this point, if the status
returned by the function is 0, we unmarshal the result and return it as a tuple.
Otherwise, we call the GSL error-string function, and pass the error as a Left
result instead.
Examples
editOnce we are finished writing the Bessel.hsc
function, we have to convert it to proper Haskell and load the produced file:
$ hsc2hs Bessel.hsc $ ghci Bessel.hs -lgsl
We can then call the Bessel function with several values:
> besselJn 0 10 Right (-0.2459357644513483,1.8116861737200453e-16) > besselJn 1 0 Right (0.0,0.0) > besselJn 1000 2 Left "GSL error: underflow"
Advanced Topics
editThis section contains an advanced example with some more complex features of the FFI.
We will import into Haskell one of the more complicated functions of the GSL, the one used to calculate the integral of a function between two given points with an adaptive Gauss-Kronrod algorithm.
The GSL function is gsl_integration_qag
.
This example will illustrate function pointers, export of Haskell functions to C routines, enumerations, and handling pointers of unknown structures.
Available C Functions and Structures
editThe GSL has three functions which are necessary to integrate a given function with the considered method:
gsl_integration_workspace * gsl_integration_workspace_alloc (size_t n);
void gsl_integration_workspace_free (gsl_integration_workspace * w);
int gsl_integration_qag (const gsl_function * f, double a, double b,
double epsabs, double epsrel, size_t limit,
int key, gsl_integration_workspace * workspace,
double * result, double * abserr);
The first two deal with allocation and deallocation of a "workspace" structure of which we know nothing (we just pass a pointer around). The actual work is done by the last function, which requires a pointer to a workspace.
To provide functions, the GSL specifies an appropriate structure for C:
struct gsl_function
{
double (* function) (double x, void * params);
void * params;
};
The reason for the void
pointer is that it is not possible to define λ functions in C, so the function cannot be partially applied with some general parameters independent of x
and these parameters are therefore passed in a pointer of an unknown type.
In Haskell, we do not need the params
element, and will consistently ignore it.
Imports and Inclusions
editWe start our qag.hsc file with the following:
{-# LANGUAGE ForeignFunctionInterface, EmptyDataDecls #-}
module Qag ( qag,
gauss15,
gauss21,
gauss31,
gauss41,
gauss51,
gauss61 ) where
import Foreign
import Foreign.Ptr
import Foreign.C.Types
import Foreign.C.String
#include <gsl/gsl_math.h>
#include <gsl/gsl_integration.h>
foreign import ccall unsafe "gsl/gsl_errno.h gsl_strerror"
c_error_string :: CInt -> IO CString
foreign import ccall unsafe "gsl/gsl_errno.h gsl_set_error_handler_off"
c_deactivate_gsl_error_handler :: IO ()
We declare the EmptyDataDecls
pragma, which we will use later for the Workspace
data type.
Since this file will have a good number of functions that should not be available to the outside world, we also declare it a module and export only the final function qag
and the gauss
- flags.
We also include the relevant C headers of the GSL.
The import of C functions for error messages and deactivation of the error handler was described before.
Enumerations
editOne of the arguments of gsl_integration_qag
is key
, an integer value that can have values from 1 to 6 and indicates the integration rule.
GSL defines a macro for each value, but in Haskell it is more appropriate to define a type, which we call IntegrationRule
.
Also, to have its values automatically defined by hsc2hs, we can use the enum
macro:
newtype IntegrationRule = IntegrationRule { rule :: CInt }
#{enum IntegrationRule, IntegrationRule,
gauss15 = GSL_INTEG_GAUSS15,
gauss21 = GSL_INTEG_GAUSS21,
gauss31 = GSL_INTEG_GAUSS31,
gauss41 = GSL_INTEG_GAUSS41,
gauss51 = GSL_INTEG_GAUSS51,
gauss61 = GSL_INTEG_GAUSS61
}
hsc2hs will search the headers for the macros and give our variables the correct values. The enum directive will define a function with an appropriate type signature for each of the enum values. The above example will get translated to something like this (with the C macros appropriately replaced by their values):
newtype IntegrationRule = IntegrationRule { rule :: CInt }
gauss15 :: IntegrationRule
gauss15 = IntegrationRule GSL_INTEG_GAUSS15
gauss21 :: IntegrationRule
gauss21 = IntegrationRule GSL_INTEG_GAUSS21
.
.
.
The variables cannot be modified and are essentially constant flags.
Since we did not export the IntegrationRule
constructor in the module declaration, but only the gauss
flags, it is impossible for a user to even construct an invalid value. One thing less to worry about!
Haskell Function Target
editWe can now write down the signature of the function we desire:
qag :: IntegrationRule -- Algorithm type
-> Int -- Step limit
-> Double -- Absolute tolerance
-> Double -- Relative tolerance
-> (Double -> Double) -- Function to integrate
-> Double -- Integration interval start
-> Double -- Integration interval end
-> Either String (Double, Double) -- Result and (absolute) error estimate
Note how the order of arguments is different from the C version: indeed, since C does not have the possibility of partial application, the ordering criteria are different than in Haskell.
As in the previous example, we indicate errors with a Either String (Double, Double)
result.
Passing Haskell Functions to the C Algorithm
edittype CFunction = CDouble -> Ptr () -> CDouble
data GslFunction = GslFunction (FunPtr CFunction) (Ptr ())
instance Storable GslFunction where
sizeOf _ = (#size gsl_function)
alignment _ = alignment (undefined :: Ptr ())
peek ptr = do
function <- (#peek gsl_function, function) ptr
return $ GslFunction function nullPtr
poke ptr (GslFunction fun nullPtr) = do
(#poke gsl_function, function) ptr fun
makeCfunction :: (Double -> Double) -> (CDouble -> Ptr () -> CDouble)
makeCfunction f = \x voidpointer -> realToFrac $ f (realToFrac x)
foreign import ccall "wrapper"
makeFunPtr :: CFunction -> IO (FunPtr CFunction)
We define a shorthand type, CFunction
, for readability. Note that the void
pointer has been translated to a Ptr ()
, since we have no intention of using it.
Then it is the turn of the gsl_function
structure: no surprises here. Note that the void
pointer is always assumed to be null, both in peek
and in poke
, and is never really read nor written.
To make a Haskell Double -> Double
function available to the C algorithm, we make two steps: first, we re-organise the arguments using a λ function in makeCfunction
; then, in makeFunPtr
, we take the function with reordered arguments and produce a function pointer that we can pass on to poke
, so we can construct the GslFunction
data structure.
Handling Unknown Structures
editdata Workspace
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_alloc"
c_qag_alloc :: CSize -> IO (Ptr Workspace)
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_free"
c_qag_free :: Ptr Workspace -> IO ()
foreign import ccall safe "gsl/gsl_integration.h gsl_integration_qag"
c_qag :: Ptr GslFunction -- Allocated GSL function structure
-> CDouble -- Start interval
-> CDouble -- End interval
-> CDouble -- Absolute tolerance
-> CDouble -- Relative tolerance
-> CSize -- Maximum number of subintervals
-> CInt -- Type of Gauss-Kronrod rule
-> Ptr Workspace -- GSL integration workspace
-> Ptr CDouble -- Result
-> Ptr CDouble -- Computation error
-> IO CInt -- Exit code
The reason we imported the EmptyDataDecls
pragma is this: we are declaring the data structure Workspace
without providing any constructor.
This is a way to make sure it will always be handled as a pointer, and never actually instantiated.
Otherwise, we normally import the allocating and deallocating routines.
We can now import the integration function, since we have all the required pieces (GslFunction
and Workspace
).
The Complete Function
editIt is now possible to implement a function with the same functionality as the GSL's QAG algorithm.
qag gauss steps abstol reltol f a b = unsafePerformIO $ do
c_deactivate_gsl_error_handler
workspacePtr <- c_qag_alloc (fromIntegral steps)
if workspacePtr == nullPtr
then
return $ Left "GSL could not allocate workspace"
else do
fPtr <- makeFunPtr $ makeCfunction f
alloca $ \gsl_f -> do
poke gsl_f (GslFunction fPtr nullPtr)
alloca $ \resultPtr -> do
alloca $ \errorPtr -> do
status <- c_qag gsl_f
(realToFrac a)
(realToFrac b)
(realToFrac abstol)
(realToFrac reltol)
(fromIntegral steps)
(rule gauss)
workspacePtr
resultPtr
errorPtr
c_qag_free workspacePtr
freeHaskellFunPtr fPtr
if status /= 0
then do
c_errormsg <- c_error_string status
errormsg <- peekCString c_errormsg
return $ Left errormsg
else do
c_result <- peek resultPtr
c_error <- peek errorPtr
let result = realToFrac c_result
let error = realToFrac c_error
return $ Right (result, error)
First and foremost, we deactivate the GSL error handler, that would crash the program instead of letting us report the error.
We then proceed to allocate the workspace; notice that, if the returned pointer is null, there was an error (typically, too large size) that has to be reported.
If the workspace was allocated correctly, we convert the given function to a function pointer and allocate the GslFunction
struct, in which we place the function pointer.
Allocating memory for the result and its error margin is the last thing before calling the main routine.
After calling, we have to do some housekeeping and free the memory allocated by the workspace and the function pointer. Note that it would be possible to skip the bookkeeping using ForeignPtr
, but the work required to get it to work is more than the effort to remember one line of cleanup.
We then proceed to check the return value and return the result, as was done for the Bessel function.
Self-Deallocating Pointers
editIn the previous example, we manually handled the deallocation of the GSL integration workspace, a data structure we know nothing about, by calling its C deallocation function. It happens that the same workspace is used in several integration routines, which we may want to import in Haskell.
Instead of replicating the same allocation/deallocation code each time, which could lead to memory leaks when someone forgets the deallocation part, we can provide a sort of "smart pointer", which will deallocate the memory when it is not needed any more.
This is called ForeignPtr
(do not confuse with Foreign.Ptr
: this one's qualified name is actually Foreign.ForeignPtr
!).
The function handling the deallocation is called the finalizer.
In this section we will write a simple module to allocate GSL workspaces and provide them as appropriately configured ForeignPtr
s, so that users do not have to worry about deallocation.
The module, written in file GSLWorkspace.hs, is as follows:
{-# LANGUAGE ForeignFunctionInterface, EmptyDataDecls #-}
module GSLWorkSpace (Workspace, createWorkspace) where
import Foreign.C.Types
import Foreign.Ptr
import Foreign.ForeignPtr
data Workspace
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_alloc"
c_ws_alloc :: CSize -> IO (Ptr Workspace)
foreign import ccall unsafe "gsl/gsl_integration.h &gsl_integration_workspace_free"
c_ws_free :: FunPtr( Ptr Workspace -> IO () )
createWorkspace :: CSize -> IO (Maybe (ForeignPtr Workspace) )
createWorkspace size = do
ptr <- c_ws_alloc size
if ptr /= nullPtr
then do
foreignPtr <- newForeignPtr c_ws_free ptr
return $ Just foreignPtr
else
return Nothing
We first declare our empty data structure Workspace
, just like we did in the previous section.
The gsl_integration_workspace_alloc
and gsl_integration_workspace_free
functions will no longer be needed in any other file: here, note that the deallocation function is called with an ampersand ("&"), because we do not actually want the function, but rather a pointer to it to set as a finalizer.
The workspace creation function returns a IO (Maybe) value, because there is still the possibility that allocation is unsuccessful and the null pointer is returned. The GSL does not specify what happens if the deallocation function is called on the null pointer, so for safety we do not set a finalizer in that case and return IO Nothing
; the user code will then have to check for "Just
-ness" of the returned value.
If the pointer produced by the allocation function is non-null, we build a foreign pointer with the deallocation function, inject into the Maybe
and then the IO
monad. That's it, the foreign pointer is ready for use!
This function requires object code to be compiled, so if you load this module with GHCI (which is an interpreter) you must indicate it:
$ ghci GSLWorkSpace.hs -fobject-code Or, from within GHCI: > :set -fobject-code> :load GSLWorkSpace.hs |
The qag.hsc file must now be modified to use the new module; the parts that change are:
{-# LANGUAGE ForeignFunctionInterface #-}
-- [...]
import GSLWorkSpace
import Data.Maybe(isNothing, fromJust)
-- [...]
qag gauss steps abstol reltol f a b = unsafePerformIO $ do
c_deactivate_gsl_error_handler
ws <- createWorkspace (fromIntegral steps)
if isNothing ws
then
return $ Left "GSL could not allocate workspace"
else do
withForeignPtr (fromJust ws) $ \workspacePtr -> do
-- [...]
Obviously, we do not need the EmptyDataDecls
extension here any more; instead we import the GSLWorkSpace
module, and also a couple of nice-to-have functions from Data.Maybe
.
We also remove the foreign declarations of the workspace allocation and deallocation functions.
The most important difference is in the main function, where we (try to) allocate a workspace ws
, test for its Just
ness, and if everything is fine we use the withForeignPtr
function to extract the workspace pointer. Everything else is the same.
Calling Haskell from C
editSometimes it is also convenient to call Haskell from C, in order to take advantage of some of Haskell's features which are tedious to implement in C, such as lazy evaluation.
We will consider a typical Haskell example, Fibonacci numbers. These are produced in an elegant, haskellian one-liner as:
fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)
Our task is to export the ability to calculate Fibonacci numbers from Haskell to C.
However, in Haskell, we typically use the Integer
type, which is unbounded: this cannot be exported to C, since there is no such corresponding type.
To provide a larger range of outputs, we specify that the C function shall output, whenever the result is beyond the bounds of its integer type, an approximation in floating-point.
If the result is also beyond the range of floating-point, the computation will fail.
The status of the result (whether it can be represented as a C integer, a floating-point type or not at all) is signalled by the status integer returned by the function.
Its desired signature is therefore:
int fib( int index, unsigned long long* result, double* approx )
Haskell Source
editThe Haskell source code for file fibonacci.hs is:
{-# LANGUAGE ForeignFunctionInterface #-}
module Fibonacci where
import Foreign
import Foreign.C.Types
fibonacci :: (Integral a) => [a]
fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)
foreign export ccall fibonacci_c :: CInt -> Ptr CULLong -> Ptr CDouble -> IO CInt
fibonacci_c :: CInt -> Ptr CULLong -> Ptr CDouble -> IO CInt
fibonacci_c n intPtr dblPtr
| badInt && badDouble = return 2
| badInt = do
poke dblPtr dbl_result
return 1
| otherwise = do
poke intPtr (fromIntegral result)
poke dblPtr dbl_result
return 0
where
result = fibonacci !! (fromIntegral n)
dbl_result = realToFrac result
badInt = result > toInteger (maxBound :: CULLong)
badDouble = isInfinite dbl_result
When exporting, we need to wrap our functions in a module (it is a good habit anyway).
We have already seen the Fibonacci infinite list, so let's focus on the exported function: it takes an argument, two pointers to the target unsigned long long
and double
, and returns the status in the IO
monad (since writing on pointers is a side effect).
The function is implemented with input guards, defined in the where
clause at the bottom. A successful computation will return 0, a partially successful 1 (in which we still can use the floating-point value as an approximation), and a completely unsuccessful one will return 2.
Note that the function does not call alloca
, since the pointers are assumed to have been already allocated by the calling C function.
The Haskell code can then be compiled with GHC:
ghc -c fibonacci.hs
C Source
editThe compilation of fibonacci.hs has spawned several files, among which fibonacci_stub.h, which we include in our C code in file fib.c:
#include <stdio.h>
#include <stdlib.h>
#include "fibonacci_stub.h"
int main(int argc, char *argv[]) {
if (argc < 2) {
printf("Usage: %s <number>\n", argv[0]);
return 2;
}
hs_init(&argc, &argv);
const int arg = atoi(argv[1]);
unsigned long long res;
double approx;
const int status = fibonacci_c(arg, &res, &approx);
hs_exit();
switch (status) {
case 0:
printf("F_%d: %llu\n", arg, res);
break;
case 1:
printf("Error: result is out of bounds\n");
printf("Floating-point approximation: %e\n", approx);
break;
case 2:
printf("Error: result is out of bounds\n");
printf("Floating-point approximation is infinite\n");
break;
default:
printf("Unknown error: %d\n", status);
}
return status;
}
The notable thing is that we need to initialise the Haskell environment with hs_init
, which we call passing it the command-line arguments of main; we also have to shut Haskell down with hs_exit()
when we are done.
The rest is fairly standard C code for allocation and error handling.
Note that you have to compile the C code with GHC, not your C compiler!
ghc -no-hs-main fib.c fibonacci.o -o fib
You can then proceed to test the algorithm:
./fib 42 F_42: 267914296 $ ./fib 666 Error: result is out of bounds Floating-point approximation: 6.859357e+138 $ ./fib 1492 Error: result is out of bounds Floating-point approximation is infinite ./fib -1 fib: Prelude.(!!): negative index