Wednesday, March 22, 2017

The Haskell Tool Stack for Multi-Package Projects


I organize my code in projects that comprise several packages but are managed in a single git repository; a single repository coordinates changes across the packages, so that consistent states of the code can be reproduced. In this post I document how to deal with such arrangements using the new Haskell Stack build tool; a discussion of multi-package project organization as such is postponed to another time.

The Stack project is an effort to overcome one of the major impediments of complex Haskell projects. Haskell packages are versioned, and cabal and the linker check that

  • the dependencies between packages and their versions are satisfied, i.e. an acceptable version of each dependency (used package) is present, and
  • only one version of each package is present in the build.

It may be difficult to find a consistent set of versions of the required packages; finding such sets is partially automated by cabal install. Two major issues are observed:
  • solutions are not really reproducible and may change through actions outside of the programmer's control (e.g. new package versions appear)
  • the algorithm may not find an acceptable solution; with some tricks and detective work a solution can sometimes be found, sometimes not. The resulting frustration is known as "cabal hell"

Haskell Stack overcomes these impediments by proposing curated sets of package versions which are known to fit together; these are known as snapshots or "lts" and are imported from Stackage (similar to the way cabal imports version data from Hackage).

Using the Haskell Tool Stack starts with following the installation guidelines; this installs a complete Haskell environment in locations different from those GHC and cabal usually use (i.e. you get a new copy of ghc in ~/.stack).

Directories used
  • ghc and other global things go into ~/.stack (the result of getAppUserDataDirectory "stack")
  • binaries go into ~/.local/bin

The guidelines to install stack and to run a simple new project are easy to follow and work; in the following, I assume an installed stack.

The example project used here consists of two packages, a and b: a is a library (with a test suite), and b is an executable using some of the functions in a.

Stack relies on a stack.yaml file, which for this project is:

flags: {}
extra-package-dbs: []
packages:
- a
- b
extra-deps: []
resolver: lts-8.2

This fixes the snapshot to lts-8.2 (the newest in March 2017 would be lts-8.5), which uses ghc 8.0.2 (lts-7.20 is the latest for ghc 8.0.1; for others, see Stackage). As long as the resolver of a project is not changed, the same versions of the packages are used and the build is repeatable.

The difference from the stack.yaml file of a single-package project is that the packages entry
packages:
- "."
is replaced by a list of the package directories.
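
To make the layout concrete, here is a minimal sketch of how such a two-package project could be scaffolded from the shell. It is an illustration, not the actual project files: the temporary project root, the source directories src/ and app/, the module A and its function greet are all assumptions, and the test suite of a is left out for brevity.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical project root; a temporary directory keeps the sketch side-effect free.
proj="$(mktemp -d)/multi"
mkdir -p "$proj/a/src" "$proj/b/app"   # src/ and app/ are assumed directory names

# Project-level stack.yaml: one entry per package directory.
cat > "$proj/stack.yaml" <<'EOF'
flags: {}
extra-package-dbs: []
packages:
- a
- b
extra-deps: []
resolver: lts-8.2
EOF

# Package a: a library exposing one (hypothetical) module.
cat > "$proj/a/a.cabal" <<'EOF'
name:                a
version:             0.1.0.0
build-type:          Simple
cabal-version:       >=1.10

library
  exposed-modules:     A
  hs-source-dirs:      src
  build-depends:       base
  default-language:    Haskell2010
EOF

cat > "$proj/a/src/A.hs" <<'EOF'
module A (greet) where

-- | Hypothetical function exported by library a.
greet :: String
greet = "hello from a"
EOF

# Package b: an executable that names package a in build-depends.
cat > "$proj/b/b.cabal" <<'EOF'
name:                b
version:             0.1.0.0
build-type:          Simple
cabal-version:       >=1.10

executable b
  main-is:             Main.hs
  hs-source-dirs:      app
  build-depends:       base, a
  default-language:    Haskell2010
EOF

cat > "$proj/b/app/Main.hs" <<'EOF'
module Main (main) where

import A (greet)

main :: IO ()
main = putStrLn greet
EOF

# List the files that were created.
find "$proj" -type f | sort
```

The point of the sketch is that b refers to a only by its package name in build-depends; it is the packages list in stack.yaml that tells Stack to take a from the local directory rather than from the snapshot.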

Extending the multi-package test project with a subdirectory containing the libraries m and n (in the librarySubdir branch) requires additions to stack.yaml:
flags: {}
extra-package-dbs: []
packages:
- a
- b
- libs/m
- libs/n
extra-deps: []
resolver: lts-8.2

Stack builds the whole project and, after changes, automatically rebuilds the parts that are affected.
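
The build step itself might look as follows when run from the directory that holds stack.yaml. This is only a sketch: it assumes that the executable of package b is also named b, and the guard and the variable ran are merely there so that the fragment does nothing harmful when stack or stack.yaml is missing.

```shell
#!/usr/bin/env bash
# Sketch: run from the repository root (the directory holding stack.yaml).
if command -v stack >/dev/null 2>&1 && [ -f stack.yaml ]; then
    stack build        # build all packages listed under "packages:"
    stack test a       # run only the test suite of package a
    stack exec b       # run the executable; assumes it is named "b"
    ran=yes
else
    echo "stack or stack.yaml not found; nothing built" >&2
    ran=no
fi
```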


P.S.
Autocompletion for stack requires
   
    eval "$(stack --bash-completion-script stack)" 

which can be included in .bashrc.
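
If the same .bashrc is shared between machines, some of which may not have stack installed, a guarded variant could be used. This is a sketch; the guard is my addition and not something stack requires.

```shell
# Fragment for ~/.bashrc: enable completion only in bash and only if stack is on the PATH.
if [ -n "${BASH_VERSION:-}" ] && command -v stack >/dev/null 2>&1; then
    eval "$(stack --bash-completion-script stack)"
fi
```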

2 comments:

  1. This is a good write up, but I'd like to point out a few minor things

    > solutions are not really reproducible and may change by actions outside of the programmers control (e.g. new package version appear)

    Actually, `cabal` has more advanced facilities for ensuring reproducibility than Stack does, as it allows to freeze the package index file (see http://cabal.readthedocs.io/en/latest/nix-local-build.html#cfg-field-index-state) as well as providing support for freeze-files. And also there's a couple of issues resulting from Stack ignoring/bypassing the system package manager and using GHC binary distributions, rather than GHC bindists which we provide e.g. over at http://downloads.haskell.org/debian/ for Debian/Ubuntu (packages for other Linux distributions are provided elsewhere) which are compiled, widely tested and optimised specifically for the respective Ubuntu/Debian release.

    > the algorithm may not find an acceptable solution; with some tricks and detective work, a solution can be found, sometimes not - the frustration is known as "cabal hell"

    If there is a solution (and Stackage snapshots are basically just that: some arbitrary solution inferrable from the version constraint meta-data), then the cabal solver will find it. In fact, cabal will be able to find solutions that Stack due to its simplistic model can't even build without jumping through hoops.

    If cabal doesn't find a solution that exists according to the meta-data, it's quite likely a bug in the cabal solver, and you should report it. We can't fix problems if we're not told about them.

    If cabal finds a solution which fails to compile, you should report it at https://github.com/haskell-infra/hackage-trustees - we usually try to fix issues in a timely manner (within 24h). But we need your help to know what needs fixing.

    This has the benefit that any problems you discover and we fix benefit everyone else using cabal.

    As to the term "Cabal Hell", I'd like to refer to https://www.well-typed.com/blog/2015/01/how-we-might-abolish-cabal-hell-part-2/ for a more accurate description of what it actually means, as people tend to blame "cabal hell" for all sorts of things.

  2. Thank you for your comment. At the moment, stack seems to be a viable solution to the things I do and the packages I use (leksah from github has an installation instruction with stack which works). The problem I encountered with cabal was mostly that the solution found was influenced by what was already loaded in ghc-pkg.
    I understand that stack uses cabal internally and is hitting a problem in cabal, where cabal reuses an ID (hash) even though the package has been recompiled and has different content.
    The layout given in this blog post is susceptible to this bug: changes in a are pulled into b only if a has not been built separately with 'stack build' first. If a is built on its own, b does not notice the change and continues to use the previous state.