PanDDA

Known issues/errors when running pandda in CCP4!

Unfortunately, pandda does not currently work in the newest version of CCP4 (7.1). An update is coming, but because of the additional pressures of the pandemic I cannot give an expected release date. For now, you need to download and install an older version of CCP4 (7.0) and bring it up to update 7.0.72.
You can download the older versions of CCP4 here.

About

What is PanDDA? The PanDDA (Pan-Dataset Density Analysis) method was developed to analyse the data resulting from crystallographic fragment screening. These experiments produce a large number of datasets that potentially contain weakly bound ligands. Detecting and identifying the weak signal caused by a binding ligand requires a sensitive and objective data-analysis method.

What's this page for? This page discusses the usage of the pandda programs. The tutorials page guides you through a basic example and gives an overview of the standard pandda protocol; the strategies page contains help for several common situations/approaches; and the manual page provides an inevitably incomplete list of all of the command-line options available.

For more details on the methods and algorithm, please refer to the paper:

Pearce, N. M. et al. (2017) ‘A multi-crystal method for extracting obscured crystallographic states from conventionally uninterpretable electron density', Nature Communications.

For more information about modelling, it may be useful to refer to:

Pearce, N. M., Krojer, T. and von Delft, F. (2017) ‘Proper modelling of ligand binding requires an ensemble of bound and unbound states’, Acta Crystallographica Section D Structural Biology.

Input

The input to a PanDDA analysis is a series of refined crystallographic datasets of the same crystal system. The datasets do not need to be strictly isomorphous, but for best results they should have the same solvent and buffer molecules. The only systematic difference between the datasets should be the presence of different ligands.

Output

The output of a PanDDA analysis is a series of ligand-bound protein models (modelled manually with coot) and the associated evidence for the bound ligands. Multi-state ensembles are automatically generated, representing the superposition of bound and unbound conformations present in the crystal.

Notices

The documentation on these pages is for PanDDA v0.2.X.
PanDDA v0.1.X is now obsolete and should not be used.

Availability/Download

PanDDA is developed and tested on Mac and Linux. It has not been tested on Windows.

As part of CCP4

PanDDA is now distributed as part of CCP4 (you may need to update CCP4 to the latest version to install it). Updates will be issued periodically within CCP4, but the CCP4 version will inevitably fall behind the most up-to-date version: consider updating your version from the code on bitbucket (instructions below).

Instructions for updating PanDDAs within CCP4

Latest Release Version

Due to the method used to bundle PanDDA inside CCP4, updating from the command line may not remove all of the original files from the distribution. So it's best to do an uninstall step first!

1) On the command line, uninstall the current version of PanDDA:

> ccp4-python -m pip uninstall panddas

2) It is then normally necessary to update pip and numpy:

> ccp4-python -m pip install pip --upgrade

> ccp4-python -m pip install numpy --upgrade

3) Install the newest released version using pip:

> ccp4-python -m pip install panddas

4) Science! (hopefully).
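The steps above can be collected into a single script. This is only a minimal sketch: the commands are exactly the ones listed in steps 1 to 3, but the `DRY_RUN` variable and the `run_step` and `reinstall_panddas` helper names are illustrative and not part of PanDDA or CCP4. By default the script only prints the commands; set `DRY_RUN=0` to actually run them (pip will still ask for confirmation before uninstalling).

```shell
#!/usr/bin/env bash
# Sketch: reinstall PanDDA inside CCP4 using the steps listed above.
# DRY_RUN, run_step and reinstall_panddas are illustrative names only.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

run_step() {
    # Print the command, then run it unless DRY_RUN is enabled.
    echo "> $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reinstall_panddas() {
    run_step ccp4-python -m pip uninstall panddas         # step 1: remove the bundled version
    run_step ccp4-python -m pip install pip --upgrade     # step 2: update pip
    run_step ccp4-python -m pip install numpy --upgrade   # step 2: update numpy
    run_step ccp4-python -m pip install panddas           # step 3: install the newest release
}

reinstall_panddas
```

Running the script as-is lets you check the commands before committing to them; `DRY_RUN=0 bash script.sh` (with whatever file name you saved it under) performs the reinstall.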

Latest Developer Version

To install the latest developer version: instead of step 3) above, download the newest version from bitbucket. Unzip the download and cd into the resulting directory:

> cd "/path/to/download/directory"

Then install the new version:

> ccp4-python setup.py install

Source Code

For developers, the panddas source code is freely available on bitbucket. If you're interested in contributing, please see contact details below.

Version Changes + New Features

Version 0.3 -- coming soon!

Version 0.3.0

Input Changes

Reorganisation of the input commands ("phil commands")

New Features

More informative error messages when errors occur during alignment!
ground_datasets=... option that allows the user to define the datasets to be used for map_characterisation (see strategies page: pandda.analyse).

Bug fixes

Map shearing?

Version 0.2

Version 0.2.12

(Bug) fixes

Fix issues introduced in ccp4 update 048 (the "basic_map" error).
Fix compatibility with new version of edstats.
Fix compatibility with new versions of numpy.

Improvements

pandda.analyse

New ways to decide how datasets are processed:

ground_state_datasets=[...]

Mark which datasets are to be used for map characterisation
Only these datasets will be used for characterisation
Roughly the opposite of exclude_from_characterisation=[...]

only_datasets=[...]

Mark which datasets are to be loaded by the program
Only these datasets will be loaded from the input folder
Roughly the opposite of ignore_datasets=[...]

Clearer error messages for missing cryst lines, unit cells, etc.

"There is no crystal symmetry for this structure"
"There is no unit cell information for this structure"
"There is no spacegroup information for this structure"

Can now use median map instead of mean map in z-map calculation.

average_map=mean_map or average_map=medn_map.

Can restrict map characterisation to only part of a structure
- Two different ways to mask:
- Do NOT use a very small mask around the binding site as this will cause problems -- you need to include a large enough region (domain or chain) so that the noise can be correctly characterised in each dataset.
Can restrict z-map characterisation to only part of a structure
- Define mask with:
- Change size of mask around this region:
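As an illustration of the dataset-selection and averaging options listed above for pandda.analyse (not the masking options), the sketch below assembles a command line that marks ground-state datasets and selects the median map. Several things here are assumptions rather than facts from this page: the dataset labels are hypothetical, the list is assumed to be given as a comma-separated string, the `data_dirs` and `pdb_style` values are placeholders for your own input layout, and `build_analyse_cmd` is an illustrative helper name. The script only prints the command; it does not run pandda.

```shell
#!/usr/bin/env bash
# Sketch: assemble a pandda.analyse command using ground_state_datasets
# and average_map. Labels, paths and the helper name are hypothetical.
set -eu

# Hypothetical dataset labels; the comma-separated format is an assumption.
GROUND_STATE="dataset-001,dataset-002,dataset-003"

build_analyse_cmd() {
    # $1 = ground-state dataset labels, $2 = average map type
    local ground_state="$1" average_map="$2"
    case "$average_map" in
        mean_map|medn_map) ;;   # the two values named in the release notes
        *) echo "average_map must be mean_map or medn_map" >&2; return 1 ;;
    esac
    echo "pandda.analyse data_dirs='./data/*' pdb_style='*.dimple.pdb' ground_state_datasets=${ground_state} average_map=${average_map}"
}

build_analyse_cmd "$GROUND_STATE" medn_map
```

Printing the command rather than executing it makes it easy to check the options before starting a long run.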

pandda.inspect

Generate new ligands from SMILES strings using acedrg

giant.datasets.prepare

New script for preparing data for pandda.
Automates the re-indexing of MTZ files, filling of missing reflections, transfer of R-free flags, and running of refinement pipelines (currently only dimple).
Example usages:

giant.datasets.prepare reference_pdb=ref.pdb reference_mtz=ref.mtz labelling=foldername data/*/input.mtz
giant.datasets.prepare reference_pdb=ref.pdb reference_mtz=ref.mtz labelling=filename data/*.mtz

giant.score_model

Can now supply f_label to select columns for edstats.

Changes to Inputs

pandda.analyse

Reorganisation of input phil

z_map.[...] changed to statistical_maps.[...]
min/max_build_datasets moved from analysis.[...] to statistical_maps.[...]
structure_factors=[...] moved from maps.[...] to diffraction_data.[...]
apply_b_factor_scaling=[...] moved to diffraction_data.[...]
checks.[...] moved to diffraction_data.checks.[...]
blob_search.[...] renamed to z_map_analysis

Variable name changes

calculate_first_mean_map_only changed to calculate_first_average_map_only
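If you keep old pandda.analyse arguments in a script or parameter file, the simple prefix renames above can be applied mechanically. The following is a rough sketch only: `migrate_phil_names` is an illustrative helper, the `example` sub-parameter names in the demonstration are hypothetical, and it handles just the three straightforward renames (`z_map.`, `blob_search.` and `calculate_first_mean_map_only`). The options that moved between scopes are not handled and need to be edited by hand.

```shell
#!/usr/bin/env bash
# Sketch: rewrite old (pre-0.2.12) parameter names to the new ones.
# Only simple renames are covered; moved options must be fixed manually.
set -eu

migrate_phil_names() {
    # Reads old-style arguments on stdin, writes renamed ones to stdout.
    # The (^|[^A-Za-z0-9_]) guard avoids matching inside longer names.
    sed -E \
        -e 's/(^|[^A-Za-z0-9_])blob_search\./\1z_map_analysis./g' \
        -e 's/(^|[^A-Za-z0-9_])z_map\./\1statistical_maps./g' \
        -e 's/calculate_first_mean_map_only/calculate_first_average_map_only/g'
}

# Demonstration with hypothetical sub-parameter names.
printf '%s\n' \
    'z_map.example=1' \
    'blob_search.example=2' \
    'calculate_first_mean_map_only=True' | migrate_phil_names
```

Because `z_map_analysis.` has an underscore after `z_map`, the second substitution does not re-rename the output of the first.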

Version 0.2.11

Bug fixes

Fix to giant.score_model_multiple that caused an error when a prefix was supplied.

Improvements

Dataset reflection data checks

All errors are reported rather than just exiting on first error
Errors now contain a detailed list of the missing/invalid reflections

Changes to Defaults

giant.merge_conformations

prune_duplicates_rmsd = 0.05 (used to be 0.1)

giant.quick_refine

split_conformations = False (used to be True)

Version 0.2.10

Bug fixes

Fix to giant.split_conformations regarding resetting occupancies for output

Improvements

pandda.analyse

Now runs on structures containing non-standard amino acids

giant.merge_conformations

new option: prune_duplicates_rmsd=[...]

controls threshold where alternate conformations are removed/combined

Changes to Inputs

giant.merge_conformations

Allow the major+minor occupancy multipliers to both equal 1 to allow for pre-calculated occupancies.

Version 0.1 - do not use

pandda version 0.1 has major issues affecting the usability of the output maps. Although these maps are generated correctly, they are not properly aligned to the input models. This fundamentally limits how the experimental evidence for the ligands (the event maps) can be disseminated and used by a third party to validate a model.

pandda version 0.2 and higher have fixed these issues - please use these instead.

Newer versions also have improved diagnostics for detecting artefacts in the data that can make it difficult to detect ligands.
