Crystallographic software on malleable hardware

Z. Matěj1, K. Skovhede1,2, A. Barczyk1, C. Johnsen2, M. R. B. Kristensen2, B. Vinter2

1MAX IV Laboratory, Lund University, Sweden

2Niels Bohr Institute, University of Copenhagen, Denmark

zdenek.matej@maxiv.lu.se

X-ray laboratories have to deal with high data rates from novel high-throughput cameras. Large-scale facilities are developing plans for robust data reduction in order to moderate data streams and to simplify data storage, visualisation and analysis. For several applications the traditional strategy of storing data first and processing them later presents a bottleneck, so real-time data reduction and analysis are receiving attention. Crystallographic software has always kept up with high-performance computing. Efficient implementations of crystallographic algorithms can be found even on graphics processing units [1-3]. However, the range of “exotic” hardware for scientific computing is continually increasing and already today includes digital annealers and the first commercial quantum computers. In this work, field-programmable gate arrays (FPGAs) are used for non-trivial crystallographic data reduction. FPGAs are a kind of malleable computer hardware that has been used extensively for decades for the readout of fast X-ray cameras and in real-time applications controlling crystallographic experiments. However, the implementation of complex crystallographic analysis and data reduction codes on FPGAs is not common. The problem of azimuthal integration (AZINT) of streamed 2D-detector data for powder diffraction and small-angle scattering is chosen here and implemented on FPGAs in order to demonstrate the possibilities of this type of compute accelerator for more advanced data analysis in crystallography and other photon and neutron sciences. Future applications may include frame filtering, spot finding or classification of diffraction features.

AZINT fundamentally improves the signal-to-noise ratio and allows detection of diffraction peaks even in noisy image data. The FPGA implementation of AZINT delivers integrated diffraction patterns with fixed and extremely short latencies; the patterns can be fitted in other parts of the configurable pipeline to provide real-time feedback to the experiment. The solution can be integrated with compute infrastructures at large-scale facilities or, as an embedded device, it can increase the capability of any laboratory to handle high-throughput detector data. Azimuthal integration is the first demonstration case of a project that aims to make FPGAs easily available to scientific software developers through industrial standards such as OpenCL as well as through a free and open-source numeric algebra toolbox based on synchronous message exchange (SME) [4].

This work was made possible, among other things, by advances in FPGA platforms for data-oriented computing and by the evolution of appropriate programming models during the last decade. Compute FPGAs are excellent candidates for processing high-throughput detector data. All the tasks, from receiving and decompressing the camera image stream to the final AZINT computation, can be handled on a single device. Initial benchmarks show that an SME-based implementation of a histogram computation, which is the basis of AZINT, can process an uncompressed data stream at 600 Gb/s [5].
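To illustrate why a histogram computation is the basis of AZINT, the following is a minimal NumPy sketch, not the SME or FPGA implementation referred to above. It assumes equal-width radial bins in pixel units around a known beam centre, with no geometry corrections and no pixel splitting; the function name `azimuthal_integrate` and all parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def azimuthal_integrate(image, center, n_bins, mask=None):
    """Average pixel intensities in rings of equal radial width around `center`.

    Illustrative sketch only: radii are in pixel units and each pixel
    contributes to exactly one bin (no pixel splitting).
    """
    rows, cols = np.indices(image.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    r_max = radius.max()

    # Bin index of every pixel. For a fixed detector geometry this lookup
    # table can be computed once and reused for every streamed frame.
    bin_index = np.minimum((radius / r_max * n_bins).astype(np.int64), n_bins - 1)

    if mask is None:
        mask = np.ones(image.shape, dtype=bool)  # True marks a valid pixel
    idx = bin_index[mask]

    # Two histograms: summed intensity per bin and number of pixels per bin.
    sums = np.bincount(idx, weights=image[mask], minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)

    # Mean intensity per bin; bins without pixels are left at zero.
    profile = np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)
    bin_centers = (np.arange(n_bins) + 0.5) * r_max / n_bins
    return bin_centers, profile, counts


# Usage on a synthetic noisy frame with a single diffraction ring.
rng = np.random.default_rng(0)
yy, xx = np.indices((257, 257))
rr = np.hypot(yy - 128, xx - 128)
expected_counts = 2.0 + 5.0 * np.exp(-0.5 * ((rr - 60.0) / 2.0) ** 2)
frame = rng.poisson(expected_counts).astype(float)
r, intensity, n_pixels = azimuthal_integrate(frame, (128, 128), 90)
```

Each output bin averages many pixels, which is where the improvement in signal-to-noise ratio comes from. Because the per-pixel bin index depends only on the geometry, the per-frame work reduces to the two accumulations, which is the part that maps onto a streaming histogram.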

1. G. Ashiotis, A. Deschildre, Z. Nawaz, J. P. Wright, D. Karkoulis, F. E. Picca, J. Kieffer, J. Appl. Cryst. 48, (2015), 510-519. doi: 10.1107/S1600576715004306

2. V. Favre-Nicolin, J. Coraux, M.-I. Richard, H. Renevier, J. Appl. Cryst. 44, (2011), 635-640. doi: 10.1107/S0021889811009009

3. I. Simecek, J. Rohlicek, T. Zahradnicky, D. Langr, J. Appl. Cryst. 48, (2015), 166-170. doi: 10.1107/S1600576714026466

4. K. Skovhede, B. Vinter, in FSP 2017; Fourth International Workshop on FPGAs for Software Programmers, (2017), pp. 58-65.

5. C. Johnsen, SME Binning, github.com/bh107/SME-Binning (visited on May 6th, 2019).

The eSSENCE@LU 5:10 project "Programmable hardware platform for scientific software" is gratefully acknowledged for supporting this work.