BioMed Research International

Volume 2016 (2016), Article ID 7060348, 11 pages

http://dx.doi.org/10.1155/2016/7060348

## Versatility of Approximating Single-Particle Electron Microscopy Density Maps Using Pseudoatoms and Approximation-Accuracy Control

^{1}IMPMC, Sorbonne Universités, CNRS UMR 7590, UPMC Univ Paris 6, MNHN, IRD UMR 206, 75005 Paris, France

^{2}Biocomputing Unit, Centro Nacional de Biotecnología, CSIC, Campus de Cantoblanco, Darwin 3, 28049 Madrid, Spain

Received 30 August 2016; Accepted 3 November 2016

Academic Editor: Elena Orlova

Copyright © 2016 Slavica Jonić and Carlos Oscar S. Sorzano. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Three-dimensional Gaussian functions have been shown to be useful in representing electron microscopy (EM) density maps for studying macromolecular structure and dynamics. Methods that require setting a desired number of Gaussian functions or a maximum number of iterations may result in suboptimal representations of the structure. An alternative is to set a desired error of approximation of the given EM map and then optimize the number of Gaussian functions to achieve this approximation error. In this article, we review different applications of such an approach, which uses spherical Gaussian functions of fixed standard deviation, referred to as pseudoatoms. Some of these applications use EM-map normal mode analysis (NMA) with an elastic network model (ENM) (for example, predicting conformational changes of macromolecular complexes or exploring actual conformational changes by normal-mode-based analysis of experimental data), while others do not use NMA (denoising of EM density maps). In applications based on NMA and ENM, the advantage of using pseudoatoms in EM-map coarse-grain models is that the ENM springs are easily assigned among neighboring grains thanks to their spherical shape and uniform size. EM-map denoising based on map coarse-graining has so far been shown only with pseudoatoms as grains.

#### 1. Introduction

Single-particle analysis is an electron microscopy (EM) technique that makes it possible to determine structures at near-atomic resolution for a wide range of macromolecular complexes [1–16]. It also makes it possible to study the conformational variability of macromolecular complexes by determining their different conformations [17–22]. These conformations are usually obtained by analyzing heterogeneity with methods that assume a small number of discrete conformations coexisting in the specimen [23–28], while several methods have recently been developed to help analyze continuous conformational changes [29–33].

EM-map representations with a reduced number of points or with a set of 3D Gaussian functions have been shown to be useful in studying macromolecular structure and dynamics [30, 33–44]. The process of representing EM maps with a set of points or 3D Gaussian functions (grains) is sometimes referred to as coarse-graining of EM maps. A typical approach to coarse-graining is a neural network clustering approach that quantizes the given EM map so that the probability density of the grains closely resembles the probability density of the given data, which makes the coarse-grain representation retain the overall shape of the structure from the given EM map [34, 36, 38–40]. This approach is referred to as Vector Quantization (VQ). A different approach is to parametrize a Gaussian Mixture Model (GMM) of the probability density function using the expectation-maximization algorithm [41, 45]. All these approaches require setting a desired (target) number of grains or a maximum number of iterations to stop the iterative procedure, which may result in suboptimal representations. Indeed, a small target number of grains or a small maximum number of iterations may lead to a small final number of grains, resulting in a model in which high-density regions are overrepresented and low-density regions are underrepresented. Furthermore, in the case of symmetrical structures, an inadequately small final number of grains can result in representations that are overall nonuniform (asymmetrical). The difficulty is thus to choose a stopping parameter that produces enough grains to represent all density regions appropriately.

An alternative is to set a desired (target) error of approximation of the given EM map and then optimize the number of Gaussian functions, their positions, and their weights to achieve the target approximation error, as in the approach that we introduced in [43]. In each iteration, this approach adds some Gaussian functions (grains) and removes others (grains with small weights or small distances are removed). We have found that this strategy of minimizing the global representation error through the controlled addition and removal of grains places new grains where they are most needed and adapts the grains near the removed ones to better represent the local intensity in the input EM map [43], which helps overcome the underrepresentation problem. For instance, we have found that symmetry is preserved in EM-map approximations obtained with this strategy for typical values of the target approximation error, such as 1–15% [30, 33, 42–44]. This method uses spherical Gaussian functions of fixed standard deviation, which we refer to as pseudoatoms. Its versatility has been shown in applications such as predicting conformational changes of macromolecular complexes, exploring actual conformational changes, analyzing continuous conformational changes, and denoising of EM density maps [30, 33, 42–44]. Some of these applications are based on EM-map normal mode analysis (NMA) with an elastic network model (ENM) [46, 47] (e.g., predicting conformational changes of macromolecular complexes or exploring actual conformational changes using normal-mode-based analysis of experimental data). In other applications, NMA is not used (e.g., denoising of EM density maps).
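
The idea of stopping on a target approximation error, rather than on a preset number of grains, can be illustrated with a minimal sketch. It is not the algorithm of [43]: it works on a 1D density profile, restricts pseudoatom centers to grid points, replaces the position and weight optimization with a simple greedy fit of the residual, and measures the error as a relative L2 norm. The function names, the error definition, and the `min_weight` and `max_iterations` parameters are all assumptions made for illustration.

```python
import math


def gaussian_profile(xs, center, sigma):
    """Sample a unit-height Gaussian of fixed standard deviation on the grid xs."""
    return [math.exp(-0.5 * ((x - center) / sigma) ** 2) for x in xs]


def l2_norm(values):
    return math.sqrt(sum(v * v for v in values))


def approximate_with_pseudoatoms(density, xs, sigma, target_error,
                                 min_weight=1e-6, max_iterations=500):
    """Greedy, error-controlled approximation (illustrative stand-in only).

    Returns (atoms, error): atoms is a sorted list of (center, weight) pairs,
    error is the relative L2 error of the final approximation.
    """
    total = l2_norm(density)
    if total == 0.0:
        return [], 0.0
    # One candidate pseudoatom per grid point, all with the same sigma.
    basis = [gaussian_profile(xs, c, sigma) for c in xs]
    basis_norm = [l2_norm(g) for g in basis]
    weights = {}                 # grid index -> pseudoatom weight
    residual = list(density)     # part of the map not yet represented

    for _ in range(max_iterations):
        if l2_norm(residual) / total <= target_error:
            break                # target approximation error reached
        # Add a grain where it is most needed: the candidate that best
        # matches the current residual (normalized correlation).
        scores = [sum(r * g for r, g in zip(residual, basis[k])) / basis_norm[k]
                  for k in range(len(xs))]
        best = max(range(len(xs)), key=lambda k: abs(scores[k]))
        w = scores[best] / basis_norm[best]
        weights[best] = weights.get(best, 0.0) + w
        residual = [r - w * g for r, g in zip(residual, basis[best])]
        # Remove grains whose weight has become negligible and return
        # their contribution to the residual.
        for k in [k for k, wk in weights.items() if abs(wk) < min_weight]:
            wk = weights.pop(k)
            residual = [r + wk * g for r, g in zip(residual, basis[k])]

    error = l2_norm(residual) / total
    atoms = sorted((xs[k], wk) for k, wk in weights.items())
    return atoms, error


# Toy profile: two well-separated blobs with heights 2 and 1.
xs = [0.5 * i for i in range(81)]
density = [2.0 * a + 1.0 * b for a, b in zip(gaussian_profile(xs, 10.0, 1.0),
                                             gaussian_profile(xs, 30.0, 1.0))]
atoms, error = approximate_with_pseudoatoms(density, xs, sigma=1.0,
                                            target_error=0.05)
```

In this toy case the number of pseudoatoms is an output of the procedure: the loop keeps adding grains until the weaker blob is also represented, which is the behavior that a preset, too small number of grains would not guarantee.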

The advantage of using pseudoatoms in applications based on NMA and ENM, with respect to other types of grains (Table 1), is their uniformity over the molecule, which is a prerequisite for a simple application of the ENM. Indeed, as pseudoatoms have a spherical shape and a uniform size over the molecule, they allow springs to be set easily among neighboring grains in the ENM. In contrast, 3D points (the so-called codebook vectors) obtained with VQ can be regarded as Gaussian functions whose standard deviation can vary over the molecule (each codebook vector is associated with a data subregion known as a Voronoi cell [34], and different subregions can have different sizes), which may make the ENM setting more complicated than with Gaussian functions of the same standard deviation. Similarly, NMA should be more complicated with the ellipsoidal Gaussian distributions obtained with the GMM approach (to the best of our knowledge, such NMA has not been reported so far). Regarding EM-map denoising (not based on NMA), only pseudoatoms have so far been reported as grains in that application of coarse-graining [44].
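
To make the point about spring assignment concrete, the following sketch connects pseudoatoms whose centers lie within a cutoff distance of each other and evaluates a harmonic energy for a displaced configuration. It assumes the common cutoff-distance rule for defining neighbors in an ENM. The function names, the cutoff, the spring constant, and the coordinates are hypothetical values chosen for illustration, not parameters taken from the reviewed applications.

```python
import itertools
import math


def assign_springs(centers, cutoff):
    """Return index pairs (i, j), i < j, of pseudoatoms closer than the cutoff.

    Because all pseudoatoms share the same shape and size, a single
    distance cutoff is enough to define neighbors.
    """
    return [(i, j)
            for (i, a), (j, b) in itertools.combinations(enumerate(centers), 2)
            if math.dist(a, b) <= cutoff]


def elastic_energy(reference, displaced, springs, stiffness=1.0):
    """Harmonic energy of a displaced configuration relative to the reference."""
    energy = 0.0
    for i, j in springs:
        rest_length = math.dist(reference[i], reference[j])
        length = math.dist(displaced[i], displaced[j])
        energy += 0.5 * stiffness * (length - rest_length) ** 2
    return energy


# Four pseudoatom centers on a line; the last one is far from the others.
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
springs = assign_springs(centers, cutoff=4.0)   # -> [(0, 1), (1, 2)]

# Move the second pseudoatom by one unit along x and evaluate the energy.
moved = [centers[0], (4.0, 0.0, 0.0), centers[2], centers[3]]
energy = elastic_energy(centers, moved, springs)
```

With grains of varying size or ellipsoidal shape, a single cutoff would no longer describe neighborhood consistently, which is the complication discussed above for VQ codebook vectors and GMM components.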