Reconstructing the Dynamics of the Outer Electron Radiation Belt by Means of the Standard and Ensemble Kalman Filter With the VERB-3D Code

.

The use of such data assimilation tools to analyze the state of the radiation belts is becoming increasingly popular. A variety of studies have used 1D radial diffusion models to apply the KF or the EKF algorithms (e.g., Naehr & Toffoletto, 2005;Koller et al., 2005;Shprits et al., 2007;Kondrashov et al., 2007;Ni et al., 2009;Daae et al., 2011;Shprits et al., 2012;Schiller et al., 2012), or the EnKF (e.g., Koller et al., 2007;Reeves et al., 2012;. Data assimilation in 1D space is useful to gain insights of the evolution of the system, but does not allow for propagation of covariances between different pitch angles and energies. Therefore, 1D approaches do not exploit the full potential of the satellite observations, and moreover, does not proper study of acceleration and loss processes. On the contrary, multidimensional models enable us to use the entire information on pitch angle distributions and energy spectra from different satellites. Up until now, only two 3D data assimilation approaches for the radiation belt region have been implemented: one for the KF and one for the EnKF. Shprits et al. (2013) introduced the "operator-splitting" technique for 3D data assimilation with the KF. The authors showed the robustness of the 3D split-KF approach and presented the evolution of PSD radial profiles resulting from assimilation of CRRES data. More recently, Cervantes et al. (2020) presented simulations using a 3D split-KF tool, that includes mixed diffusion terms in the forecast step. Bourdarie and Maget (2012) used the EnKF to reconstruct radiation belts fluxes along satellite orbit, but they did not present global evolution of reconstructed fluxes and did not validate the EnKF against KF.
The goals of this work are: (a) to investigate the convergence of the state estimate from the EnKF to the optimal estimate (i.e., the KF estimate) in a setting where the KF is optimal, and (b) to combine the operator-splitting and the EnKF approaches to obtain global reanalysis of the radiation belts. We address these goals as follows: we perform a fraternal twin experiment using a 1D radial diffusion model and synthetic data to asses the convergence of the EnKF estimate under controlled conditions, we extend the split-operator technique to the EnKF in order to develop two computationally efficient 3D EnKF approximations. We use the VERB-3D code and the new split-EnKF methods to assimilate electron fluxes from Van Allen Probes and Geostationary Operational Environmental Satellites (GOES) in the entire 3D phase space. We present the global evolution of PSD in the radiation belts obtained with the new multidimensional EnKF approaches. Finally, we validate the convergence of our EnKF simulations by performing a systematic comparison of KF and EnKF methods for radiation belt electrons. Such a validation of data assimilation methods has not been provided in previous studies. The implementation of the split-EnKF filter approaches allows us to overcome the computational limitations associated to the KF and builds the framework for further inference applications.
In the next section, we describe the physics-based model and the satellite data. In Section 3, we present the theory of the filtering algorithms. Section 4 is devoted to the results of data assimilation experiments with E ) coordinates and is known as the bounce-and drift-averaged Fokker-Planck-equation: where  0 E is the equatorial pitch angle, E p is the relativistic momentum and * (2 )/( ) 1 E L ), PSD is equal to zero to represent the losses to the atmosphere; at the upper radial boundary (  * 6.6 E L ), time-dependent PSD is estimated from GOES measurements, estimated following Subbotin et al. (2011). Setting PSD equal to zero at the lower pitch angle boundary (    0 0.3 E ), we account for electron precipitation in a weak diffusion regime  ) to describe a flat pitch angle distribution (Horne et al., 2003). At the upper energy boundary, a zero PSD boundary condition is applied representing the absence of high-energy electrons (>10 MeV), while at the lower energy boundary PSD is set constant in time to represent a balance of convective source and loss processes.

Satellite Observations
We test the new split-operator EnKF techniques using electron observations obtained from the Van Allen Probes and GOES missions for the entire month of November, 2012. This particular period is chosen, as it includes both quiet and active geomagnetic conditions, and an intense storm ( the observation operator E H maps the model space onto the observation space and accounts for differences in dimensionality between data and model, due to the sparsity of the observations. Note that the covariance update requires the model operator to be linear.

Ensemble Kalman Filter
The EnKF can be interpreted as a purely statistical Monte Carlo approximation of the KF. In other words, the optimal state of the system a k where  ens 1,…, E i N . The covariance matrix E R represents the measurement errors. Every state in the ensemble is propagated in the update step, as follows: , , where the Kalman gain ( k E K ) with the optimal weighting factors is calculated as in Equation (4).

Convergence of the EnKF to the Standard KF
It is important to note, that for a linear system and a large number of samples   ens E N the EnKF and the KF produce the same mean and covariance estimate (Mandel et al., 2011). In other words, in the linear case the EnKF converges to the KF in the limit of an infinite number of ensemble members. Since one of our goals in this work is to present a fair comparison between the KF and EnKF, we chose a rather simple implementation of the EnKF that is easy to match with the KF from a theoretical perspective but still assures convergence to the KF estimate in the linear case. For this reason, more complex tuning tools such as localization and inflation were not taken into consideration in the present study. Burgers et al. (1998) carefully revisited the analysis step of the KF and EnKF, and gave the fundamental setup of the EnKF for this convergence to hold. They showed that treating the observations as random variables allows the covariance of the analyzed ensemble a e E P (in Equation (6)) to be expressed in the same way as in the analysis error covariance of the KF, that is: The authors state, that also in the forecast step correspondence between the KF and EnKF is given, when each ensemble member evolves according to: where k i E dq is a stochastic forcing representing model errors from a distribution with zero mean and covariance E e Q , defined as: For high dimensional problems, the optimal KF shows major shortcomings in terms of computational efficiency, as operating and storing large covariance matrices make the method very computationally demanding. In this regard, the EnKF has the advantage of using each error covariance matrix for the particular time step in question and then dismissing it. It is crucial, however, that the use of the EnKF on finite ensemble sizes only provides an approximation of the KF, which makes this filtering method suboptimal. Despite the underlying Gaussian assumption, accuracy and stability have been rigorously shown for different approaches of the EnKF on linear and nonlinear operators (de Wiljes et al., 2018;de Wiljes & Tong, 2020). Shprits et al. (2013) proposed a suboptimal approximation of the KF that uses the operator-splitting method, often applied to solve partial differential equations. With this technique, the Kalman filter algorithm can be sequentially applied to the 1D diffusion operators in radial distance, energy and pitch-angle (mixed terms are neglected). Since each diffusion operates along one dimension in the model space, we can solve the equations sequentially for constant values of the other two dimensions, obtaining the solution in the entire

Operator Splitting Technique
). The update or analysis step of the KF is performed after each diffusion along one dimension. This "splitting" of the diffusions and thereby of the dimensionality of the problem allows the split-KF to operate with smaller matrices compared to the full-3D case and is, therefore, computationally much more convenient.
In this study, we use the split-operator method to separately perform data assimilation using the EnKF for each diffusion operator. This method may be viewed as a form of localisation as correlations across dimensions are not considered anymore in the filter update. Computationally, the problem is reduced to the calculation of matrices in rather manageable sizes, that is, the size of the state vector is always dimensions, and ens E N is the number of ensemble members. The f E P matrices are handled by the algorithm as 2D matrices of size (  E N N ). Therefore, even for a large ens E N , the split-EnKF approach is, as in the split-KF approach, highly computationally efficient. For these reasons, the split-EnKF approach allows to increase dimensionality and also study different filter variations. We present two new split-EnKF variations and compare them with a 1D radial diffusion EnKF (e.g., Reeves et al., 2012), a 1D radial diffusion KF (e.g., Shprits et al., 2007), and the 3D split-operator KF (e.g., Shprits et al., 2013), as listed below: 1. In order to setup the EnKF and check its convergence to the KF, we implemented the EnKF in a simple 1D radial diffusion model, named here EnKF(1D_RD), and compare the reanalysis results with a 1D-KF radial diffusion model, denoted KF(1D_RD) for simplicity 2. We solve the diffusion equation in the three space dimensions (radial, energy, and pitch-angle) sequentially and assimilate data after the calculation in each dimension using a 1D split EnKF update, that is, a total of three updates is performed. This filter approach is denoted here as EnKF(3x1D) and we 7 of 19 compare its results to the KF analogous, which uses a standard KF for the 1D split update, for simplicity called KF(3x1D). The pseudocode of this filter is given in Algorithm 1 (see Appendix). 3. Here, we solve the diffusion equation in the three space dimensions (radial, energy, and pitch-angle), but we first assimilate data using a 1D split EnKF update along the radial dimension, and then use a 2D split EnKF update for the local diffusion, meaning that the assimilation along energy and pitch-angle dimensions is computed simultaneously. We denote this filter approach as EnKF(1D_RD+2D_LD) and present its pseudocode in Algorithm 2 (see Appendix). A similar split-KF approach is rather computationally expensive, as it requires the calculation and storage of 4D forecast error covariance matrices every time step. Therefore, we compare the EnKF(1D_RD+2D_LD) with the EnKF(3x1D) and EnKF(1D_RD).

Validation
In order to validate the results of our data assimilation experiments (see next section), we calculate the value of the innovation, also called forecast error:   obs , f k k zero mean and variance of 0 5 .  z k a .

Fraternal Twin Experiment
We perform a fraternal twin experiment to asses the general results and performance of the EnKF in comparison with the KF. For the experiment, we use a model that is considerably different from the "truth," so that the assimilation results can be properly evaluated. The model for the assimilation is a 1D radial diffusion model at constant Kp  1 E (Figure (1), panel a) and 1 h time step, the synthetic data are also assimilated every hour. In order to estimate the optimal ensemble size, for which sufficient convergence of the EnKF is given, we ran several simulations for different ensemble sizes, that is, N ens  5 10 40 80 150 300 500 1500 , , , , , , , ensemble members, and compare the assimilation results with the results from KF(1D_RD). The results of our synthetic simulations are presented in Figure (2) for electrons at   700 The figure is divided into three columns. The first column (A) shows the assimilation results of KF(1D_RD) (A.I) and EnKF(1D_RD) (plots A.II to A.IX) for ensemble sizes N ens  5 10 40 80 150 300 500 1500 , , , , , , , , respectively. One advantage of performing a fraternal twin experiment is that the true PSD is actually known, allowing us to compare the simulation results directly with the true state of the system. These comparisons are presented in Column (B) as the difference between the true state (truth in Figure (1), panel b) and the assimilation results in column (A). Column (C) displays the difference between the estimate of KF(1D_RD) (plot A.I) and the estimate of EnKF(A.II to A.IX). The fraternal experiment gives us initial insights into the similarities and differences between the assimilated states of EnKF(1D_RD) and KF(1D_RD). From plots A.I and B.I, we can see that propagation effects in the KF(1D_RD) lead to a continuous increase in the width of the belt, so that the state estimate of KF reaches most of the width of the true state within the first 10 days of the simulation. The largest differences are observed at the beginning of the simulation and at the edge of the belt, where large gradients are present.
As expected, assimilation results with the EnKF improve considerably with increasing ensemble size. The estimates obtained using 5 and 10 ensemble members (A.II to A.III) are rather patchy and quite different from both, the truth and KF estimate. In general, it is expected that an ensemble size smaller than the number of grid nodes in the L-domain leads to poor results in the EnKF(1D_RD) estimate. The results with 40 and 80 ensemble members are significantly closer to the truth and to the KF estimate, but their estimates do not appear to be smooth and their convergence is somewhat slower to the truth than the KF estimate. Large differences to the true estimate are also observed at the beginning of the simulation and at the edge of the belt. From an ensemble size of 150 upwards, the resulting estimate is not only smooth but also the differences with both, the KF estimate and the truth, become negligible. For such sizes, the EnKF(1D_RD) is able to propagate the assimilated data as fast as the KF (1D_RD) The strong resemblance of panels B.VI, B.VII, B.VIII and B.IX in Figure (2) suggests that for ensemble sizes larger 150, the convergence to the KF(1D_RD) becomes so slow that even 1500 members do not lead to a significant difference. In order to look at the evolution of the filter estimates with respect to the true PSD, we calculated the Euclidean norm (2-norm) of the difference between the true state and the assimilated states at each time step, for all the simulations of our fraternal experiment (see Figure (3)). The trends observed in Figure (2 column B) are more clearly seen in Figure (3). In particular, the difference curves of the KF (blue diamonds) and the EnKF with ≥150 ensemble members (EnKF with  150 ens E N , red crosses) agree quite well with one another and show maximum variations of less than half an order of magnitude. For this reason, we consider ensembles with 150 members as sufficient to approximate the KF(1D_RD) and use this ensemble size for the assimilation of real satellite data. It is worth to mention that the use of only one synthetic data source affects the convergence speed of both filters to the true state. In the following section, we present a series of assimilation runs using real satellite data from four different spacecraft.

Reanalysis With Satellite Measurements
In this section, we present the data assimilation results using real satellite measurements for each proposed split filter together with a systematic comparison against KF filtering results.

Comparison between EnKF(1D_RD) and KF(1D_RD)
Now, that we estimated an adequate ensemble size, we can compare the reanalysis results obtained with the EnKF(1D_RD) and the KF(1D_RD). Figure ( Noticeably, panels (a), (b), and (c) reveal that both filters are able to reproduce the general features shown by the satellite observations throughout the simulated period. The difference between both simulations (panel d) allows for a more detailed overview of the filter performance. Blue tones in this plot indicate areas, where the EnKF(1D_RD) produces lower PSD values than the KF(1D_RD). Yellow to red colors indicate the Both innovation plots show very similar values and trends in time and radial distance. This indicates that the forecast state is corrected by a similar magnitude by both filters, that is, similar difference to the observations. The highest innovation values are observed at the beginning of the simulation, at times of evident magnetopause crossings (8th and 15th Nov) and throughout  16 25 E November. This indicates that the model tends to underestimate PSD at these times so that the filter applies stronger corrections to the forecast. We analyze general trends in the innovation by calculating the mean innovation at   Figure (4,II). Both curves show a very similar evolution in time, which is in agreement with panels (a) and (b). Moreover, this figure nicely visualizes the variability of both innovations during the intense storm and active times (15-25 November). Interestingly, both innovations only vary within two orders of PSD magnitude, being the only exception the major storm.
As shown in panel (d), the general difference between absolute innovations remain around 10 10 E to 8 10 E , minor variations are observed mostly during  16 25 E November. Since the underlying model is the same for both filters, these differences can only arise from fluctuations in error covariance matrices of the EnKF caused by the use of a finite ensemble size (see Equation 9). The plot in panel (d), shows the times at which the EnKF(1D_RD) imposes larger corrections on the forecast than the KF(1D_RD). In general, the EnK-F(1D_RD) and the KF(1D_RD) filters produce very similar reanalysis results.

Reanalysis Using the EnKF(3x1D) Approach
In this section, we present our first split-operator variation of the EnKF, the EnKF(3x1D). In this filtering approach, the diffusion equation is solved in the radial, energy and pitch-angle dimensions sequentially for the entire model space. After each dimensional step a 1D update step takes place using a one-dimensional EnKF, as presented in EnKF(1D-RD). The model is thereby updated three times every time step. The convergence and performance of this 3D filter approach are tested using the same data assimilation setup presented in the previous sections and it is compared to its KF analogous filter approach (here denoted KF(3x1D)), suggested by Shprits et al. (2013).  (3x1D), panel (c) shows the reanalysis of KF(3x1D) and panel (d) illustrates the PSD-difference between both reanalysis (EnKF(3x1D)-KF(3x1D)). Similar to the EnKF(1D_RD), the overall PSD features observed in the satellite measurements are well reproduced by both 3D-split filters. However, differences in PSD between EnKF(3x1D) and KF(3x1D) are in the same order as in the 1D-RD approach. During the first half of the simulation period, the EnKF(3x1D) tends to estimate higher PSD values than the KF(3x1D). For the second half of November, 2012, the trend appears to be reversed. On 15 November, when the intense storm causes the magnetopause to reach below  * 4 E L , the difference between the simulations is largest. During the active period of  16 25 E November, the KF(3x1D) that produces larger PSD-values than the EnKF(3x1D).
Resulting innovations, displayed in Figure  . Since the underlying model is the same for both filters, there are two possible reasons for these differences: (a) the use of a finite number of ensembles will also lead to discrepancies in the estimation of the covariance matrices of EnKF and KF, and (b) error propagation due to sequential application of the update step (we will extend on this topic in the discussion section). The 13 of 19 largest differences between innovations are observed around November 15, where EnKF(3x1D) reanalysis is more underestimated than the KF(3x1D) reanalysis. These features are also seen in the mean innovations above  * 3 E L (in panel three), which apart from a few time points have pretty much the same evolution and variations, remaining generally within two orders of magnitude. Overall, the EnKF(3x1D) and KF(3x1D) filters deliver a very similar reanalysis. It is important to note that the innovation of the 3D-split approaches is, in general, significantly smaller compared to 1D-RD filters. This is related to the improved underlying physics-based model and to the repetition of the 1D update step.

Reanalysis Using the EnKF(1D_RD+2D_LD) Approach
Here, we present our second split-operator approach for the EnKF. In this filtering setup, the diffusion equation is solved in the radial, energy and pitch-angle dimensions sequentially for the entire model space. After the radial step a 1D update is performed in the * E L -dimension. In contrast to the 3x1D approach, after the calculation of local processes takes place, a single combined 2D update step in the energy and pitch-angle dimensions is performed. Therefore, the model is updated twice in this approximation. To test our 2D filter approach, we use the same data assimilation setup presented in the previous sections. Since a similar KF(1D_RD+2D_LD) filter approach is numerically highly complex and therefore very computationally expensive, we compare the EnKF(1D_RD+2D_LD) to a reanalysis performed with the EnKF(1D_RD) in this section, and to the results of EnKF(3x1D) in the next section.  (1D_RD)). Both reanalysis present very similar trends overall and reproduce the main features of the satellite data. The PSD-difference between the two filters is highest on 15 November and during 16-25 November, where EnKF(1D_RD+2D_LD) produces slightly higher PSD values than EnKF(1D_RD). Interestingly, the fast losses observed on 15 November, caused by magnetopause compression, are reproduced slightly different in both filters.
Analysis of the innovations gives us detailed information about these features. Figure (6.II) presents the resulting innovations for the reanalysis with EnKF(1D_RD+2D_LD) (panel a) and with EnKF(1D_RD) (panel b). The mean innovations above  * 3 E L are in panel three, the difference between both absolute innovations (EnKF(1D_RD+2D_LD) -EnKF(1D_RD)) is in panel four), and Kp is shown in the bottom panel. The innovation plots have similar features in time and space for both simulations. The innovation difference oscillates around 8 10 E by maximal two orders of magnitude. In this case, the underlying models are different, therefore, the observed trend indicates a systematic overestimation of PSD in the 1D radial diffusion model. This is expected as the model on which EnKF(1D_RD+2D_LD) operates accounts for radial and local processes, being therefore more accurate. The mean innovations of both simulations also follow very similar trends, but the EnKF(1D_RD) curve (red line) occasionally exceeds the EnKF(1D_RD+2D_ LD) curve (black line), particularly during the second half of the simulation period (e.g., November 15, 16).
Although, both simulations converge to very similar solutions, the PSD differences reveal quite a few deviations. Particularly, large differences after the 15 November are observed, and the magentopause crossing event is particularly highlighted in this plot. A general trend toward negative numbers in panel b, indicates that the state estimates of EnKF(3x1D) have larger values than those of EnKF(1D_RD+2D_LD). The innovation difference shows only a few large values at the beginning of the simulation and during  15 25 E November, especially around November 16. This is also observed in the mean innovations. The figure in panel four also indicates that the innovation of the EnKF(1D_RD+2D_LD) has often higher values than EnKF(3x1D). In this particular case, the physical models should be theoretically the same. However, due to the different implementation of the EnKF in the two approaches, more so the total updates performed in each filter approach, the underlying models become different. The EnKF(1D_RD+2D_LD) updates the model twice and the second update occurs in energy and pitch-angle diffusion simultaneously, involving covariance matrices of sizes  2 2 ( ) E N N . This means, that spurious correlations present in the covariances will certainly lead to differences in the estimates of EnKF(1D_RD+2D_LD) compared to those of EnK-F(3x1D). Error propagation will also play a role for these two filtering approaches, but its effect on EnK-F(1D_RD+2D_LD) results could have a rather small impact.

Discussion
In this study, we developed and implemented two new split-operator approximations of the three dimensional EnKF to perform ensemble data assimilation of electron PSD in the radiation belts. We performed a fraternal twin experiment using a 1D radial diffusion model, in order to study the convergence of the EnK-F(1D_RD) to the optimal state of the system (KF(1D_RD)). While the state estimate of the EnKF(1D_RD) does improve with increasing ensemble size, comparison between the assimilation results from both 1D filters showed that 150 ensemble members are sufficient to properly approximate the KF. Differences between the EnKF(1D_RD) approximation and the optimal KF(1D_RD) are rather negligible.
Using the initial setup for the EnKF(1D_RD), we implemented the split-operator EnKF approaches for the 3D state space and modeled the global state of the outer radiation belt for the month of November, 2012. We presented detailed comparison of the split KF and EnKF filtering tools, in order to verify the accuracy of the EnKF approaches. Our results suggest that although the split KF and EnKF approaches are simple approximations of the optimal KF, they are able to reconstruct accurately the radiation belt region. Only 15 of 19 minor differences are observed at the beginning of the simulations, during active times and magnetopause compression events. This is consistent with the findings of Shprits et al. (2013) and justifies the general robustness of the split-EnKF approach.
In general, the simulations need about 3 days to level out discrepancies arising from the initial PSD. These initial errors appear to be larger in the 1D approaches, but become smaller for the (EnKF(3x1D) and EnK-F(1D_RD+2D_LD)) methods. Additionally, the observed differences may be due to spurious correlations recognized by the EnKF, which arise from the random perturbation of the observations, but are not really physical. This might be of particular importance for simulations with the EnKF(1D_RD + 2D_LD). Note that while it is true that the EnKF(1D_RD) filter converges to a reasonable solution, the reduction in the innovations of our two 3D EnKF approaches, EnKF(3x1D) and EnKF(1D_RD+2D_LD), indicates that the 3D update does allow for propagation of the satellite data to other energies and pitch angles. Therefore, a more accurate analysis is estimated, which in turn, leads to a better forecast estimate in the next time step. N is sufficient to produce accurate estimates. An additional computational benefit comes from the use of the split operator, this also applies for the EnKF(3x1D), which allows for a computational complex- in the three-dimensional setup (Shprits et al., 2013). The greatest computational benefits become more evident, when a finer grid is chosen for the EnKF(3x1D) and KF(3x1D), or for a full 3D nonlinear EnKF, which is necessary if the mixed terms in the evolution Equation (1) are being considered.
A difficulty in dealing with the split-filters lays in the correct use of the model errors. After application of the first analysis step, satellite data has been assimilated and thus improvement of the model is achieved. Therefore, for the second update step, the model errors described in matrix E Q will not be the same as in the initial setup. A more accurate approach could, for instance, include some dynamical reduction of the model errors after each update iteration. This subject belongs to uncertainty estimation and is beyond the scope of this study.

Conclusions
In this study, we setup, implement and validate two new split-operator approximations of the three dimensional EnKF, which allow us to reconstruct the entire state of the outer radiation belt. We provide a detailed comparison between different data assimilation tools using satellite observations. The main conclusions from our study are summarized below: • A fraternal twin experiment performed using an initial setup of the EnKF which is based on the KF implementation on a simple 1D radial diffusion model, allows us to find that 150 ensemble members are sufficient to accurately model the optimal state solution of the KF. • The use of the split-operator technique allows us to increase dimensionality in our simulations and tackles the issue of computational efficiency, which becomes particularly important at higher dimensions. Therefore, the new 3D split-EnKF approaches are suitable for forecasting purposes in real-time. • Our validation method suggests that the split KF and EnKF methods show similar results. The use of the new 3D approaches reduces the global innovations in comparison to 1D filters. This is partly due to the more accurate model but also due propagation of pitch angle and energy data into the model space, which yields an analysis state that is closer to the data. The use of this state estimate as initial condition in next step leads to a more accurate forecast state.
The KF(3x1D), EnKF (1D_RD+2D_LD) and EnKF(3x1D) tools are state of the art data assimilation techniques that reconstruct accurately the radiation belt region. The data assimilation tools developed in this study can be applied in the future to a variety of problems, including non-linear parameter estimation, non-linear assimilation of observations, free-prediction studies, estimation of model errors through state augmentation, a.o.