Deep learning a poroelastic rock-physics model for pressure and saturation discrimination

Determining saturation and pore pressure is relevant for hydrocarbon production as well as natural gas and CO2 storage. In this context, seismic methods provide spatially distributed data used to determine gas and fluid migration. A method is developed that allows the determination of saturation and reservoir pressure from seismic data, more accurately from the rock-physics attributes of velocity, attenuation, and density. Two rock-physics models based on Hertz-Mindlin-Gassmann and Biot-Gassmann are developed. Both generate poroelastic attributes from pore pressure, gas saturation, and other rockphysics parameters. The rock-physics models are inverted with deep neural networks to derive saturation, pore pressure, and porosity from rock-physics attributes. The method is demonstrated with a 65 m deep unconsolidated high-porosity reservoir at the Svelvik ridge, Norway. Tests for the most suitable structure of the neural network are carried out. Saturation and pressure can be meaningfully determined under the condition of a gas-free baseline with known pressure and data from an accurate seismic campaign, preferably cross-well seismic. Including seismic attenuation increases the accuracy. Although training requires hours, predictions can be made in only a few seconds, allowing for rapid interpretation of seismic results.


INTRODUCTION
The determination of gas saturation is a frequent task in hydrocarbon production (Grude et al., 2013;Calvert et al., 2016) and natural gas storage (Priolo et al., 2015) and is also highly important for CO 2 storage applications (Chadwick et al., 2010;Ivandic et al., 2012). The underlying equations are identical for all applications because the change in the elastic attributes is physically induced by the higher compressibility and lower density of gas compared to liquid, resulting in a reduced impedance. Typically, the data are acquired based on surface seismic acquisition, inverted to obtain elastic attributes, and then soft elastic attributes are correlated to the presence of gas. Because the uncertainty of the rock velocity is already high, the correlation is not very sensitive in directly obtaining the gas saturation. Nevertheless, time-lapse campaigns detect changes in the velocity, which allows us to subtract out the rock velocity. The velocity difference then can be attributed to dynamic effects, such as saturation and also to pressure (Landrø, 2001). However, seismic attributes show a much lower sensitivity to pressure compared to saturation, whereas pressure is more difficult to determine.
There is a high demand in gas storage applications to derive pressure and saturation from seismic data. For gas storage, the initial formation saturation is typically zero, which is an advantage for the method compared to hydrocarbon production, where the initial saturation is subject to significant uncertainties. However, avoiding overpressure and thereby induced potential fracturing has a high priority for gas storage applications (Castelletto et al., 2013).
Traditionally, saturation-driven changes are inverted based on the amplitude variation with offset (AVO) response in the seismic image (Landrø, 2001) or by quantifying 4D velocity changes based on multiple vintages of seismic surveys (Aarre, 2006), which can also be done with machine-learning methods (Dramsch et al., 2019). However, the AVO approach has conceptual disadvantages. Most approximations are only valid within certain offset and angle ranges and also within a certain depth interval, called the AVO window (Avseth et al., 2010). Further, the attenuation of sufficiently high frequencies limits the application in larger depths.
Data from cross-well seismic allow for higher frequencies and provide a simpler geometry that may allow a more accurate detec-tion of the shear-wave (S-wave) velocity. The nonlinear dependencies between saturation, pressure, and seismic attributes require the application of rock-physics models, providing the means for a discrimination between the different driving forces.
The use of rock-physics models also allows us to consider sitespecific data as prior knowledge by choosing an appropriate representation for the geologic conditions. The prior knowledge allows us to shift the focus to the most relevant parameters in CO 2 storage: pressure, saturation, and porosity.
A decrease in processing time, ideally in real time, increases the operational value of the acquired data (Bertrand et al., 2014). By the application of machine-learning methods, the computational effort can be reduced and a step toward faster evaluation can be made.
The presented methodology aims to support a planned near-surface CO 2 injection campaign, in which the seismic imaging is carried out with a cross-well setup. The data sets in this study are generated synthetically, with models and parameters adapted to a 65 m deep unconsolidated glacial formation, located at the Svelvik ridge, Norway (Sørensen et al., 1990).
Many potentially relevant rock-physics model formulations exist, but only some are applicable to the unconsolidated glacial deposits, with grain sizes from gravel to clay, that are present at the field site under investigation.
The first soft-sand model was developed by Mindlin (1949). Biot (1956) then develops a theory including frequency-dependent contributions for determining the poroelastic parameters. Raymer et al. (1980) propose a mixing approach to calculate poroelastic parameters for matrix and fluid phases that comprise more than one component. In this model, the poroelastic attributes compressional wave (P-wave) velocity V P , S-wave velocity V S , and density ρ are functions of the porosity ϕ, clay volume V cl , and water saturation S w . Krief et al. (1990) further alter the relationships of Raymer et al. (1980) to obtain a better fit for highly unconsolidated sediments using a porosity-dependent Biot's coefficient. Pride et al. (1992) present explicit equations of motion as well as stress/strain relations in a dynamic two-phase porous medium consisting of a fluid and matrix. Extending the work from Landrø (2001), Lang and Grana (2019) present a Bayesian rock-physics inversion discriminating pore pressure and fluid effects. The twophase fluid distribution is frequently described by the Gassmann (1951) Currently, there is a fast-growing application of deep neural networks to support interpretation and derive elastic properties from seismic data (Grana et al., 2017;Araya-Polo et al., 2018;Wu and Lin, 2018;Biswas et al., 2019;Das et al., 2019;Zheng et al., 2019;Das and Mukerji, 2020). Applications of machine learning have long been constrained by limiting computational capacities. Now, sufficiently large training data sets can be generated with forward modeling to represent multiparameter moderate complex systems, which increases effort in the development of machine-learning approaches. Gradient methods require numerically accurate forward models and have problems in resolving discrete input data . Deep neural networks do not show these disadvantages, and they are well suited to resolve the nonlinear dependencies between the petrophysical parameters and the corresponding elastic response (Raissi, 2018). Several recent studies have focused on full-waveform inversion (FWI) in the context of deep convolutional neural networks (Mosser et al., 2018;Zhang and Stewart, 2019). Compared to traditional inversion, neural networks can provide a significant improvement in turnaround time. Xue et al. (2019) apply different machine-learning techniques (e.g., neural networks and random forests) for mapping saturation changes by analyzing normalized root-mean square amplitude changes and normalized differences of the reflection coefficient.
In the present paper, deep neural networks are used as an inversion tool to determine rock-physics properties based on elastic attributes. Figure 1 shows the flow scheme of the modeling approach. The rock-physics parameters are the input to the rock-physics forward model that is used to obtain the poroelastic attributes. The training data set comprising the rock-physics parameters and resultant poroelastic attributes is then fed through a sequence of increasingly deep neural networks. Although the initial workload may be similar to a conventional inversion, the human workload for evaluating repeat surveys can be significantly reduced, as well as the time required for inversion. This allows for near-real-time results and therefore improves the operational value of seismic data (Moseley et al., 2018). The poroelastic attributes V P , V S , ρ, Q P , and Q S are taken as predetermined, either by inversion or direct measurements from cross-well experiments, serving as observation data on which rock-physics parameters will be calibrated in a similar way as Xue et al. (2019). In the present paper, porosity and pressure prior to injection are defined as additional poroelastic attributes affecting the rock physics. Saturation and pressure need to be explicitly part of the rock-physics models to allow for their calibration. Because the consequences of Figure 1. Scheme of a three-layer neural network in prediction mode and the application cases in this paper. The poroelastic attributes are at the left side, and the rock-physics parameters are at the right side. Note that, depending on the rock-physics model, only a subset of 3-5 poroelastic attributes is used. The V P , V S , and ρ are always input attributes, and the dashed attributes are case dependent. The initial pressure P 0 and depth z are not exactly rock-physics attributes, but they are on the input side because they are known and affect the physics. Cases 1-3 comprise different rock-physics parameter sets as the inversion target. These sets are simulated by different rock-physics models (indicated by the colored boxes). The structure of the paper follows the three steps of network selection, feasibility, and reservoir application, in which the state of the art is consecutively enhanced by our developments. pressure variation effects on the poroelastic attributes are typically not included in the rock-physics models, they are introduced into the appropriate formulations (Avseth et al., 2010;Lang and Grana, 2019).
For field applications, the sensitivity of pressure dependence on the poroelastic attributes may be approximated from the attributes themselves but should ideally be carried out using direct measurements from core plugs or well tests.
The current paper aims at methodological progress on two fields: first, the application of appropriate deep neural networks for seismic inversion; and second, the formulation and application of appropriate rock-physics models to distinguish pressure-and saturation-induced changes in seismic attributes.

Rock physics
Two independent rock-physics models are used for forward modeling the poroelastic attributes from rock-physics parameters. The first rock-physics model is called Hertz-Mindlin-Gassmann (HMG). It is based on the Hertz-Mindlin model in a soft-sand description (Mindlin, 1949). Although this model is strictly only valid for a single mineral component, Hossain et al. (2011) show that this limitation can be overcome and demonstrate its applicability for two or more mineral components. In the current study, the matrix is a mixture of quartz and clay and a perfectly patchy fluid distribution of the gas and water phase. The matrix and fluid phases are each described by a single effective modulus (Domenico, 1977). The dry rock-physics parameters K d , G d , and ρ d are obtained by mixing the matrix components using the Hashin and Shtrikman (1963) bounds (also see Appendix A and Appendix B for variables not closely defined in the text).
The rock-physics input parameters for HMG are the porosity ϕ, gas saturation S g , pressure P, sand/clay mixing ratio (V cl ), bulk/ shear modulus (K/G), and densities (ρ) of the frame and fluid phase. The second rock-physics model is called Biot-Gassmann (BG), and it is principally based on the poroelastic description introduced by Biot (1956). The rock-physics input parameters are similar to the HMG model. The dry moduli in the BG model are a function of the consolidation parameter cs. A higher consolidation factor increases the matrix moduli in relation to the dry bulk moduli (equations A-8 and A-9; Pride, 2005).
The fluid substitution in both models is described by the Gassmann equation (Gassmann, 1951): where K is the bulk modulus and the subscripts sat, ma, d, and fl denote saturated, matrix, dry, and fluid, respectively. For this paper, the rock-physics models are assumed to be calibrated. Although the Svelvik ridge consists of unconsolidated rock, the rock-physics model BG describes consolidated rocks. Although this is not straightforward and therefore prevents a direct application of the method to the Svelvik field site, this abstraction was carried out to demonstrate the applicability of the developed approach to real CO 2 storage formations, which are typically located in consolidated environments. Due to their complexity, the equations are not presented here. The current implementation can be found in Appendix A; for a general overview, refer to Mavko et al. (2009).

Pressure dependence
Neither rock-physics model (HMG and BG) includes a pressure dependence of the poroelastic attributes by definition. Therefore, two independent pressure dependencies are introduced to both rock-physics models. According to Mavko and Mukerji (1998), the effective pressure P eff is the overburden pressure P over minus the pore pressure P p : The pore pressure is further referred to as the baseline pressure P 0 before (time T 0 ) and the monitor pressure P 1 after (time T 1 ) injection. An increase in the pore pressure results in a decrease of the compressional forces acting at the grain contacts. As a consequence, the velocity is decreased on the increased pore pressure, also called softening.
The first velocity-pressure dependence follows Avseth et al. (2010), and it is referred to as PA: where V P;S refers to the P-and S-wave velocity and P eff;0;1 refers to the effective baseline and monitor pressures. The scaling factors a P;S are identical for both wave velocities and are kept constant at −0.2. This value is within a realistic range for shallow unconsolidated sediments, and the negative sign implies softening at the decreasing effective pressure. An accurate pressure dependence is crucial, wherefore the scaling factors should ideally be determined for field conditions, for example, by core experiments or pumping tests. The termP eff;0 is a reference pressure, typically the maximal pressure. The second velocity-pressure dependency is based on the work of Lang and Grana (2019), further referred to as pressure dependence PL. Within this description, after defining ΔP ¼ P 1 − P 0 and for the gas saturation ΔS g ¼ S g 1 − S g 0 , V P and V S are dependent on ΔS g and ΔP showing a quadratic dependence on ϕ, whereas ρ is only dependent on ϕ and ΔS g , k α ðϕ;S g ÞΔS 2 g þl α ðϕ;S g ÞΔS g þm α ðϕ;PÞΔP 2 þn α ðϕ;PÞΔP k β ðϕ;S g ÞΔS 2 g þl β ðϕ;S g ÞΔS g þm β ðϕ;PÞΔP 2 þn β ðϕ;PÞΔP k ρ ðϕ;S g ÞΔS 2 g þl ρ ðϕ;S g ÞΔS g : (5) The relative acoustic impedance (AI) change between a monitor and baseline survey is determined according to Landro et al. (1999): (6) Figure 2 shows the relative acoustic impedance change ΔAI∕AI for different porosities caused by a pore pressure and saturation increase with respect to the baseline conditions with P 0 ¼ 6.5 bar and S g;0 ¼ 0. Calculations are based on the HMG model with PA ( pressure-induced impedance change is strongly dependent on porosity in PA but only slightly in PL. Further, PA shows a higher sensitivity to small pressure changes, which is less pronounced for PL. For low porosities, the overall change in AI is similar, but for high porosities, PL shows approximately 40% higher impedance changes. Because both pressure models rely on the Gassmann equation, the impedance change due to saturation changes ΔS is identical for both (Figure 2c and 2d). The velocity changes are highest for small gas saturations and become almost linear for saturations >0.1. The lower row (Figure 2e and 2f) shows the ΔAI∕AI isolines for pressure and saturation. In both pressure models, the isolines are roughly parallel, with the saturation effect far exceeding the pressure effect on the impedance. For the given ranges, a saturation change has approximately a 10 times higher effect on the impedance than the pressure change with a higher pressure effect for PL than for PA.

Machine learning
A deep neural network acts as an inversion tool that derives rockphysics parameters plus pressure and saturation from poroelastic attributes. This is the reversed computation direction of the above-described rock-physics models. The neural network is trained with an ensemble. The training is performed with the poroelastic attributes as the network input and the rock-physics parameters as the expected network response.

Ensemble generation
For each learning case, one training ensemble of size N T is generated. Such an ensemble contains rock-physics parameters and the corresponding forward-calculated poroelastic attributes. The possible combinations of parameters and attributes are defined by the different rock-physics model formulations, and the different combinations used in the present work are defined by the three cases ( Figure 1). The rock-physics parameters are generated with a Monte Carlo approach; they are uniformly distributed within the parameter range and stochastically independent. Depending on the rockphysics inversion parameters, the remaining parameters of the rock-physics forward model are defaulted. This reduces the nonuniqueness in the inversion and allows us to focus a priori on the most unknown information. After rockphysics simulation, inputs as well as the outputs are scaled. The median is subtracted from the data, which are then divided by the range between the 25 and 75 percentiles, such that the 25 and 75 percentiles are −1 and 1, respectively. For some generated rock-physics parameter sets, the corresponding rock-physics model does not generate a solution. These sets are discarded and not included in N T . Training is carried out with the ensemble attributes as the input to the network and the rock-physics parameters as the output to the network.

Network settings
A suitable learning rate and weight decay are determined by grid search and the AdamW optimizer (Kingma and Ba, 2014;Loshchilov and Hutter, 2017). The values of 8 · 10 −4 and 1.25 · 10 −4 are used in all following neural networks. The loss function can be interpreted as an analog to an objective function in other inversion schemes. The L1 smooth loss function is used as Girshick (2015) shows that the convergence rate is increased by a factor of 3 to 10 compared to the standard L1 (see equation 7). The L1 value has the form L1 ¼ with x as the training data and y as the predicted data. Activation is achieved by a rectified linear unit (ReLU) on the nodes (Nair and Hinton, 2010). Dropout was applied to prevent overfitting (Srivastava et al., 2014). A dropout decrease strategy was applied, with 30% dropout on the first layer, decreasing by 10% for each successive layer. The results are very similar to constant dropout of 30% for each layer. Because there is no clear advantage in either of the approaches, we continued computations with dropout decrease.

Validation
Accuracy (acc) is computed based on a validation ensemble, whose members are independent from the training ensemble: with N V as the size of the validation ensemble. Based on the accuracy metric, early stopping is applied as an additional measure to counteract overfitting (Prechelt, 1998). Similar to a truncation criteria in traditional solver settings, further iteration stops. In the present approach, early stopping is applied after N S ¼ 100 epochs after which no improvement of accuracy is achieved.

EXPERIMENTAL STRATEGY
The experiments are carried out in three steps that build on one another. The network size and structure with appropriate learning rate are determined for an exemplary rock-physics model in the network selection step. The inversion feasibility is assessed for different rock physical parameterizations under error-free conditions with a 0D model during the feasibility step. The approach is then applied on a scenario-based simulation for a near-surface CO 2 migration test in the reservoir application step.

Network selection -Setup
This step should find a suitable network configuration that shows fast learning and sufficient accuracy while avoiding overfitting. It is not our aim to find the optimal network, which would require too  For each rock-physics reference value set, the poroelastic attributes are calculated. For example, case 1-B with ϕ 0.3, K d 8.0, G d 2.0, S g 0.0, P 1 6.5, and cs 5.0. The V cl is not included in feasibility. much resources and, as conditions slightly change, would probably not be optimal in the next steps.
Seven network configurations with one to three layers and 600-6000 neurons are tested (Table 1). The training ensemble is generated with the BG rock-physics model for 0D and error-free conditions. The rock-physics parameters are ϕ, K d , and G d , and the poroelastic attributes are V P , V S , and ρ. The training ensemble has 50,000 members, and the validation ensemble has 10,000 members. The rock-physics parameters are generated in wide ranges to cover all of the physically reasonable solution space (see the range in Table 2).

Network selection -Results and discussion
All networks show considerable learning. The loss functions are reduced by at least an order of magnitude, and the accuracy increases by approximately two orders of magnitude (Figure 3). Deeper networks show a considerably faster loss reduction. Networks #2 and #4, with the fewest number of neurons in the last layer, show the least loss reduction, which can be partly attributed to the decrease of the dropout, which has a higher effect on lower layers. The final accuracy ranges between 99.96% and 99.98%, which can be considered very good in terms of fit. However small, a factor of two in the misfit remains. The dropout only affects the learning phase and therefore the loss function, but during prediction all neurons are used. The accuracy is therefore calculated including the dropout neurons, wherefore smaller shallower networks show good accuracy values. Network #6 is chosen for further calculations because it shows the highest accuracy. Because dropout is affected by randomness, the ranking is a snapshot because the loss and the accuracy include a statistical component. An accurate interpretation would require us to determine the statistical components characteristics. However, as described above, this is not the aim of the network selection step.

Feasibility tests -Setup
Feasibility tests are carried out to assess the inversion power for different rock-physics models and parameterizations. The selected network #6 is applied to a training ensemble with 50,000 and a validation ensemble with 10,000 members.
Three formulations of the rock-physics models are tested in three 0D cases. This allows us to evaluate the applicability of the developed approach on different rock-physics problems for standard seismic parameters (cases 1 and 2) and as well for one example of the developed rock-physics saturation and pressure discriminations (case 3). In a fully brine saturated medium, case 1 inverts for porosity ϕ and the dry frame moduli K d ; G d . Case 2 inverts in a partially saturated medium for porosity ϕ, gas saturation S g , and the consolidation parameter cs. Cases 1 and 2 are trained on the BG model (Table 2), analogous to conventionally inverted examples by Dupuy et al. (2016). Case 3 is trained on the HMG model with pressure dependence PL.
Each case is computed with two subcases: the first comprises P-and S-wave velocity and density (V P , V S , and ρ), and the second comprises the latter plus attenuation attributes (Q P and Q S ). This should determine the impact of the two attenuation attributes on inversion quality. Each of the six subcases is trained with an individual ensemble.
For each case, three reference states are defined, named by letters A, B, and C, for example, case 1-B. For these reference states, the corresponding poroelastic attributes are inverted. The network is trained with an error-free ensemble. The statistical errors during determination of the poroelastic attributes are addressed by Monte Carlo simulations during the inversion step. For each ensemble member, an error realization is added to the reference set of poroelastic attributes prior to inversion. Three error levels (σ 1;2;3 ) are defined. They represent the accuracy for determining the poroelastic attributes with different seismic acquisition methods and subsequent inversion (see Table 3). Error 1 corresponds to a typical high-resolution surface seismic setup with errors in the range of AE100 m∕s for velocities (Table 4). Error 2 is realistic for an accurate vertical seismic profile (VSP) because the errors are reduced by a factor of two as a result of the increasing bandwidth in the range of 8-400 Hz (Charles et al., 2019). Error 3 corresponds to cross-well acquisition, as planned for the Svelvik campaign. With a pick accuracy of two samples at a sampling rate of 0.03 ms, an accuracy of 0.06 ms can be expected. This translates to an error band in velocities of approximately AE6 m∕s for a velocity of roughly 1700 m∕s.  (Table 1), using early stopping with a stop set to 100 epochs. Note: σ 1 corresponds to standard surface seismic, σ 2 corresponds to an accurate VSP setup, and σ 3 corresponds to an accurate cross-well setup. The errors are multiplied with realizations from a AE1 − σ windowed standard normal distribution and then added to the respective forward calculated poroelastic attribute. Table 4. Reference rock-physics parameters of cases 1-3 (denoted "True") and the corresponding inversion results.

AE0.02
Note: The true sets of poroelastic parameters are overlain by a measurement error of a VSP seismic (σ 2 , Table 3). The error bounds reflect only this measurement error and include 90% of the realizations. Part (a) shows case 1, which is rather a traditional setting inverting for porosity, bulk, and shear modulus in a fully saturated medium. Part (b) shows case 2, which inverts for porosity and, consolidation parameters as well as a saturation change. Part (c) shows case 3, which inverts for porosity and, clay content as well as pressure and saturation changes. Table 4 summarizes the results of feasibility test cases 1-3, calculated with error σ 2 , corresponding to a VSP acquisition. For case 1, the mean of ϕ and K d is determined accurately (less than 5% deviation), whereas G d has errors of approximately 10%-15%. Adding Q P and Q S as inputs shows moderate improvements for the deviation of the mean and also a reduction of the error bound, with the exception of G d for reference parameters A, where the deviation of the mean increases from approximately 10% to 15%. For ϕ and G d , the inversion induces a systematic bias (or epistemic error) that is larger than the aleatoric error that is induced by the seismic acquisition method. For case 2, ϕ and cs are determined very well (0%-5% deviation from the true value), with S g showing an error of 0%-20%. With adding the attenuation parameters Q P and Q S , the accuracy mostly increases and, most important, the largest error, which is the error of S g from case 2-A, is halved to 10%, which we consider acceptable. The error bounds, that represent the measurement error only, always decrease when attenuation parameters are included. The deviation of the mean tends to improve with the attenuation parameters included. However, some deviation increases because the networks with and without attenuation parameters are trained with two different ensembles, respectively. Further, the dropout is a stochastic effect for each network, wherefore the models are not perfect; that is, they have an epistemic uncertainty and the results have a model-dependent stochastic component. As an intermediate conclusion, the neural network shows a generally good inversion capability for established rockphysics models. Therefore, the analyses continue with case 3 to evaluate the applicability to invert for pressure and saturation, which is the aim of this study. The underlying rock-physics model for case 3 is HMG with the pressure dependence PL. Results are listed in Table 4c and are additionally visualized in Figure 4. The mean of the inverted rock-physics parameters deviates less than 5% from the true values, which we consider very accurate. Adding attenuation parameters has a decreasing effect on the pressure, causing a slight increase of the misfit for case 3-A and 3-B, but a slight decrease for case 3-C. Analogous to cases 1 and 2, including the attenuation parameters decreases the error bounds. The calibration quality for all three cases is generally very good.

Feasibility tests -Results and discussion
The error characteristics of the determined rock-physics parameters are shown in Figure 4. Crossplots of P 1 show the largest error clouds, especially in relation to the saturation and the clay content. The inversion for pressure would not be meaningful with a surface seismic (corresponding to error σ 1 ), but the area of the error cloud reduces by approximately one order of magnitude with a VSP and two orders of magnitude for a cross-well seismic such that the inversion appears to be quite reliable with these methods.
The feasibility test shows that the neural network can determine the rock-physics parameters generally with sufficient accuracy.

RESERVOIR APPLICATION
Because the neural network has a demonstrated ability for discrimination of pressure and saturation in a 0D approach, it is evaluated how it can be used for a field application. This is carried out exemplarily at the Svelvik field site for CO 2 storage, Norway (Weinzierl et al., 2018). Using both models (HMG and BG) each combined with both pressure dependencies (PA and PL), four networks are trained. Figure 4. Crossplot for case 3 of the feasibility test using attenuation as additional attributes. The terms σ 1 , σ 2 , and σ 3 refer to the errors of different seismic methods defined in Table 3. Feasibility test on a separation of saturation and pressure in the HMG model and pressure dependence PA with prediction results of the preferred network for case 3 using V P ; V S ; and ρ for times T 0 and T 1 . The true values are visualized as a dashed crosshair below the diagonal and as a vertical line in the histograms along the diagonal. For decreasing errors σ 1 , σ 2 , and σ 3 , contours are drawn off-diagonal. Note: M ϕ and M κ are porosity and permeability multipliers, and P ew and P eg are Brooks-Corey parameters for the two-phase flow behavior.

Reservoir application -Setup
The Svelvik site is located on a glacial ridge with the subsurface consisting of glaciofluvial sand and gravel (Sørensen et al., 1990). A glacial clay layer is present between 50 and 60 m that acts as cap rock to the reservoir (Hagby, 2018). The main properties affecting the sensitivity of the simulated injection plume extent are porosity and permeability. Formation velocities are known from the injection well. Porosities are calculated on a Greenberg-Castagna relation (Greenberg and Castagna, 1992). Permeabilities are derived from the porosities using the Kozeny-Carman equation (Carman, 1961). To evaluate the applicability of the current approach to different reservoir conditions and to detect leakage, three scenarios are analyzed.
The base scenario represents the best prior knowledge of the field site. For the low-containment scenario, the cap rock can be more easily penetrated by CO 2 , by increasing porosity and permeability, reducing the capillary entry pressure. Additionally, a lower permeability of the aquifer favors leakage. For the high scenario, the reverse is done, with Brooks-Corey parameters chosen such that no leakage occurs into the cap rock. The scenarios are derived with a multiplicator for the porosity (M ϕ ) and permeability (M κ ) ( Table 5). For geologic consistency, V cl is adjusted to the new porosities and permeabilities for the high and low cases.
The capillary pressure P c is defined by with the capillary entry pressure P e (Table 5), the water saturation S w , the residual water saturation S wr ¼ 0.28, and the saturation exponent λ ¼ 3 constant for all scenarios. In total, 23 tons of CO 2 are injected with a rate of 370 kg/d. The outer boundary conditions are no-flow with a pore volume multiplier on the outer cells. The effective reservoir volume is 9.3 million cubic meters (9.3 Mm 3 ). The reservoir model results are shown in Figure 5. In all scenarios, the pressure buildup was slightly higher than 2 bars, with the reservoir pressure increasing by approximately 2.1 bars and an additional dynamic pressure increases of 0.15 bars in the vicinity of the injection well (Figure 5a).
The CO 2 saturation reaches values of 70% close to the injection well, with lower values at a larger distance. For the high scenario, no CO 2 enters the cap rock, whereas for the low and base scenarios, considerable concentrations of up to 27% are reached (Figure 5b).
The changes of the acoustic impedance are displayed separately for the impact of the pressure and saturation changes ( Figure 5). Because most of the pressure buildup is static, the impedance ratio shows a rather flat profile (Figure 5c, 5e, and 5g). The differences for the low, base, and high case are small. The total change of pressure-induced impedance is approximately 1.5% after 23 tons of injected CO 2 . At the end of the injection, the saturation-induced acoustic impedance ratio is approximately 40% lower compared to the baseline. The saturation-induced impedance ratio differs increasingly for the scenarios with increasing simulation time, mainly because of different amounts of CO 2 migrating into the cap rock. Figure 5. Results of the reservoir model. (a and b) The pressure and saturation along a north-south profile after the injection of 23 tons of CO 2 for the base scenario. The impact of pressure and saturation on the acoustic impedance is visualized in (c-h) for increasing injection volumes in the highest reservoir layer, indicated by the white arrows in (a and b). The low, base, and high scenarios are shown in colors green, red, and blue lines, respectively. The black vertical line indicates the injection well. Figure 6. From left to right: Four rock-physics input parameters and five forward-computed poroelastic attributes calculated with the HMG model and pressure dependence PL along the injection well after 23 tons of injected CO 2 . If a dashed line is present, it refers to the situation before CO 2 injection.
Deep learning a rock-physics model MR61 Four individual networks for both rock-physics models (HMG and BG) each combined with both pressure dependencies (PA and PL) are trained with the hydrostatic initial pressure as P 0 . The networks are trained with an ensemble of 200,000 members. The reservoir model training phase (as the largest model) was finished in approximately 1 h on a GeForce GTX 1070. The prediction for 100 K configurations can be performed in less than 2 s.

Reservoir application -Results and discussion
The inversion capability is demonstrated at the injection well location because the saturation and pressure contrasts are the highest here. In the case of real-world application, the local poroelastic attributes at the Svelvik#2 injection well can be determined by 2D seismic inversion due to the cross-well setup.
The output from the reservoir model, geologic model, and the poroelastic attributes forward modeled with the HMG-PL model is shown in Figure 6. The rock-physics parameters from the reservoir simulation are ΔS g , ΔP, and the geologically derived parameters ϕ and V cl . The attribute V p is strongly affected by saturation, whereas V s is more affected by the porosity and the pressure. Similarly, Q p is more affected by saturation and Q s more by pressure. The density ρ is mainly affected by the clay content, but also by the saturation.
The inversion results of saturation, porosity, and clay content are quite close to the reference truth for most scenarios and rock-physics models. Therefore, the misfit is 10 times exaggerated for ϕ, S g , and V cl to allow for a better interpretation, but ΔP predictions are not exaggerated because they have a higher deviation (Figure 7). Porosity is inverted quite accurately, with slight advantages for the HMG model. Saturation is also generally inverted quite accurately. The highest deviations occur for regions without CO 2 saturation with differences of three saturation percentage points for the BG model in combination with the Avseth pressure dependency. The deviation is highest above the reservoir, apparently correlated to the lower reference porosity.
The pressure changes are very accurate for PL, whereas PA shows very good values only in the reservoir. Above the pressure is underestimated and below overestimated by up to 0.8 bars, apparently mainly affected by the depth and therefore by the hydrostatic pore pressure. The lowest pressures of the training ensembles are 5 bar, wherefore greater than 50 m the pressure model is undefined. Nevertheless, this hardly affects the models. PL does not show an extra error here because the pressure difference is considered, whereas PA is implemented based on the difference of the absolute values. Although PL is based on absolute pressures, the additional deviation outside the training interval is marginal. For the clay content, the training range has a more pronounced effect. For the clay content approaching the training boundary of 0.1, as between 50 and 60 m for the low-containment scenario, the error appears to be slightly higher compared to other depths. The effect is stronger and results in an overestimation when the clay content is slightly lower than the training range as between 60 and 70 m for the high-containment scenario. All inversion results are satisfying, with the best results obtained for the HMG-PL model.
The HMG model shows very good prediction quality because it refers to unconsolidated rocks. However, although the BG model is developed for consolidated environments, it shows satisfying results. Therefore, this approach is also applicable to CO 2 storage formations, which are located in deep consolidated formations. A comparison under CO 2 conditions probably would show advantages for the BG model. All simulations are calculated with individual ensembles and individual seeds for dropout, wherefore the analysis includes epistemic errors (errors that refer to the inversion method). Nevertheless, these errors are apparently smaller than the systematic deviations of the methods.
The effect of the measurement error (also referred as aleatoric error) on the inversion quality is exemplarily analyzed with the HMG-PL model for the base case. The measurement error of an accurate cross-well seismic is applied (σ 3 , Table 3). The inversion for saturation is most reliable. In the reservoir and other regions where CO 2 is present, there is a variation width of typically approximately AE2%−3% in saturation, with a maximum of almost AE5% in the reservoir, where the highest saturations are present (Figure 8). However, the simulated saturations of greater than 50%, at which the highest bandwidth occurs, might be higher than found in the field.
The error band of the pressure is approximately AE0.7 bar for regions where CO 2 is present, which is a mediocre accuracy compared to traditional pressure measurements. Nevertheless, it is considerably lower than the pressure variation itself and therefore might provide valu- Figure 7. Error-free predictions for different rock-physics formulations. HMG and BG are each combined with pressure dependence PA (Avseth et al., 2010) and PL (Lang and Grana, 2019). The rows from top to bottom show the low-, base-, and high-containment scenarios after injection of 23 tons of CO 2 . The black line is the synthetic truth, and it is referred to by the x-axis. The colored lines show the misfit of the predictions in 10 fold exaggeration, and only for the pressure difference the misfit is not exaggerated. The shaded areas in the pressure and clay content columns are outside the parameter range of the ensembles.
able information for regions with low pressure gauge coverage. Above the reservoir, however, the error bound grows to AE1 bar. Further, the values are more equally distributed and have a smaller centric tendency. It is remarkable to distinguish the pressure with such accuracy because the seismic impedance varies by 40% for saturations and only by 1.5% for pressure. With the deep neural network, it is possible to take advantage of the nonlinear effects on the different poroelastic attributes. The error bounds of the porosity are approximately AE2%, with a pronounced central tendency, which is considered quite accurate. The clay content has larger error bounds of AE5%.
However, it has to be considered that the current analysis is restricted to a subset of possible parameters. Under real-world conditions, a null space of different inversion results with equal quality of fit to the data would occur. A promising development is the recent advance in FWI techniques. They are also based on deep convolutional networks, use the full-wavefield information; therefore, they allow us to invert for high-resolution velocities. We think that their combination with our approach has the potential for providing sufficiently accurate poroelastic attributes that allow discrimination of pressure and saturation.
In this paper, the synthetic truth itself is generated by a rockphysics model. In the real world, the rock may show differences from the rock-physics model formulation. This is particularly important because for the current study the favorable assumption of an intermediate patchy gas distribution in the subsurface is made. The variation to more homogeneous gas distribution would increase the nonlinearity and, therefore, the error of the current method (Eid et al., 2015). Even more important is the difficulty in correctly defining an effective patchiness.

CONCLUSION
Two rock-physics models are developed that allow us to discriminate pressure and saturation. The first is a soft-sand formulation based on the HMG equations, and the second is a hard rock formulation based on the BG equations. Although the HMG and BG equations consider different physical processes, their forward and also inverse behavior is similar for the current parameterization, with the HMG model showing a slightly better behavior for the analyzed field example of the shallow Svelvik aquifer. The BG model is more promising for the intended real-world CO 2 storage application in deeper and consolidated formations.
It is recommended to include as many poroelastic attributes as possible; including P-and Swave attenuation, the accuracy tends to increase. Pressure inversion provides meaningful results. However, the accuracy of determining the pressure is still lower compared with the other rockphysics parameters. Differential analysis, including baseline data acquired without subsurface gas saturation, is a prerequisite for pressure and saturation discrimination. The method appears promising for gas storage and other applications, as long as the gas content of the baseline is zero such that many of the unknown errors cancel out. Compared to the traditional AVO-based methods, the rock-physics approach is a significant advance in the determination of pressure from poroelastic attributes.
Many assumptions have to be made in developing a site-specific rock-physics description of the subsurface. For the current study, the favorable assumption of an intermediate patchy gas distribution in the subsurface is made. Under the conditions of a known intermediatepatchy gas distribution, the epistemic error, that is, the error of the inversion algorithm itself, is smaller than the conceptual error of the rock physics. The latter is smaller than the aleatoric error, that is, the measurement error for an accurate cross-well seismic. For real-world applications, a sufficiently patchy gas distribution is a prerequisite.
The developed neural network was found applicable for inverting the rock-physics equations. Although the quality of neural network results may vary under different conditions, we see great potential to replace traditional inversion tools, especially if the bandwidth of the expected results is known, as is the case in gas storage applications. For industry application, the rapid results after the monitoring campaigns are a significant advantage to traditional inversion, allowing a faster reaction to unforeseen events.
An accurate determination of the poroelastic attributes is the current bottleneck of the method. Recent FWI techniques, also based on deep convolutional networks, use the full-wavefield information and therefore allow us to invert for high-resolution velocities. We think that their combination with our approach has the potential for a much better discrimination of pressure and saturation. The developed methodology may then be used to derive high-resolution petrophysical properties.

ACKNOWLEDGMENTS
This work has been produced with support from the Pre-ACT project (project no. 271497) funded by RCN (Norway), Gassnova (Norway), BEIS (UK), RVO (Netherlands), and BMWi (Germany) and cofunded by the European Commission under the Horizon Figure 8. Impact of measurement error on the prediction accuracy with the HMG-PL model at the injection well location after injection of 23 tons of CO 2 for the base scenario. Stochastic realizations are based on measurement error σ 3 (Table 3), valid for an accurate cross-well seismic. The dashed line is the reference values, and the solid line is the error-free inversion. The color refers to the density of the realizations relative to equal distribution over the respective x-axis. Max indicates a sixfold higher density compared to the equal distribution, and min a sixth in the density. White indicates no realizations. The depths follow the reservoir discretization. knowledge the industry partners for their contributions: Total, Equinor, Shell, and TAQA. We would like to thank the editor in chief J. Shragge and assistant editor E. Gasperikova as well as the three anonymous reviewers for the many constructive comments.

DATA AND MATERIALS AVAILABILITY
Data associated with this research are available and can be obtained by contacting the corresponding author.

Dry moduli
The first rock-physics model is the Hertz-Mindlin soft-sand model (HMG), in which the Gassmann equation (see equation 1) determines the bulk and shear moduli of the saturated and dry rock based on the porosity ϕ and fluid modulus: (A-1) (A-5) with the shear Poisson's ratio ν as ν ¼ 3K ma − 2G ma 2ð3K ma þ G ma Þ : (A-7) As a simplified approach, the dry moduli are calibrated with pressure P equal to the injection location at 65 m depth at 6.5 bar. It would be more accurate to calibrate each zone or formation at its depth. The second rock-physics model is the BG model, outlined in detail in Pride et al. (2004). The dry bulk and shear moduli of the rockphysics model are dependent on a consolidation parameter (cs) and are defined by : (A-9)

Solid and fluid mixing
The density of the subsurface is calculated as the matrix density and fluid density filling the pore space: ρ ¼ ϕρ fl þ ð1 − ϕÞρ ma : (A-10) We consider a mixture of quartz (K 1 ; G 1 ) and clay (K 2 ; G 2 ) with the corresponding volume fractions (f 1 ¼ ð1 − V cl Þ; f 2 ¼ V cl ). For averaging, we choose the Hashin-Shtrikman method with the upper and lower bounds obtained by interchanging subscripts 1 and 2, respectively: f 2 ðG 2 −G 1 Þ −1 þ2f 1 ðK 1 þ2G 1 Þ∕½5G 1 ðK 1 þ4G 1 ∕3Þ : (A-12) In our case, matrix constituents are mixed with S m ¼ 0.5 yielding the arithmetic average of the lower and upper Hashin-Shtrikman bounds: K ma ¼ S m K HSþ þ ð1 − S m ÞK HS− : (A-13) For both models, fluid mixing is achieved according to (Brie et al., 1995): with the exponent set fixed to e ¼ 5.

Viscoelasticity
The velocity and attenuation are calculated based on the bulk modulus K fl , density ρ fl , and viscosity η. The complex permeability κðωÞ is dependent on the permeability κ 0 and the angular frequency ω: (A-15) with κ 0 ¼ 10 −12 ½m 2 being fixed. The angular frequency ω c is defined as Deep learning a rock-physics model