A matrix \(\mathbf{W}\) transforms data from \(\mathbf{X}\) to \(\mathbf{Z}\) space. The matrix \(\mathbf{R}\) is constrained to approximate the inverse of \(\mathbf{W}\) with a reconstruction loss \(||\mathbf{x} - \mathbf{\hat{x}}||^2\). The likelihood of the data is efficiently optimized with respect to both \(\mathbf{W}\) and \(\mathbf{R}\) by approximating the gradient of the log Jacobian determinant with the learned inverse.