Modeling Vital Signs with Gaussian Processes
How to model a time series with a Gaussian process, plots included
As an alternative to linear regression, a Gaussian process predicts a distribution for each input, giving confidence intervals rather than a single hard prediction. Here, we use the GPyTorch library to model patient vital sign data from the 2019 PhysioNet Sepsis Challenge: https://physionet.org/content/challenge-2019/1.0.0/
Note: Much of this tutorial follows https://docs.gpytorch.ai/en/v1.1.1/examples/01_Exact_GPs/Simple_GP_Regression.html, but I wanted to document my own use case. This is not an introduction to Gaussian processes, but a tutorial on how to apply GPyTorch.
Motivation: A formidable obstacle to modeling health data is that it is very sparse. The application I have in mind for using a Gaussian process to model a patient's vital signs is imputing the missing values.
We'll be using a random patient from the sepsis data provided by https://physionet.org/content/challenge-2019/1.0.0/ .
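The snippets below assume the standard imports, roughly:

# Assumed setup for the code in this post
import pandas as pd
import torch
import gpytorch
import matplotlib.pyplot as plt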
Let's check out patient p005779:
df = pd.read_csv('p005779.psv', sep='|')
df.head()
df[['HR','O2Sat', 'Temp', 'SBP', 'MAP', 'DBP', 'Resp']].isnull().sum()
Temperature has a high number of missing values, so we'll use it for modeling.
GPyTorch models expect torch tensors, so we convert the temperature column:
temp = torch.tensor(df.Temp.values, dtype=torch.float32)
Because of the missing values, our training x and y points need to be chosen as:
- x = the index positions where temperature is not null
- y = the observed (non-null) temperature values
# Index positions of the non-null readings (GP inputs should be floating point)
train_x = torch.where(~torch.isnan(temp))[0].float()
train_x
# The observed (non-null) temperature values
train_y = temp[~torch.isnan(temp)]
train_y
Similarly to PyTorch, you must define a class for each model you construct. With GPyTorch, there are 5 objects you must specify:
- GP model: gpytorch.models.ExactGP
- Likelihood: gpytorch.likelihoods.GaussianLikelihood()
- Mean: gpytorch.means.ConstantMean()
- Kernel: gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
- Distribution: gpytorch.distributions.MultivariateNormal()
These are all the "default" settings for a Gaussian process. If you had to choose one to tweak, choose the kernel (see the sketch after the model class below). A good place to learn more is the kernel cookbook: https://www.cs.toronto.edu/~duvenaud/cookbook/
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
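If you do want to experiment with the kernel, a minimal sketch of that change (assuming a Matern 3/2 kernel, which is just one reasonable alternative, not something used in this post) swaps the covariance module in __init__:

# Hypothetical tweak inside ExactGPModel.__init__: a Matern 3/2 kernel instead of RBF,
# which assumes a rougher underlying function
self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=1.5))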
# Initialize the likelihood and the model with our training data
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
training_iter = 10000
# Find optimal model hyperparameters
model.train()
likelihood.train()
# Use the adam optimizer
optimizer = torch.optim.Adam([
    {'params': model.parameters()},  # Includes GaussianLikelihood parameters
], lr=0.1)
# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
for i in range(training_iter):
    # Zero gradients from previous iteration
    optimizer.zero_grad()
    # Output from model
    output = model(train_x)
    # Calc loss and backprop gradients
    loss = -mll(output, train_y)
    loss.backward()
    if i % 1000 == 0:
        print('Iter %d/%d - Loss: %.3f   lengthscale: %.3f   noise: %.3f' % (
            i, training_iter, loss.item(),
            model.covar_module.base_kernel.lengthscale.item(),
            model.likelihood.noise.item()
        ))
    optimizer.step()
Now we get our predictions from the model:
# Get into evaluation (predictive posterior) mode
model.eval()
likelihood.eval()
# Make predictions by feeding model through likelihood
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    # Predict at every index, including where Temp is missing (float dtype to match train_x)
    test_x = torch.tensor(df.index.values, dtype=torch.float32)
    observed_pred = likelihood(model(test_x))
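Since the motivation was imputation, here is a minimal sketch of how the posterior mean could fill in the missing readings (Temp_imputed is a hypothetical column name, not part of the dataset):

# Sketch: fill missing temperature values with the GP posterior mean
missing = df['Temp'].isnull().values
df['Temp_imputed'] = df['Temp'].copy()
df.loc[missing, 'Temp_imputed'] = observed_pred.mean.numpy()[missing]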
The GPyTorch tutorial provides matplotlib code for plotting the mean and confidence intervals (two standard deviations above and below the mean) of our posterior:
with torch.no_grad():
    # Initialize plot
    f, ax = plt.subplots(1, 1, figsize=(8, 6))
    # Get upper and lower confidence bounds
    lower, upper = observed_pred.confidence_region()
    # Plot training data as black stars
    ax.plot(train_x.numpy(), train_y.numpy(), 'k*')
    # Plot predictive means as blue line
    ax.plot(test_x.numpy(), observed_pred.mean.numpy(), 'b')
    # Shade between the lower and upper confidence bounds
    ax.fill_between(test_x.numpy(), lower.numpy(), upper.numpy(), alpha=0.5)
    ax.legend(['Observed Data', 'Mean', 'Confidence'])
But we can do better. PyMC3 ships a plotting helper, plot_gp_dist, that lets us draw samples from our posterior and plot them all at once.
from pymc3.gp.util import plot_gp_dist

# Draw 100 sample functions from the posterior predictive distribution
gp_samples = torch.stack([observed_pred.sample() for i in range(100)])

fig, ax = plt.subplots(figsize=(8, 6))
# Plot the sampled curves, shaded by density
plot_gp_dist(ax, gp_samples.numpy(), test_x.numpy())
# Overlay the observed temperature readings
df.reset_index().plot(kind='scatter', x='index', y='Temp', c='k', s=50, ax=ax)
With just 100 samples we get a clear picture of how the Gaussian process mean fits through our data. Beyond just looking cool, graphing this way helps illustrate how the model's predictions are generated from a posterior distribution.