KVM: PPC: Book3S HV: Clear pending decrementer exceptions on nested guest entry [Linux 5.3]

This Linux kernel change, "KVM: PPC: Book3S HV: Clear pending decrementer exceptions on nested guest entry", is included in the Linux 5.3 release. It was authored by Suraj Jitindar Singh <sjitindarsingh [at] gmail.com> on Thu Jun 20 11:46:51 2019 +1000. The commit for this change in the Linux stable tree is 3c25ab3 (patch).

KVM: PPC: Book3S HV: Clear pending decrementer exceptions on nested guest entry

If we enter an L1 guest with a pending decrementer exception then this
is cleared on guest exit if the guest has written a positive value
into the decrementer (indicating that it handled the decrementer
exception) since there is no other way to detect that the guest has
handled the pending exception and that it should be dequeued. In the
event that the L1 guest tries to run a nested (L2) guest immediately
after this and the L2 guest decrementer is negative (which is loaded
by L1 before making the H_ENTER_NESTED hcall), then the pending
decrementer exception isn't cleared and the L2 entry is blocked since
L1 has a pending exception, even though L1 may have already handled
the exception and written a positive value for its decrementer. This
results in a loop of L1 trying to enter the L2 guest and L0 blocking
the entry since L1 has an interrupt pending with the outcome being
that L2 never gets to run and hangs.

Fix this by clearing any pending decrementer exceptions when L1 makes
the H_ENTER_NESTED hcall since it won't do this if its decrementer
has gone negative, and anyway its decrementer has been communicated
to L0 in the hdec_expires field and L0 will return control to L1 when
this goes negative by delivering an H_DECREMENTER exception.

Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

This change adds or deletes 11 lines of Linux source code. The code changes to the Linux kernel are as follows.

 arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ffd891d..a104743 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4124,8 +4124,15 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,

    preempt_enable();

-   /* cancel pending decrementer exception if DEC is now positive */
-   if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu))
+   /*
+    * cancel pending decrementer exception if DEC is now positive, or if
+    * entering a nested guest in which case the decrementer is now owned
+    * by L2 and the L1 decrementer is provided in hdec_expires
+    */
+   if (kvmppc_core_pending_dec(vcpu) &&
+           ((get_tb() < vcpu->arch.dec_expires) ||
+            (trap == BOOK3S_INTERRUPT_SYSCALL &&
+             kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
        kvmppc_core_dequeue_dec(vcpu);

    trace_kvm_guest_exit(vcpu);
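
The effect of the changed condition can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct model_vcpu`, the `should_dequeue_dec_*` helpers, `model_self_check`, and the `MODEL_*` constants are hypothetical stand-ins for `vcpu->arch.dec_expires`, `kvmppc_core_pending_dec()`, `get_tb()`, `BOOK3S_INTERRUPT_SYSCALL`, and `H_ENTER_NESTED`, and the constant values are placeholders chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; the real definitions live in kernel headers. */
#define MODEL_INTERRUPT_SYSCALL 0xc00u
#define MODEL_H_ENTER_NESTED    0xf804u

/* Hypothetical, simplified view of the vcpu state the check looks at. */
struct model_vcpu {
    uint64_t dec_expires; /* timebase value at which the loaded DEC goes negative */
    uint64_t gpr3;        /* r3: hcall number when the exit was a syscall */
    bool dec_pending;     /* a decrementer exception is queued for L1 */
};

/* Old check: dequeue only if DEC is positive again at guest exit. */
static bool should_dequeue_dec_old(const struct model_vcpu *v, uint64_t now)
{
    return v->dec_pending && now < v->dec_expires;
}

/* New check: also dequeue when L1 exits to make the H_ENTER_NESTED hcall. */
static bool should_dequeue_dec_new(const struct model_vcpu *v, uint64_t now,
                                   unsigned int trap)
{
    return v->dec_pending &&
           (now < v->dec_expires ||
            (trap == MODEL_INTERRUPT_SYSCALL &&
             v->gpr3 == MODEL_H_ENTER_NESTED));
}

/* Walk through the scenario from the commit message. */
static void model_self_check(void)
{
    /* L1 loaded a negative L2 decrementer, so dec_expires is already in the past. */
    struct model_vcpu v = {
        .dec_expires = 900,
        .gpr3 = MODEL_H_ENTER_NESTED,
        .dec_pending = true,
    };
    uint64_t now = 1000;

    /* Old logic leaves the exception queued, so L2 entry stays blocked. */
    assert(!should_dequeue_dec_old(&v, now));
    /* New logic clears it because the exit is an H_ENTER_NESTED hcall. */
    assert(should_dequeue_dec_new(&v, now, MODEL_INTERRUPT_SYSCALL));

    /* Any other hcall with an expired DEC still keeps the exception pending. */
    v.gpr3 = 0;
    assert(!should_dequeue_dec_new(&v, now, MODEL_INTERRUPT_SYSCALL));
}
```

Calling `model_self_check()` from a test harness exercises the three cases. In the model, as in the patch, nothing is dequeued unless an exception is actually pending, and the positive-DEC case behaves the same under both versions of the check.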
