From: marc.zyngier@arm.com (Marc Zyngier)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 07/10] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
Date: Wed, 17 Jun 2015 13:23:04 +0100
Message-ID: <558166A8.3070901@arm.com>
In-Reply-To: <55815F4C.2090602@linaro.org>

Hi Eric,

On 17/06/15 12:51, Eric Auger wrote:
> Hi Marc,
> On 06/08/2015 07:04 PM, Marc Zyngier wrote:
>> To allow a HW interrupt to be injected into a guest, we lookup the
>> guest virtual interrupt in the irq_phys_map rbtree, and if we have
>> a match, encode both interrupts in the LR.
>>
>> We also mark the interrupt as "active" at the host distributor level.
>>
>> On guest EOI on the virtual interrupt, the host interrupt will be
>> deactivated.
>
> a "standard" physical IRQ would be first handled by the host handler
> which would ack and deactivate it a first time. Here, if my
> understanding is correct, the virtual counter PPI never hits. Instead we
> "emulate" it on world-switch by directly setting the dist state. Is that
> correct? If yes it is quite a specific handling of an "HW" IRQ.

This is (mostly) correct. Because we deal with HW that is shared between
guests, we absolutely need to make that HW quiescent before getting back
to the host. Setting the active bit in the distributor allows us to
restore the HW to a state that shows a pending interrupt at the guest
level, while ensuring that the interrupt doesn't fire at the host level.

As for the "specificity", this is how the architecture has been
designed, and the way we're expected to deal with this kind of shared
HW. Rest assured I didn't come up with that on my own! ;-)

> 
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index c6604f2..495ac7d 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1120,6 +1120,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>  	if (!vgic_irq_is_edge(vcpu, irq))
>>  		vlr.state |= LR_EOI_INT;
>>  
>> +	if (vlr.irq >= VGIC_NR_SGIS) {
>> +		struct irq_phys_map *map;
>> +		map = vgic_irq_map_search(vcpu, irq);
>> +
>> +		if (map) {
>> +			int ret;
>> +
>> +			BUG_ON(!map->active);
>> +			vlr.hwirq = map->phys_irq;
>> +			vlr.state |= LR_HW;
>> +			vlr.state &= ~LR_EOI_INT;
>> +
>> +			ret = irq_set_irqchip_state(map->irq,
>> +						    IRQCHIP_STATE_ACTIVE,
>> +						    true);
>> +			vgic_irq_set_queued(vcpu, irq);
>
> queued state was used for level sensitive IRQs only. Forwarded or "HW"
> IRQs theoretically can be edge or sensitive, right? If yes may be worth
> to justify the usage of queued state for forwarded IRQ? Also

That's because it is illegal to set a HW interrupt to be PENDING+ACTIVE,
which means we have to prevent the interrupt from being injected multiple
times. The behaviour is sufficiently close to what we do for a level
interrupt that we use the same state.
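
As an illustration of the guard described above, a minimal sketch of a
"queued" flag that refuses a second injection until the guest has EOI-ed
the first one. All names here (struct vcpu_model, try_inject(),
guest_eoi_seen(), NR_IRQS) are hypothetical and only loosely mirror what
vgic_irq_set_queued()/vgic_irq_clear_queued() do in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_IRQS 64	/* arbitrary size for this sketch */

/* Hypothetical per-vcpu bookkeeping: one "queued" bit per interrupt. */
struct vcpu_model {
	uint64_t queued;
	int injected[NR_IRQS];	/* how many times each interrupt was injected */
};

static bool irq_is_queued(const struct vcpu_model *v, unsigned int irq)
{
	return (v->queued & (UINT64_C(1) << irq)) != 0;
}

/*
 * Inject the interrupt unless it is already queued. Returns true if the
 * injection happened, false if it was refused.
 */
static bool try_inject(struct vcpu_model *v, unsigned int irq)
{
	assert(irq < NR_IRQS);

	if (irq_is_queued(v, irq))
		return false;

	v->queued |= UINT64_C(1) << irq;
	v->injected[irq]++;
	return true;
}

/* Called once the guest has EOI-ed: the interrupt may be queued again. */
static void guest_eoi_seen(struct vcpu_model *v, unsigned int irq)
{
	assert(irq < NR_IRQS);
	v->queued &= ~(UINT64_C(1) << irq);
}
```

The same flag serves both cases: for a level interrupt it stops us from
resampling the line, and for a forwarded interrupt it keeps a second
copy out of the LRs while the first is still in flight.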

> vgic_irq_set_queued rather was called in parent vgic_queue_hwirq today.

I tried to keep the HW bit madness as localized as possible. Letting it
spread further away seems to make the code more difficult to read IMHO.

> 
>> +			WARN_ON(ret);
>> +		}
>> +	}
>> +
>>  	vgic_set_lr(vcpu, lr_nr, vlr);
>>  	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
>>  }
>> @@ -1344,6 +1364,35 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>>  	return level_pending;
>>  }
>>  
>> +/* Return 1 if HW interrupt went from active to inactive, and 0 otherwise */
>> +static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>> +{
>> +	struct irq_phys_map *map;
>> +	int ret;
>> +
>> +	if (!(vlr.state & LR_HW))
>> +		return 0;
>> +
>> +	map = vgic_irq_map_search(vcpu, vlr.irq);
>> +	BUG_ON(!map || !map->active);
>> +
>> +	ret = irq_get_irqchip_state(map->irq,
>> +				    IRQCHIP_STATE_ACTIVE,
>> +				    &map->active);
>
> Doesn't it work because the virtual timer was disabled during the world
> switch. Does it characterize all "shared" devices? Difficult for me to
> understand how much this is specific to arch timer integration?

Shared devices cannot be left running when the guest is not running
because (a) we have lost the context (the guest), and (b) we need to
give it to another guest. This is a fundamental property of this kind of
resource.

This is by no means specific to the timer, BTW. The VGIC itself is a
shared resource, and we nuke it on each exit, for the same reason. The
only difference is that we don't propagate the VGIC interrupt to a guest.

>> +
>> +	WARN_ON(ret);
>> +
>> +	if (map->active) {
>> +		ret = irq_set_irqchip_state(map->irq,
>> +					    IRQCHIP_STATE_ACTIVE,
>> +					    false);
>> +		WARN_ON(ret);
>> +		return 0;
>> +	}
>> +
>> +	return 1;
>> +}
>> +
>>  /* Sync back the VGIC state after a guest run */
>>  static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>>  {
>> @@ -1358,14 +1407,30 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>>  	elrsr = vgic_get_elrsr(vcpu);
>>  	elrsr_ptr = u64_to_bitmask(&elrsr);
>>  
>> -	/* Clear mappings for empty LRs */
>> -	for_each_set_bit(lr, elrsr_ptr, vgic->nr_lr) {
>> +	/* Deal with HW interrupts, and clear mappings for empty LRs */
>> +	for (lr = 0; lr < vgic->nr_lr; lr++) {
>>  		struct vgic_lr vlr;
>>  
>> -		if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
>> +		if (!test_bit(lr, vgic_cpu->lr_used))
>>  			continue;
>>  
>>  		vlr = vgic_get_lr(vcpu, lr);
>> +		if (vgic_sync_hwirq(vcpu, vlr)) {
>> +			/*
>> +			 * So this is a HW interrupt that the guest
>> +			 * EOI-ed. Clean the LR state and allow the
>> +			 * interrupt to be queued again.
>> +			 */
>> +			vlr.state &= ~LR_HW;
>> +			vlr.hwirq = 0;
>> +			vgic_set_lr(vcpu, lr, vlr);
>> +			vgic_irq_clear_queued(vcpu, vlr.irq);
>
> not necessarily a level sensitive IRQ?

As explained above, we have the same requirements when an interrupt is
forwarded to a guest.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

