All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] better support for bursty DMA usages
@ 2015-07-08 10:10 Pierre-Louis Bossart
  2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
                   ` (5 more replies)
  0 siblings, 6 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 10:10 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart

Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering
 w/ DSPs

1. disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to
the application pointer, rewinds not supported)
2. report max in-flight bytes to avoid problems with stale data (late wake-ups,
rewinds)

[1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html

TODO:
1. fixes and alsa-lib updates (compile-tested only for now)
2. get feedback
3. if supported, set DMA buffering based on negotiation between driver and app (capabilities vs. 
latency needs)

Pierre-Louis Bossart (4):
  ALSA: core: let low-level driver or userspace disable rewinds
  ALSA: core: add .notify callback for pcm ops
  ALSA: core: add report of max dma burst
  ALSA: hda: add default value for max_dma_burst

 include/sound/pcm.h            |  5 +++++
 include/uapi/sound/asound.h    |  6 ++++--
 sound/core/pcm_lib.c           | 19 +++++++++++++++++++
 sound/core/pcm_native.c        | 33 ++++++++++++++++++++++++++++++++-
 sound/pci/hda/hda_controller.c |  1 +
 5 files changed, 61 insertions(+), 3 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
@ 2015-07-08 10:10 ` Pierre-Louis Bossart
  2015-07-08 14:21   ` Takashi Iwai
  2015-07-11 17:06   ` Alexander E. Patrakov
  2015-07-08 10:10 ` [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops Pierre-Louis Bossart
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 10:10 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart

Add new hw_params flag to explicitly tell driver that rewinds
will never be used. This can be used by low-level driver to
optimize DMA operations and reduce power consumption.
Use this flag only when data written in ring buffer will
never be invalidated, e.g. any update of appl_ptr is final.

Caveat: there is currently no way to query capabilities without
opening a pcm stream, so applications might need to serially
open all exposed devices, check what they support by looking at
hw_params->info and close them (this is what PulseAudio does so
might not be an issue)

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/pcm.h         | 1 +
 include/uapi/sound/asound.h | 1 +
 sound/core/pcm_native.c     | 8 ++++++++
 3 files changed, 10 insertions(+)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 691e7ee..25310b7 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -370,6 +370,7 @@ struct snd_pcm_runtime {
 	unsigned int rate_num;
 	unsigned int rate_den;
 	unsigned int no_period_wakeup: 1;
+	unsigned int no_rewinds:1;
 
 	/* -- SW params -- */
 	int tstamp_mode;		/* mmap timestamp is updated */
diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
index a45be6b..b62b162 100644
--- a/include/uapi/sound/asound.h
+++ b/include/uapi/sound/asound.h
@@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t;
 #define SNDRV_PCM_HW_PARAMS_NORESAMPLE	(1<<0)	/* avoid rate resampling */
 #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER	(1<<1)	/* export buffer */
 #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP	(1<<2)	/* disable period wakeups */
+#define SNDRV_PCM_HW_PARAMS_NO_REWINDS	        (1<<3)	/* disable rewinds */
 
 struct snd_interval {
 	unsigned int min, max;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index d126c03..a70e52d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
 	runtime->no_period_wakeup =
 			(params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) &&
 			(params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
+	runtime->no_rewinds =
+			params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
 
 	bits = snd_pcm_format_physical_width(runtime->format);
 	runtime->sample_bits = bits;
@@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst
 	if (frames == 0)
 		return 0;
 
+	if (runtime->no_rewinds)
+		return 0;
+
 	snd_pcm_stream_lock_irq(substream);
 	switch (runtime->status->state) {
 	case SNDRV_PCM_STATE_PREPARED:
@@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr
 	if (frames == 0)
 		return 0;
 
+	if (runtime->no_rewinds)
+		return 0;
+
 	snd_pcm_stream_lock_irq(substream);
 	switch (runtime->status->state) {
 	case SNDRV_PCM_STATE_PREPARED:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
  2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
@ 2015-07-08 10:10 ` Pierre-Louis Bossart
  2015-07-08 14:27   ` Takashi Iwai
  2015-07-08 10:10 ` [RFC PATCH 3/4] ALSA: core: add report of max dma burst Pierre-Louis Bossart
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 10:10 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart

When appl_ptr is updated let low-level driver know.

This is only enabled when the NO_REWIND hardware flag is used,
so that the low-level driver/hardware to opportunistically pre-fetch
data.

FIXME: should we rely on .ack for this?
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/pcm.h     |  2 ++
 sound/core/pcm_native.c | 18 +++++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 25310b7..d5eff03 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -87,6 +87,8 @@ struct snd_pcm_ops {
 			     unsigned long offset);
 	int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma);
 	int (*ack)(struct snd_pcm_substream *substream);
+	int (*notify)(struct snd_pcm_substream *substream);
+	/* FIXME: what's the difference between ack and notify ? */
 };
 
 /*
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index a70e52d..dd519b8 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs
 		appl_ptr -= runtime->boundary;
 	runtime->control->appl_ptr = appl_ptr;
 	ret = frames;
+
+	if (runtime->no_rewinds && substream->ops->notify)
+		substream->ops->notify(substream);
+
  __end:
 	snd_pcm_stream_unlock_irq(substream);
 	return ret;
@@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst
 		appl_ptr -= runtime->boundary;
 	runtime->control->appl_ptr = appl_ptr;
 	ret = frames;
+
+	if (runtime->no_rewinds && substream->ops->notify)
+		substream->ops->notify(substream);
+
  __end:
 	snd_pcm_stream_unlock_irq(substream);
 	return ret;
@@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream,
 			return err;
 	}
 	snd_pcm_stream_lock_irq(substream);
-	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL))
+	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) {
+		/* FIXME: this code is used by mmap_commit, should it handle boundary
+		 *   wrap-around as done for read/write in pcm_lib.c
+		 */
+
 		control->appl_ptr = sync_ptr.c.control.appl_ptr;
+		/* if supported, let low-level driver know about appl_ptr change */
+		if (runtime->no_rewinds && substream->ops->notify)
+			substream->ops->notify(substream);
+	}
 	else
 		sync_ptr.c.control.appl_ptr = control->appl_ptr;
 	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
  2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
  2015-07-08 10:10 ` [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops Pierre-Louis Bossart
@ 2015-07-08 10:10 ` Pierre-Louis Bossart
  2015-07-08 14:37   ` Takashi Iwai
  2015-07-08 10:10 ` [RFC PATCH 4/4] ALSA: hda: add default value for max_dma_burst Pierre-Louis Bossart
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 10:10 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart

Report how many bytes the DMA can burst before updating the hw_ptr.
This can help fix two issues with stale data discussed many times over
on the alsa-devel mailing list (refilling/reading ring buffer too late or
rewinding too close to the hw_ptr)

FIXME:
1. this was copy/pasted/edited based on the fifo_size fields, not
sure why we would need this IOCTL1
2. we also need the ability to set the actual fifo size, if suppported,
by the hardware, to negociate the prefetch amount between application
and driver. By default the default should be max buffering to allow
for lower power, but for low-latency use cases we may want to reduce
the amount of prefetching.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/pcm.h         |  2 ++
 include/uapi/sound/asound.h |  5 +++--
 sound/core/pcm_lib.c        | 19 +++++++++++++++++++
 sound/core/pcm_native.c     |  7 +++++++
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index d5eff03..57c0571 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -56,6 +56,7 @@ struct snd_pcm_hardware {
 	unsigned int periods_min;	/* min # of periods */
 	unsigned int periods_max;	/* max # of periods */
 	size_t fifo_size;		/* fifo size in bytes */
+	unsigned int max_dma_burst;     /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */
 };
 
 struct snd_pcm_substream;
@@ -106,6 +107,7 @@ struct snd_pcm_ops {
 #define SNDRV_PCM_IOCTL1_CHANNEL_INFO	2
 #define SNDRV_PCM_IOCTL1_GSTATE		3
 #define SNDRV_PCM_IOCTL1_FIFO_SIZE	4
+#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST	5
 
 #define SNDRV_PCM_TRIGGER_STOP		0
 #define SNDRV_PCM_TRIGGER_START		1
diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
index b62b162..3dc049a 100644
--- a/include/uapi/sound/asound.h
+++ b/include/uapi/sound/asound.h
@@ -386,8 +386,9 @@ struct snd_pcm_hw_params {
 	unsigned int msbits;		/* R: used most significant bits */
 	unsigned int rate_num;		/* R: rate numerator */
 	unsigned int rate_den;		/* R: rate denominator */
-	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames */
-	unsigned char reserved[64];	/* reserved for future */
+	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames, indicates buffering after hw_ptr */
+	unsigned int max_dma_burst;     /* R: max in-flight bytes, indicates buffering before hw_ptr */
+	unsigned char reserved[60];	/* reserved for future */
 };
 
 enum {
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 7d45645..dc89aa0 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream,
 	return 0;
 }
 
+static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream,
+					      void *arg)
+{
+	struct snd_pcm_hw_params *params = arg;
+
+	/* FIXME: add sanity checks:
+	 * max burst < ring buffer
+	 * max burst >= min_period_size
+	 * not sure if we can check against period sizes?
+	 * any others ?
+	 */
+	params->max_dma_burst = substream->runtime->hw.max_dma_burst;
+
+	return 0;
+}
+
 /**
  * snd_pcm_lib_ioctl - a generic PCM ioctl callback
  * @substream: the pcm substream instance
@@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream,
 		return snd_pcm_lib_ioctl_channel_info(substream, arg);
 	case SNDRV_PCM_IOCTL1_FIFO_SIZE:
 		return snd_pcm_lib_ioctl_fifo_size(substream, arg);
+	case SNDRV_PCM_IOCTL1_MAX_DMA_BURST:
+		return snd_pcm_lib_ioctl_max_dma_burst(substream, arg);
+
 	}
 	return -ENXIO;
 }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index dd519b8..3d379b8 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
 
 	params->info = 0;
 	params->fifo_size = 0;
+	params->max_dma_burst = 0;
 	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS))
 		params->msbits = 0;
 	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) {
@@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
 				return changed;
 		}
 	}
+	if (!params->max_dma_burst) {
+		changed = substream->ops->ioctl(substream,
+						SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params);
+		if (changed < 0)
+			return changed;
+	}
 	params->rmask = 0;
 	return 0;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH 4/4] ALSA: hda: add default value for max_dma_burst
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
                   ` (2 preceding siblings ...)
  2015-07-08 10:10 ` [RFC PATCH 3/4] ALSA: core: add report of max dma burst Pierre-Louis Bossart
@ 2015-07-08 10:10 ` Pierre-Louis Bossart
  2015-07-08 14:31 ` [RFC PATCH 0/4] better support for bursty DMA usages Takashi Iwai
  2015-07-15 10:14 ` Raymond Yau
  5 siblings, 0 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 10:10 UTC (permalink / raw)
  To: alsa-devel; +Cc: Pierre-Louis Bossart

set value to 128 bytes for legacy HDAudio, can be overridden
as needed (on a per-stream basis) for enhanced hardware with
more buffering capabilities

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/pci/hda/hda_controller.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/hda_controller.c b/sound/pci/hda/hda_controller.c
index 9444559..958c8f1 100644
--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -391,6 +391,7 @@ static struct snd_pcm_hardware azx_pcm_hw = {
 	.periods_min =		2,
 	.periods_max =		AZX_MAX_FRAG,
 	.fifo_size =		0,
+	.max_dma_burst = 128 /* default value for all legacy HDAudio controllers, override as needed */
 };
 
 static int azx_pcm_open(struct snd_pcm_substream *substream)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
@ 2015-07-08 14:21   ` Takashi Iwai
  2015-07-08 16:58     ` Pierre-Louis Bossart
  2015-07-11 17:06   ` Alexander E. Patrakov
  1 sibling, 1 reply; 34+ messages in thread
From: Takashi Iwai @ 2015-07-08 14:21 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel

On Wed, 08 Jul 2015 12:10:33 +0200,
Pierre-Louis Bossart wrote:
> 
> Add new hw_params flag to explicitly tell driver that rewinds
> will never be used. This can be used by low-level driver to
> optimize DMA operations and reduce power consumption.
> Use this flag only when data written in ring buffer will
> never be invalidated, e.g. any update of appl_ptr is final.
> 
> Caveat: there is currently no way to query capabilities without
> opening a pcm stream, so applications might need to serially
> open all exposed devices, check what they support by looking at
> hw_params->info and close them (this is what PulseAudio does so
> might not be an issue)

The forward should also fail with such hardware, no?


Takashi

> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  include/sound/pcm.h         | 1 +
>  include/uapi/sound/asound.h | 1 +
>  sound/core/pcm_native.c     | 8 ++++++++
>  3 files changed, 10 insertions(+)
> 
> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
> index 691e7ee..25310b7 100644
> --- a/include/sound/pcm.h
> +++ b/include/sound/pcm.h
> @@ -370,6 +370,7 @@ struct snd_pcm_runtime {
>  	unsigned int rate_num;
>  	unsigned int rate_den;
>  	unsigned int no_period_wakeup: 1;
> +	unsigned int no_rewinds:1;
>  
>  	/* -- SW params -- */
>  	int tstamp_mode;		/* mmap timestamp is updated */
> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
> index a45be6b..b62b162 100644
> --- a/include/uapi/sound/asound.h
> +++ b/include/uapi/sound/asound.h
> @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t;
>  #define SNDRV_PCM_HW_PARAMS_NORESAMPLE	(1<<0)	/* avoid rate resampling */
>  #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER	(1<<1)	/* export buffer */
>  #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP	(1<<2)	/* disable period wakeups */
> +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS	        (1<<3)	/* disable rewinds */
>  
>  struct snd_interval {
>  	unsigned int min, max;
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index d126c03..a70e52d 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
>  	runtime->no_period_wakeup =
>  			(params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) &&
>  			(params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
> +	runtime->no_rewinds =
> +			params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
>  
>  	bits = snd_pcm_format_physical_width(runtime->format);
>  	runtime->sample_bits = bits;
> @@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst
>  	if (frames == 0)
>  		return 0;
>  
> +	if (runtime->no_rewinds)
> +		return 0;
> +
>  	snd_pcm_stream_lock_irq(substream);
>  	switch (runtime->status->state) {
>  	case SNDRV_PCM_STATE_PREPARED:
> @@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr
>  	if (frames == 0)
>  		return 0;
>  
> +	if (runtime->no_rewinds)
> +		return 0;
> +
>  	snd_pcm_stream_lock_irq(substream);
>  	switch (runtime->status->state) {
>  	case SNDRV_PCM_STATE_PREPARED:
> -- 
> 1.9.1
> 
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-08 10:10 ` [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops Pierre-Louis Bossart
@ 2015-07-08 14:27   ` Takashi Iwai
  2015-07-08 17:10     ` Pierre-Louis Bossart
  2015-07-09  7:25     ` Raymond Yau
  0 siblings, 2 replies; 34+ messages in thread
From: Takashi Iwai @ 2015-07-08 14:27 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel

On Wed, 08 Jul 2015 12:10:34 +0200,
Pierre-Louis Bossart wrote:
> 
> When appl_ptr is updated let low-level driver know.
> 
> This is only enabled when the NO_REWIND hardware flag is used,
> so that the low-level driver/hardware to opportunistically pre-fetch
> data.
> 
> FIXME: should we rely on .ack for this?
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Hmm, OK, so the forward is allowed but with workarounds...
But then why rewind won't work in a similar way?  DSP might be able to
cancel some of inflight data.

In other words, I see no reason to strict notify callback only for
no_rewinds.  This is an optional ops in anyway.

Also, I find the name "notify" a bit too ambiguous.  In this case,
it's notifying the applptr change.  So, a name related with the
function would be more understandable.


thanks,

Takashi

> ---
>  include/sound/pcm.h     |  2 ++
>  sound/core/pcm_native.c | 18 +++++++++++++++++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
> index 25310b7..d5eff03 100644
> --- a/include/sound/pcm.h
> +++ b/include/sound/pcm.h
> @@ -87,6 +87,8 @@ struct snd_pcm_ops {
>  			     unsigned long offset);
>  	int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma);
>  	int (*ack)(struct snd_pcm_substream *substream);
> +	int (*notify)(struct snd_pcm_substream *substream);
> +	/* FIXME: what's the difference between ack and notify ? */
>  };
>  
>  /*
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index a70e52d..dd519b8 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs
>  		appl_ptr -= runtime->boundary;
>  	runtime->control->appl_ptr = appl_ptr;
>  	ret = frames;
> +
> +	if (runtime->no_rewinds && substream->ops->notify)
> +		substream->ops->notify(substream);
> +
>   __end:
>  	snd_pcm_stream_unlock_irq(substream);
>  	return ret;
> @@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst
>  		appl_ptr -= runtime->boundary;
>  	runtime->control->appl_ptr = appl_ptr;
>  	ret = frames;
> +
> +	if (runtime->no_rewinds && substream->ops->notify)
> +		substream->ops->notify(substream);
> +
>   __end:
>  	snd_pcm_stream_unlock_irq(substream);
>  	return ret;
> @@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream,
>  			return err;
>  	}
>  	snd_pcm_stream_lock_irq(substream);
> -	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL))
> +	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) {
> +		/* FIXME: this code is used by mmap_commit, should it handle boundary
> +		 *   wrap-around as done for read/write in pcm_lib.c
> +		 */
> +
>  		control->appl_ptr = sync_ptr.c.control.appl_ptr;
> +		/* if supported, let low-level driver know about appl_ptr change */
> +		if (runtime->no_rewinds && substream->ops->notify)
> +			substream->ops->notify(substream);
> +	}
>  	else
>  		sync_ptr.c.control.appl_ptr = control->appl_ptr;
>  	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
> -- 
> 1.9.1
> 
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 0/4] better support for bursty DMA usages
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
                   ` (3 preceding siblings ...)
  2015-07-08 10:10 ` [RFC PATCH 4/4] ALSA: hda: add default value for max_dma_burst Pierre-Louis Bossart
@ 2015-07-08 14:31 ` Takashi Iwai
  2015-07-08 17:50   ` Pierre-Louis Bossart
  2015-07-15 10:14 ` Raymond Yau
  5 siblings, 1 reply; 34+ messages in thread
From: Takashi Iwai @ 2015-07-08 14:31 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel

On Wed, 08 Jul 2015 12:10:32 +0200,
Pierre-Louis Bossart wrote:
> 
> Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering
>  w/ DSPs
> 
> 1. disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to
> the application pointer, rewinds not supported)
> 2. report max in-flight bytes to avoid problems with stale data (late wake-ups,
> rewinds)
> 
> [1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html
> 
> TODO:
> 1. fixes and alsa-lib updates (compile-tested only for now)
> 2. get feedback
> 3. if supported, set DMA buffering based on negotiation between driver and app (capabilities vs. 
> latency needs)

Thanks for heading up.  I wanted to start working on this by myself,
too, but it was pending due to the horribly hot summer days here :)

The new flag and the callback are fairly similar as what I had in my
mind.  They look simple enough (although details need more discussion
and evaulation). 

For the DMA burst thing, I'm not quite sure whether it's the best
form, including its naming.  But I'd like to be neutral about this,
and hopefully others will give opinion for this or give alternatives.


Takashi


> Pierre-Louis Bossart (4):
>   ALSA: core: let low-level driver or userspace disable rewinds
>   ALSA: core: add .notify callback for pcm ops
>   ALSA: core: add report of max dma burst
>   ALSA: hda: add default value for max_dma_burst
> 
>  include/sound/pcm.h            |  5 +++++
>  include/uapi/sound/asound.h    |  6 ++++--
>  sound/core/pcm_lib.c           | 19 +++++++++++++++++++
>  sound/core/pcm_native.c        | 33 ++++++++++++++++++++++++++++++++-
>  sound/pci/hda/hda_controller.c |  1 +
>  5 files changed, 61 insertions(+), 3 deletions(-)
> 
> -- 
> 1.9.1
> 
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-08 10:10 ` [RFC PATCH 3/4] ALSA: core: add report of max dma burst Pierre-Louis Bossart
@ 2015-07-08 14:37   ` Takashi Iwai
  2015-07-08 17:46     ` Pierre-Louis Bossart
  2015-07-11 17:46     ` Alexander E. Patrakov
  0 siblings, 2 replies; 34+ messages in thread
From: Takashi Iwai @ 2015-07-08 14:37 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel

On Wed, 08 Jul 2015 12:10:35 +0200,
Pierre-Louis Bossart wrote:
> 
> Report how many bytes the DMA can burst before updating the hw_ptr.
> This can help fix two issues with stale data discussed many times over
> on the alsa-devel mailing list (refilling/reading ring buffer too late or
> rewinding too close to the hw_ptr)
> 
> FIXME:
> 1. this was copy/pasted/edited based on the fifo_size fields, not
> sure why we would need this IOCTL1

fifo_size would fit, but it means that it also changes the current
behavior.  I don't believe that currently there are many programs
relying on this value, but who knows.

> 2. we also need the ability to set the actual fifo size, if suppported,
> by the hardware, to negociate the prefetch amount between application
> and driver. By default the default should be max buffering to allow
> for lower power, but for low-latency use cases we may want to reduce
> the amount of prefetching.

Right.  So a hw_parmas field looks suitable for that purpose.


Takashi


> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  include/sound/pcm.h         |  2 ++
>  include/uapi/sound/asound.h |  5 +++--
>  sound/core/pcm_lib.c        | 19 +++++++++++++++++++
>  sound/core/pcm_native.c     |  7 +++++++
>  4 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
> index d5eff03..57c0571 100644
> --- a/include/sound/pcm.h
> +++ b/include/sound/pcm.h
> @@ -56,6 +56,7 @@ struct snd_pcm_hardware {
>  	unsigned int periods_min;	/* min # of periods */
>  	unsigned int periods_max;	/* max # of periods */
>  	size_t fifo_size;		/* fifo size in bytes */
> +	unsigned int max_dma_burst;     /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */
>  };
>  
>  struct snd_pcm_substream;
> @@ -106,6 +107,7 @@ struct snd_pcm_ops {
>  #define SNDRV_PCM_IOCTL1_CHANNEL_INFO	2
>  #define SNDRV_PCM_IOCTL1_GSTATE		3
>  #define SNDRV_PCM_IOCTL1_FIFO_SIZE	4
> +#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST	5
>  
>  #define SNDRV_PCM_TRIGGER_STOP		0
>  #define SNDRV_PCM_TRIGGER_START		1
> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
> index b62b162..3dc049a 100644
> --- a/include/uapi/sound/asound.h
> +++ b/include/uapi/sound/asound.h
> @@ -386,8 +386,9 @@ struct snd_pcm_hw_params {
>  	unsigned int msbits;		/* R: used most significant bits */
>  	unsigned int rate_num;		/* R: rate numerator */
>  	unsigned int rate_den;		/* R: rate denominator */
> -	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames */
> -	unsigned char reserved[64];	/* reserved for future */
> +	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames, indicates buffering after hw_ptr */
> +	unsigned int max_dma_burst;     /* R: max in-flight bytes, indicates buffering before hw_ptr */
> +	unsigned char reserved[60];	/* reserved for future */
>  };
>  
>  enum {
> diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
> index 7d45645..dc89aa0 100644
> --- a/sound/core/pcm_lib.c
> +++ b/sound/core/pcm_lib.c
> @@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream,
>  	return 0;
>  }
>  
> +static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream,
> +					      void *arg)
> +{
> +	struct snd_pcm_hw_params *params = arg;
> +
> +	/* FIXME: add sanity checks:
> +	 * max burst < ring buffer
> +	 * max burst >= min_period_size
> +	 * not sure if we can check against period sizes?
> +	 * any others ?
> +	 */
> +	params->max_dma_burst = substream->runtime->hw.max_dma_burst;
> +
> +	return 0;
> +}
> +
>  /**
>   * snd_pcm_lib_ioctl - a generic PCM ioctl callback
>   * @substream: the pcm substream instance
> @@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream,
>  		return snd_pcm_lib_ioctl_channel_info(substream, arg);
>  	case SNDRV_PCM_IOCTL1_FIFO_SIZE:
>  		return snd_pcm_lib_ioctl_fifo_size(substream, arg);
> +	case SNDRV_PCM_IOCTL1_MAX_DMA_BURST:
> +		return snd_pcm_lib_ioctl_max_dma_burst(substream, arg);
> +
>  	}
>  	return -ENXIO;
>  }
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index dd519b8..3d379b8 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
>  
>  	params->info = 0;
>  	params->fifo_size = 0;
> +	params->max_dma_burst = 0;
>  	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS))
>  		params->msbits = 0;
>  	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) {
> @@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
>  				return changed;
>  		}
>  	}
> +	if (!params->max_dma_burst) {
> +		changed = substream->ops->ioctl(substream,
> +						SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params);
> +		if (changed < 0)
> +			return changed;
> +	}
>  	params->rmask = 0;
>  	return 0;
>  }
> -- 
> 1.9.1
> 
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-08 14:21   ` Takashi Iwai
@ 2015-07-08 16:58     ` Pierre-Louis Bossart
  0 siblings, 0 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 16:58 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


>> Add new hw_params flag to explicitly tell driver that rewinds
>> will never be used. This can be used by low-level driver to
>> optimize DMA operations and reduce power consumption.
>> Use this flag only when data written in ring buffer will
>> never be invalidated, e.g. any update of appl_ptr is final.
>>
>> Caveat: there is currently no way to query capabilities without
>> opening a pcm stream, so applications might need to serially
>> open all exposed devices, check what they support by looking at
>> hw_params->info and close them (this is what PulseAudio does so
>> might not be an issue)
>
> The forward should also fail with such hardware, no?

The way I understand the forward is just an update of the application 
pointer without writing/reading new data, so that works. The hardware 
will just use whatever is in the ring buffer. It's weird but it would work.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-08 14:27   ` Takashi Iwai
@ 2015-07-08 17:10     ` Pierre-Louis Bossart
  2015-07-09 14:44       ` Takashi Iwai
  2015-07-09  7:25     ` Raymond Yau
  1 sibling, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 17:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


>> This is only enabled when the NO_REWIND hardware flag is used,
>> so that the low-level driver/hardware to opportunistically pre-fetch
>> data.
>>
>> FIXME: should we rely on .ack for this?
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>
> Hmm, OK, so the forward is allowed but with workarounds...
> But then why rewind won't work in a similar way?  DSP might be able to
> cancel some of inflight data.

Nope, this is explicitly not supported, so unfortunately if we want to 
optimize for power and let hardware fetch data when it's most 
appropriate rewinds need to be disabled.

> In other words, I see no reason to strict notify callback only for
> no_rewinds.  This is an optional ops in anyway.

It's fine to remove the check. I added this based on internal review 
comments but I agree with your point.

> Also, I find the name "notify" a bit too ambiguous.  In this case,
> it's notifying the applptr change.  So, a name related with the
> function would be more understandable.

The first open I had was to know if we could use .ack for this? if a 
different callback is needed, we can use 'appl_ptr_update' instead of 
'notify'

>
>
> thanks,
>
> Takashi
>
>> ---
>>   include/sound/pcm.h     |  2 ++
>>   sound/core/pcm_native.c | 18 +++++++++++++++++-
>>   2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
>> index 25310b7..d5eff03 100644
>> --- a/include/sound/pcm.h
>> +++ b/include/sound/pcm.h
>> @@ -87,6 +87,8 @@ struct snd_pcm_ops {
>>   			     unsigned long offset);
>>   	int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma);
>>   	int (*ack)(struct snd_pcm_substream *substream);
>> +	int (*notify)(struct snd_pcm_substream *substream);
>> +	/* FIXME: what's the difference between ack and notify ? */
>>   };
>>
>>   /*
>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
>> index a70e52d..dd519b8 100644
>> --- a/sound/core/pcm_native.c
>> +++ b/sound/core/pcm_native.c
>> @@ -2565,6 +2565,10 @@ static snd_pcm_sframes_t snd_pcm_playback_forward(struct snd_pcm_substream *subs
>>   		appl_ptr -= runtime->boundary;
>>   	runtime->control->appl_ptr = appl_ptr;
>>   	ret = frames;
>> +
>> +	if (runtime->no_rewinds && substream->ops->notify)
>> +		substream->ops->notify(substream);
>> +
>>    __end:
>>   	snd_pcm_stream_unlock_irq(substream);
>>   	return ret;
>> @@ -2614,6 +2618,10 @@ static snd_pcm_sframes_t snd_pcm_capture_forward(struct snd_pcm_substream *subst
>>   		appl_ptr -= runtime->boundary;
>>   	runtime->control->appl_ptr = appl_ptr;
>>   	ret = frames;
>> +
>> +	if (runtime->no_rewinds && substream->ops->notify)
>> +		substream->ops->notify(substream);
>> +
>>    __end:
>>   	snd_pcm_stream_unlock_irq(substream);
>>   	return ret;
>> @@ -2713,8 +2721,16 @@ static int snd_pcm_sync_ptr(struct snd_pcm_substream *substream,
>>   			return err;
>>   	}
>>   	snd_pcm_stream_lock_irq(substream);
>> -	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL))
>> +	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_APPL)) {
>> +		/* FIXME: this code is used by mmap_commit, should it handle boundary
>> +		 *   wrap-around as done for read/write in pcm_lib.c
>> +		 */
>> +
>>   		control->appl_ptr = sync_ptr.c.control.appl_ptr;
>> +		/* if supported, let low-level driver know about appl_ptr change */
>> +		if (runtime->no_rewinds && substream->ops->notify)
>> +			substream->ops->notify(substream);
>> +	}
>>   	else
>>   		sync_ptr.c.control.appl_ptr = control->appl_ptr;
>>   	if (!(sync_ptr.flags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Alsa-devel mailing list
>> Alsa-devel@alsa-project.org
>> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>>
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-08 14:37   ` Takashi Iwai
@ 2015-07-08 17:46     ` Pierre-Louis Bossart
  2015-07-10  2:35       ` Raymond Yau
  2015-07-11 17:46     ` Alexander E. Patrakov
  1 sibling, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 17:46 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


>> FIXME:
>> 1. this was copy/pasted/edited based on the fifo_size fields, not
>> sure why we would need this IOCTL1
>
> fifo_size would fit, but it means that it also changes the current
> behavior.  I don't believe that currently there are many programs
> relying on this value, but who knows.

I saw a mention of fifo_size in the VIA HDA controller, that's why I 
went ahead with a different field. The fifo_size could be useful as a 
max value for the internal hardware delay, when app use the 'delay' 
field in the status structure they get a very dynamic value that isn't 
necessarily straightforward to use.

>
>> 2. we also need the ability to set the actual fifo size, if suppported,
>> by the hardware, to negociate the prefetch amount between application
>> and driver. By default the default should be max buffering to allow
>> for lower power, but for low-latency use cases we may want to reduce
>> the amount of prefetching.
>
> Right.  So a hw_parmas field looks suitable for that purpose.

That 'set' capability is a lot more complicated, I am really having a 
hard time with all the constraints for hw_params and how to represent 
min,max and step values... If anyone is willing to help on that part I 
wouldn't mind, this is really a part of ALSA I never looked into...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 0/4] better support for bursty DMA usages
  2015-07-08 14:31 ` [RFC PATCH 0/4] better support for bursty DMA usages Takashi Iwai
@ 2015-07-08 17:50   ` Pierre-Louis Bossart
  0 siblings, 0 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-08 17:50 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On 7/8/15 4:31 PM, Takashi Iwai wrote:
> On Wed, 08 Jul 2015 12:10:32 +0200,
> Pierre-Louis Bossart wrote:
>>
>> Set of patches to fix issues with hw_ptr fuzziness [1] and increased buffering
>>   w/ DSPs
>>
>> 1. disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch up to
>> the application pointer, rewinds not supported)
>> 2. report max in-flight bytes to avoid problems with stale data (late wake-ups,
>> rewinds)
>>
>> [1] http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html
>>
>> TODO:
>> 1. fixes and alsa-lib updates (compile-tested only for now)
>> 2. get feedback
>> 3. if supported, set DMA buffering based on negotiation between driver and app (capabilities vs.
>> latency needs)
>
> Thanks for heading up.  I wanted to start working on this by myself,
> too, but it was pending due to the horribly hot summer days here :)

Thanks for the quick review Takashi, much appreciated. The work was done 
before the heat wave on my side...

> The new flag and the callback are fairly similar as what I had in my
> mind.  They look simple enough (although details need more discussion
> and evaulation).
>
> For the DMA burst thing, I'm not quite sure whether it's the best
> form, including its naming.  But I'd like to be neutral about this,
> and hopefully others will give opinion for this or give alternatives.

Yes, this is really a first draft to try and solve the problems 
mentioned multiple times by Arun and Alexander and help use the latest 
and greatest hardware. If others have comments we are all ears.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-08 14:27   ` Takashi Iwai
  2015-07-08 17:10     ` Pierre-Louis Bossart
@ 2015-07-09  7:25     ` Raymond Yau
  2015-07-09  7:35       ` Pierre-Louis Bossart
  1 sibling, 1 reply; 34+ messages in thread
From: Raymond Yau @ 2015-07-09  7:25 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: ALSA Development Mailing List, Pierre-Louis Bossart

> >
> > When appl_ptr is updated let low-level driver know.
> >
> > This is only enabled when the NO_REWIND hardware flag is used,
> > so that the low-level driver/hardware to opportunistically pre-fetch
> > data.
> >
> > FIXME: should we rely on .ack for this?
> > Signed-off-by: Pierre-Louis Bossart <
pierre-louis.bossart@linux.intel.com>
>
> Hmm, OK, so the forward is allowed but with workarounds...
> But then why rewind won't work in a similar way?  DSP might be able to
> cancel some of inflight data.
>
> In other words, I see no reason to strict notify callback only for
> no_rewinds.  This is an optional ops in anyway.
>
> Also, I find the name "notify" a bit too ambiguous.  In this case,
> it's notifying the applptr change.  So, a name related with the
> function would be more understandable.
>
>

If driver specify no rewind flag, should alsa lib

1) return error when application call snd_pcm_rewind() and
snd_pcm_forward() ?
2) return zero when call snd_pcm_rewindable() and snd_pcm_forwardable()

How can the application recover when hw_ptr is behind appl_ptr when stop
threshold is set to boundary ?

Do you mean compressed audio stream don't support rewind and forward ?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-09  7:25     ` Raymond Yau
@ 2015-07-09  7:35       ` Pierre-Louis Bossart
  0 siblings, 0 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-09  7:35 UTC (permalink / raw)
  To: Raymond Yau, Takashi Iwai; +Cc: ALSA Development Mailing List

On 7/9/15 9:25 AM, Raymond Yau wrote:
>>>
>>> When appl_ptr is updated let low-level driver know.
>>>
>>> This is only enabled when the NO_REWIND hardware flag is used,
>>> so that the low-level driver/hardware to opportunistically pre-fetch
>>> data.
>>>
>>> FIXME: should we rely on .ack for this?
>>> Signed-off-by: Pierre-Louis Bossart <
> pierre-louis.bossart@linux.intel.com>
>>
>> Hmm, OK, so the forward is allowed but with workarounds...
>> But then why rewind won't work in a similar way?  DSP might be able to
>> cancel some of inflight data.
>>
>> In other words, I see no reason to strict notify callback only for
>> no_rewinds.  This is an optional ops in anyway.
>>
>> Also, I find the name "notify" a bit too ambiguous.  In this case,
>> it's notifying the applptr change.  So, a name related with the
>> function would be more understandable.
>>
>>
>
> If driver specify no rewind flag, should alsa lib
>
> 1) return error when application call snd_pcm_rewind() and
> snd_pcm_forward() ?

no, it would return 0 on rewind and the number of frames on forward. In 
other words the value returned is consistent with the function prototype 
which has no scope for errors.

> 2) return zero when call snd_pcm_rewindable() and snd_pcm_forwardable()

again zero for rewind and the actual number for forward.

> How can the application recover when hw_ptr is behind appl_ptr when stop
> threshold is set to boundary ?

Don't understand the question, there is no change here.

> Do you mean compressed audio stream don't support rewind and forward ?

You can't rewind in a compressed bitstream in general without losing 
sync or missing state information when there are dependencies between 
frames (linear prediction or filter with history). Forward is the same.
Some encoders allow for skips to well identified markers but random 
access is not possible or desired in general.

> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops
  2015-07-08 17:10     ` Pierre-Louis Bossart
@ 2015-07-09 14:44       ` Takashi Iwai
  0 siblings, 0 replies; 34+ messages in thread
From: Takashi Iwai @ 2015-07-09 14:44 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: alsa-devel

On Wed, 08 Jul 2015 19:10:32 +0200,
Pierre-Louis Bossart wrote:
> 
> 
> >> This is only enabled when the NO_REWIND hardware flag is used,
> >> so that the low-level driver/hardware to opportunistically pre-fetch
> >> data.
> >>
> >> FIXME: should we rely on .ack for this?
> >> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> >
> > Hmm, OK, so the forward is allowed but with workarounds...
> > But then why rewind won't work in a similar way?  DSP might be able to
> > cancel some of inflight data.
> 
> Nope, this is explicitly not supported, so unfortunately if we want to 
> optimize for power and let hardware fetch data when it's most 
> appropriate rewinds need to be disabled.
> 
> > In other words, I see no reason to strict notify callback only for
> > no_rewinds.  This is an optional ops in anyway.
> 
> It's fine to remove the check. I added this based on internal review 
> comments but I agree with your point.

OK, then let's treat the NO_REWIND and new ops individually.

> > Also, I find the name "notify" a bit too ambiguous.  In this case,
> > it's notifying the applptr change.  So, a name related with the
> > function would be more understandable.
> 
> The first open I had was to know if we could use .ack for this? if a 
> different callback is needed, we can use 'appl_ptr_update' instead of 
> 'notify'

As there are no many users of ack callback, I don't mind to reuse it.
But then we need to extend the function to receive a new argument
indicating the type of ack, right?


thanks,

Takashi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-08 17:46     ` Pierre-Louis Bossart
@ 2015-07-10  2:35       ` Raymond Yau
  2015-07-10 17:13         ` Lars-Peter Clausen
  0 siblings, 1 reply; 34+ messages in thread
From: Raymond Yau @ 2015-07-10  2:35 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: Takashi Iwai, ALSA Development Mailing List

>
>
>>> FIXME:
>>> 1. this was copy/pasted/edited based on the fifo_size fields, not
>>> sure why we would need this IOCTL1
>>
>>
>> fifo_size would fit, but it means that it also changes the current
>> behavior.  I don't believe that currently there are many programs
>> relying on this value, but who knows.
>
>
> I saw a mention of fifo_size in the VIA HDA controller, that's why I went
ahead with a different field. The fifo_size could be useful as a max value
for the internal hardware delay, when app use the 'delay' field in the
status structure they get a very dynamic value that isn't necessarily
straightforward to use.
>
>
>>
>>> 2. we also need the ability to set the actual fifo size, if suppported,
>>> by the hardware, to negociate the prefetch amount between application
>>> and driver. By default the default should be max buffering to allow
>>> for lower power, but for low-latency use cases we may want to reduce
>>> the amount of prefetching.
>>
>>
>> Right.  So a hw_parmas field looks suitable for that purpose.
>
>
> That 'set' capability is a lot more complicated, I am really having a
hard time with all the constraints for hw_params and how to represent
min,max and step values... If anyone is willing to help on that part I
wouldn't mind, this is really a part of ALSA I never looked into...

What is the usage of runtime->dma_bytes in
snd_pcm_hw_constraints_complete() in core/pcm_native.c ?

Was the usage similar to 128 bytes alignment ?

/* FIXME: remove */
if (runtime->dma_bytes) {
err = snd_pcm_hw_constraint_minmax(runtime,
SNDRV_PCM_HW_PARAM_BUFFER_BYTES, 0, runtime->dma_bytes);
if (err < 0)
return err;
}

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-10  2:35       ` Raymond Yau
@ 2015-07-10 17:13         ` Lars-Peter Clausen
  0 siblings, 0 replies; 34+ messages in thread
From: Lars-Peter Clausen @ 2015-07-10 17:13 UTC (permalink / raw)
  To: Raymond Yau, Pierre-Louis Bossart
  Cc: Takashi Iwai, ALSA Development Mailing List

On 07/10/2015 04:35 AM, Raymond Yau wrote:
[...]
> What is the usage of runtime->dma_bytes in
> snd_pcm_hw_constraints_complete() in core/pcm_native.c ?
>
> Was the usage similar to 128 bytes alignment ?
>
> /* FIXME: remove */
> if (runtime->dma_bytes) {
> err = snd_pcm_hw_constraint_minmax(runtime,
> SNDRV_PCM_HW_PARAM_BUFFER_BYTES, 0, runtime->dma_bytes);
> if (err < 0)
> return err;
> }

The FIXME is probably correct. dma_bytes should always be 0 at that point. 
This is a freshly allocated runtime struct and dma_bytes only gets set once 
a buffer is allocated, which is done later on.

- Lars

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
  2015-07-08 14:21   ` Takashi Iwai
@ 2015-07-11 17:06   ` Alexander E. Patrakov
  2015-07-14  3:32     ` Raymond Yau
  2015-07-28 14:19     ` Pierre-Louis Bossart
  1 sibling, 2 replies; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-11 17:06 UTC (permalink / raw)
  To: alsa-devel; +Cc: Takashi Iwai, Pierre-Louis Bossart

08.07.2015 15:10, Pierre-Louis Bossart wrote:
> Add new hw_params flag to explicitly tell driver that rewinds
> will never be used. This can be used by low-level driver to
> optimize DMA operations and reduce power consumption.
> Use this flag only when data written in ring buffer will
> never be invalidated, e.g. any update of appl_ptr is final.
>
> Caveat: there is currently no way to query capabilities without
> opening a pcm stream, so applications might need to serially
> open all exposed devices, check what they support by looking at
> hw_params->info and close them (this is what PulseAudio does so
> might not be an issue)

Thanks for the patch. I was also about to add this flag, for the use in 
userspace. It's good to know that it is also useful in the kernel space.

I usually don't comment on kernel patches, but this one provokes some 
questions and thoughts.

1. What amount of power-saving are we talking about, for Intel chips? 
(ideally, this should be in the commit message)

2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also 
made incompatible with this flag, because, when mixing, it essentially 
overwrites incompletely-mixed data, i.e. rewinds without saying the word 
"rewind"? Shouldn't all other kinds of freewheeling be made incompatible 
with this flag, because the card, essentially, is never told about the 
application pointer? Or is this a userspace-only concern?

3. I have not seen any justification for the drastic measure of making a 
DMA-based device completely unrewindable. Maybe a more polite "please 
make this a batch/blocktransfer card" request, thus disallowing only 
sub-period rewinds, would still be useful for powersaving, without 
killing dmix.

4. If this "no rewinds" mode is not made the default, then exactly 
nobody will use it. Everyone except sound servers opens the default 
device with the default flags. I understand the potential to break 
existing userspace, especially PulseAudio, but we really need to think 
more here.

So, here is an alternative (or maybe additional - your flag does make 
sense, after all) hackish idea that should be compatible with the 
existing userspace.

Please make it possible to "allow", "put in auto mode" or "disallow" the 
device to DMA up to one period in advance. Please use the following 
logic when in "auto mode":

If the card is batch or blocktransfer "by itself", let it be. We can't 
do anything here, the DMA operations are already "optimized" whether we 
want it or not.

If the card does not support disabling the period wakeup, then FIXME - I 
don't know what to do. For compatibility, I think we can't optimize DMA 
operations here.

If the card supports disabling the period wakeup, and it is disabled, 
then don't allow it to optimize DMA operations.

If the card supports disabling the period wakeup, but it is not 
disabled, then allow it to optimize DMA operations.

As you have probably guessed, the idea here is to detect typical setup 
made by PulseAudio. I suspect that this also affects CRAS.

Also maybe this idea by David Henningsson is good:

"""
E g, if the application sets a low period size
but also sets the "disable period interrupts" flag, that could be an
indicator that it wants lots of interrupts just to update the pointer,
but nothing else. Maybe that's even the behaviour today (haven't checked).
"""

(taken from 
http://mailman.alsa-project.org/pipermail/alsa-devel/2014-September/080885.html 
)

>
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>   include/sound/pcm.h         | 1 +
>   include/uapi/sound/asound.h | 1 +
>   sound/core/pcm_native.c     | 8 ++++++++
>   3 files changed, 10 insertions(+)
>
> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
> index 691e7ee..25310b7 100644
> --- a/include/sound/pcm.h
> +++ b/include/sound/pcm.h
> @@ -370,6 +370,7 @@ struct snd_pcm_runtime {
>   	unsigned int rate_num;
>   	unsigned int rate_den;
>   	unsigned int no_period_wakeup: 1;
> +	unsigned int no_rewinds:1;
>
>   	/* -- SW params -- */
>   	int tstamp_mode;		/* mmap timestamp is updated */
> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
> index a45be6b..b62b162 100644
> --- a/include/uapi/sound/asound.h
> +++ b/include/uapi/sound/asound.h
> @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t;
>   #define SNDRV_PCM_HW_PARAMS_NORESAMPLE	(1<<0)	/* avoid rate resampling */
>   #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER	(1<<1)	/* export buffer */
>   #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP	(1<<2)	/* disable period wakeups */
> +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS	        (1<<3)	/* disable rewinds */
>
>   struct snd_interval {
>   	unsigned int min, max;
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index d126c03..a70e52d 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
>   	runtime->no_period_wakeup =
>   			(params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) &&
>   			(params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
> +	runtime->no_rewinds =
> +			params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
>
>   	bits = snd_pcm_format_physical_width(runtime->format);
>   	runtime->sample_bits = bits;
> @@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *subst
>   	if (frames == 0)
>   		return 0;
>
> +	if (runtime->no_rewinds)
> +		return 0;
> +
>   	snd_pcm_stream_lock_irq(substream);
>   	switch (runtime->status->state) {
>   	case SNDRV_PCM_STATE_PREPARED:
> @@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t snd_pcm_capture_rewind(struct snd_pcm_substream *substr
>   	if (frames == 0)
>   		return 0;
>
> +	if (runtime->no_rewinds)
> +		return 0;
> +
>   	snd_pcm_stream_lock_irq(substream);
>   	switch (runtime->status->state) {
>   	case SNDRV_PCM_STATE_PREPARED:
>


-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-08 14:37   ` Takashi Iwai
  2015-07-08 17:46     ` Pierre-Louis Bossart
@ 2015-07-11 17:46     ` Alexander E. Patrakov
  2015-07-11 18:28       ` Jaroslav Kysela
  1 sibling, 1 reply; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-11 17:46 UTC (permalink / raw)
  To: Takashi Iwai, Pierre-Louis Bossart; +Cc: alsa-devel

08.07.2015 19:37, Takashi Iwai wrote:
> On Wed, 08 Jul 2015 12:10:35 +0200,
> Pierre-Louis Bossart wrote:
>>
>> Report how many bytes the DMA can burst before updating the hw_ptr.
>> This can help fix two issues with stale data discussed many times over
>> on the alsa-devel mailing list (refilling/reading ring buffer too late or
>> rewinding too close to the hw_ptr)
>>
>> FIXME:
>> 1. this was copy/pasted/edited based on the fifo_size fields, not
>> sure why we would need this IOCTL1
>
> fifo_size would fit, but it means that it also changes the current
> behavior.  I don't believe that currently there are many programs
> relying on this value, but who knows.

Debian Code Search knows, if that matters. Result: it is used by 
bindings, wrappers and reimplementations only (lush, oss4, alsa-lib 
itself, pyglet, libtritonus-java).

https://codesearch.debian.net/results/snd_pcm_hw_params_get_fifo_size/page_0

https://codesearch.debian.net/results/snd_pcm_hw_params_get_fifo_size/page_1

>
>> 2. we also need the ability to set the actual fifo size, if suppported,
>> by the hardware, to negociate the prefetch amount between application
>> and driver. By default the default should be max buffering to allow
>> for lower power, but for low-latency use cases we may want to reduce
>> the amount of prefetching.
>
> Right.  So a hw_parmas field looks suitable for that purpose.
>
>
> Takashi
>
>
>> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
>> ---
>>   include/sound/pcm.h         |  2 ++
>>   include/uapi/sound/asound.h |  5 +++--
>>   sound/core/pcm_lib.c        | 19 +++++++++++++++++++
>>   sound/core/pcm_native.c     |  7 +++++++
>>   4 files changed, 31 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
>> index d5eff03..57c0571 100644
>> --- a/include/sound/pcm.h
>> +++ b/include/sound/pcm.h
>> @@ -56,6 +56,7 @@ struct snd_pcm_hardware {
>>   	unsigned int periods_min;	/* min # of periods */
>>   	unsigned int periods_max;	/* max # of periods */
>>   	size_t fifo_size;		/* fifo size in bytes */
>> +	unsigned int max_dma_burst;     /* max DMA in-flight bytes, linked to hw_ptr precision/fuzziness */
>>   };
>>
>>   struct snd_pcm_substream;
>> @@ -106,6 +107,7 @@ struct snd_pcm_ops {
>>   #define SNDRV_PCM_IOCTL1_CHANNEL_INFO	2
>>   #define SNDRV_PCM_IOCTL1_GSTATE		3
>>   #define SNDRV_PCM_IOCTL1_FIFO_SIZE	4
>> +#define SNDRV_PCM_IOCTL1_MAX_DMA_BURST	5
>>
>>   #define SNDRV_PCM_TRIGGER_STOP		0
>>   #define SNDRV_PCM_TRIGGER_START		1
>> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
>> index b62b162..3dc049a 100644
>> --- a/include/uapi/sound/asound.h
>> +++ b/include/uapi/sound/asound.h
>> @@ -386,8 +386,9 @@ struct snd_pcm_hw_params {
>>   	unsigned int msbits;		/* R: used most significant bits */
>>   	unsigned int rate_num;		/* R: rate numerator */
>>   	unsigned int rate_den;		/* R: rate denominator */
>> -	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames */
>> -	unsigned char reserved[64];	/* reserved for future */
>> +	snd_pcm_uframes_t fifo_size;	/* R: chip FIFO size in frames, indicates buffering after hw_ptr */
>> +	unsigned int max_dma_burst;     /* R: max in-flight bytes, indicates buffering before hw_ptr */
>> +	unsigned char reserved[60];	/* reserved for future */
>>   };
>>
>>   enum {
>> diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
>> index 7d45645..dc89aa0 100644
>> --- a/sound/core/pcm_lib.c
>> +++ b/sound/core/pcm_lib.c
>> @@ -1826,6 +1826,22 @@ static int snd_pcm_lib_ioctl_fifo_size(struct snd_pcm_substream *substream,
>>   	return 0;
>>   }
>>
>> +static int snd_pcm_lib_ioctl_max_dma_burst(struct snd_pcm_substream *substream,
>> +					      void *arg)
>> +{
>> +	struct snd_pcm_hw_params *params = arg;
>> +
>> +	/* FIXME: add sanity checks:
>> +	 * max burst < ring buffer
>> +	 * max burst >= min_period_size
>> +	 * not sure if we can check against period sizes?
>> +	 * any others ?
>> +	 */
>> +	params->max_dma_burst = substream->runtime->hw.max_dma_burst;
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * snd_pcm_lib_ioctl - a generic PCM ioctl callback
>>    * @substream: the pcm substream instance
>> @@ -1849,6 +1865,9 @@ int snd_pcm_lib_ioctl(struct snd_pcm_substream *substream,
>>   		return snd_pcm_lib_ioctl_channel_info(substream, arg);
>>   	case SNDRV_PCM_IOCTL1_FIFO_SIZE:
>>   		return snd_pcm_lib_ioctl_fifo_size(substream, arg);
>> +	case SNDRV_PCM_IOCTL1_MAX_DMA_BURST:
>> +		return snd_pcm_lib_ioctl_max_dma_burst(substream, arg);
>> +
>>   	}
>>   	return -ENXIO;
>>   }
>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
>> index dd519b8..3d379b8 100644
>> --- a/sound/core/pcm_native.c
>> +++ b/sound/core/pcm_native.c
>> @@ -280,6 +280,7 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
>>
>>   	params->info = 0;
>>   	params->fifo_size = 0;
>> +	params->max_dma_burst = 0;
>>   	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_SAMPLE_BITS))
>>   		params->msbits = 0;
>>   	if (params->rmask & (1 << SNDRV_PCM_HW_PARAM_RATE)) {
>> @@ -437,6 +438,12 @@ int snd_pcm_hw_refine(struct snd_pcm_substream *substream,
>>   				return changed;
>>   		}
>>   	}
>> +	if (!params->max_dma_burst) {
>> +		changed = substream->ops->ioctl(substream,
>> +						SNDRV_PCM_IOCTL1_MAX_DMA_BURST, params);
>> +		if (changed < 0)
>> +			return changed;
>> +	}
>>   	params->rmask = 0;
>>   	return 0;
>>   }
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Alsa-devel mailing list
>> Alsa-devel@alsa-project.org
>> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>>
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-11 17:46     ` Alexander E. Patrakov
@ 2015-07-11 18:28       ` Jaroslav Kysela
  2015-07-16  2:11         ` Raymond Yau
  0 siblings, 1 reply; 34+ messages in thread
From: Jaroslav Kysela @ 2015-07-11 18:28 UTC (permalink / raw)
  To: Alexander E. Patrakov, Takashi Iwai, Pierre-Louis Bossart; +Cc: alsa-devel


>>> +	unsigned int max_dma_burst;     /* R: max in-flight bytes, indicates buffering before hw_ptr */

I'm not sure if we should name fields like dma in the abstract ioctl
layer. The words in the comment seems more appropriate or just remove
dma and keep only max_burst.

				Thanks,
					Jaroslav

-- 
Jaroslav Kysela <perex@perex.cz>
Linux Kernel Sound Maintainer
ALSA Project; Red Hat, Inc.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-11 17:06   ` Alexander E. Patrakov
@ 2015-07-14  3:32     ` Raymond Yau
  2015-07-14 17:39       ` Alexander E. Patrakov
  2015-07-28 14:19     ` Pierre-Louis Bossart
  1 sibling, 1 reply; 34+ messages in thread
From: Raymond Yau @ 2015-07-14  3:32 UTC (permalink / raw)
  To: Alexander E. Patrakov
  Cc: Takashi Iwai, ALSA Development Mailing List, Pierre-Louis Bossart

 Add new hw_params flag to explicitly tell driver that rewinds
>> will never be used. This can be used by low-level driver to
>> optimize DMA operations and reduce power consumption.
>> Use this flag only when data written in ring buffer will
>> never be invalidated, e.g. any update of appl_ptr is final.
>>
>> Caveat: there is currently no way to query capabilities without
>> opening a pcm stream, so applications might need to serially
>> open all exposed devices, check what they support by looking at
>> hw_params->info and close them (this is what PulseAudio does so
>> might not be an issue)
>>
>
> How about spdif->status AES0_NONAUDIO is set when AC3 passthrough SPDIF ?


>
> 1. What amount of power-saving are we talking about, for Intel chips?
> (ideally, this should be in the commit message)
>

Refer to HDA043-A

Energy efficient audio buffering and dynamic FIFO limit change

3.3.11 Offset 12h: GCAP2 –Global Capabilities 2

FIFO Size Change (FIFOSC) This bit isRO

if GCAP2.EEAC = 0.

In static case (GCAP2.EEAC = 1 and GCAP2.DFFLCC = 0), this bit will only
have effect before the first time RUN bit is set. Software must not attempt
to write this bit, unless EECAP.EEAC bit is set. HW will treat this bit as
read -only if the capability is not supported
.
In dynamic case (GCAP2.EEAC = 1 and GCAP2.DFFLCC = 1), on top of supporting
the static behavior describe above, this bit can also be used to
communicate the dynamic change request in FIFO size between SW and HW when
the stream is active


3.3.41 Offset 90: {IOB}SDnFIFOS – Input/Output/Bidirectional Stream
Descriptor n FIFO Size

When HW supports EE Audio capability,the FIFOS this value will represent
the minimum size of cyclic buffer for efficient HW operation


Do application need to know driver is using static or dynamic case ?

is the minimum size of cyclic buffer for efficient HW operation larger than
current hw params (period_bytes_min * periods_min) ?



> 2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also
> made incompatible with this flag, because, when mixing, it essentially
> overwrites incompletely-mixed data, i.e. rewinds without saying the word
> "rewind"? Shouldn't all other kinds of freewheeling be made incompatible
> with this flag, because the card, essentially, is never told about the
> application pointer? Or is this a userspace-only concern?
>

if SND_PCM_APPEND use fmode = O_APPEND when snd_device_open(), this seem
support sequential write only and no random write

Do it mean pulseaudio cannot rewind dmix as dmix alway append data to hw
device ?


>
> 3. I have not seen any justification for the drastic measure of making a
> DMA-based device completely unrewindable. Maybe a more polite "please make
> this a batch/blocktransfer card" request, thus disallowing only sub-period
> rewinds, would still be useful for powersaving, without killing dmix.
>

Can those non-interleaved access pcm device with two periods support rewind
?
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-14  3:32     ` Raymond Yau
@ 2015-07-14 17:39       ` Alexander E. Patrakov
  2015-07-15  1:23         ` Raymond Yau
  0 siblings, 1 reply; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-14 17:39 UTC (permalink / raw)
  To: Raymond Yau
  Cc: Takashi Iwai, ALSA Development Mailing List, Pierre-Louis Bossart

14.07.2015 08:32, Raymond Yau wrote:
>
>
>     1. What amount of power-saving are we talking about, for Intel
>     chips? (ideally, this should be in the commit message)
>
>
> Refer to HDA043-A
>
> Energy efficient audio buffering and dynamic FIFO limit change

<deleted a long quote>

I expected an answer in the form "XXX mW out of YYY, for such-and-such 
buffer and period sizes".

>     2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be
>     also made incompatible with this flag, because, when mixing, it
>     essentially overwrites incompletely-mixed data, i.e. rewinds without
>     saying the word "rewind"? Shouldn't all other kinds of freewheeling
>     be made incompatible with this flag, because the card, essentially,
>     is never told about the application pointer? Or is this a
>     userspace-only concern?
>
>
> if SND_PCM_APPEND use fmode = O_APPEND when snd_device_open(), this seem
> support sequential write only and no random write

dmix uses mmap and never advances application pointer on the slave 
(which is a hw device). While each instance of dmix can indeed perform 
mmap-based writes sequentially, what the device sees is not sequential. 
E.g., for two periods and two applications, started simultaneously, the 
sequence is like this:

1st app writes 1st period
1st app writes 2nd period
2nd app overwrites 1st period with the mix
2nd app overwrites 2nd periodwith the mix
hw completes playing 1st period
1st app writes 1st period
2nd app overwrites 1st period with the mix
hw completes playing 2nd period
2nd app writes 2nd period
1st app overwrites 2nd period with the mix
hw completes playing 1st period

...and so on, and we have not considered xruns here, which complicate 
the picture even further. The point is: what the hardware sees is not a 
stream of sequential writes, so the no-rewind flag is not appropriate.

And it is never notified about the application pointer, so the flag is 
definitely not appropriate again.

>
> Do it mean pulseaudio cannot rewind dmix as dmix alway append data to hw
> device ?

No, see above.

Regarding dmix rewindability status: dmix is not actually rewindable 
(but pretends to be), and, while (unlike for extplug) I don't see any 
reason why it must be unrewindable, there is no opposition to mark it as 
unrewindable and then call the problem solved.

Regarding PulseAudio: for PATCH 1/4, it is completely irrelevant. 
Pulseaudio wants to be responsive to the events that new streams appear 
and existing ones change software-based volume. Let's suppose that it 
wants to respond in no later than 10 ms (just a made-up number). Then, 
at no point there should be more than 10 ms of buffered audio that 
cannot be replaced. If PulseAudio starts using this new flag, then it 
has to wake up the CPU every 10 ms. I.e. "let the audio DMA engine wake 
up every 0.125 ms, as it does by default, and let the CPU wake up every 
1000 ms unless something important happens (in which case there is a 
rewind)" would be replaced by "let the audio DMA engine wake up once in 
10 ms, and also let the CPU wake up every 10 ms", which is much worse. 
Unfortunately, the patch does not offer a way to say "let the DMA engine 
wake up every 10 ms, and CPU wake up every 1000 ms (and occasionally 
perform rewinds, but leave 10 ms untouched)".

>
>
>     3. I have not seen any justification for the drastic measure of
>     making a DMA-based device completely unrewindable. Maybe a more
>     polite "please make this a batch/blocktransfer card" request, thus
>     disallowing only sub-period rewinds, would still be useful for
>     powersaving, without killing dmix.
>
>
> Can those non-interleaved access pcm device with two periods support
> rewind ?

Yes, but very carefully. E.g., suppose that the period size is 10 ms. 
Then, this sequence of events is valid, does not produce xruns and does 
not cause the card to play stale data:

0 ms: Application fills 2 periods via mmap, notices the timestamp and 
starts the PCM.
10 ms: Hardware plays out the 1st period and wakes up the application. 
Application writes the 1st period via mmap and notices the timestamp.
20 ms: Hardware plays out the 2nd period and wakes up the application. 
Application writes the 2nd period via mmap and notices the timestamp.
23 ms: Application wakes up for an unrelated reason and notices the 
timestamp. It knows that the sound card still has 17 ms to play, and 10 
ms (one period, as usual for batch/blocktransfer cards) are 
unrewindable. So it rewinds 6 ms and writes new data.
30 ms: Hardware plays out the 1st period and wakes up the application. 
Application writes the 1st period via mmap and notices the timestamp.

... and so on.

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-14 17:39       ` Alexander E. Patrakov
@ 2015-07-15  1:23         ` Raymond Yau
  2015-07-15  4:51           ` Alexander E. Patrakov
  0 siblings, 1 reply; 34+ messages in thread
From: Raymond Yau @ 2015-07-15  1:23 UTC (permalink / raw)
  To: Alexander E. Patrakov
  Cc: Takashi Iwai, ALSA Development Mailing List, Pierre-Louis Bossart

>
> I expected an answer in the form "XXX mW out of YYY, for such-and-such
buffer and period sizes".

The new parameter is to inform the application that  the application can
only use sequential write and not random write

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-15  1:23         ` Raymond Yau
@ 2015-07-15  4:51           ` Alexander E. Patrakov
  0 siblings, 0 replies; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-15  4:51 UTC (permalink / raw)
  To: Raymond Yau
  Cc: Takashi Iwai, ALSA Development Mailing List, Pierre-Louis Bossart

15.07.2015 06:23, Raymond Yau wrote:
>
>  >
>  > I expected an answer in the form "XXX mW out of YYY, for
> such-and-such buffer and period sizes".
>
> The new parameter is to inform the application that  the application can
> only use sequential write and not random write

Please reread the description, it also works the other way:

"""Add new hw_params flag to explicitly tell driver that rewinds
will never be used."""

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 0/4] better support for bursty DMA usages
  2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
                   ` (4 preceding siblings ...)
  2015-07-08 14:31 ` [RFC PATCH 0/4] better support for bursty DMA usages Takashi Iwai
@ 2015-07-15 10:14 ` Raymond Yau
  2015-07-15 10:38   ` Alexander E. Patrakov
  5 siblings, 1 reply; 34+ messages in thread
From: Raymond Yau @ 2015-07-15 10:14 UTC (permalink / raw)
  To: Pierre-Louis Bossart; +Cc: ALSA Development Mailing List

>
> Set of patches to fix issues with hw_ptr fuzziness [1] and increased
buffering
>  w/ DSPs
>
> 1. disable rewinds to allow for new HDaudio SPIB DMA functionality (fetch
up to
> the application pointer, rewinds not supported)
> 2. report max in-flight bytes to avoid problems with stale data (late
wake-ups,
> rewinds)
>
> [1]
http://mailman.alsa-project.org/pipermail/alsa-devel/2015-June/093646.html
>
> TODO:
> 1. fixes and alsa-lib updates (compile-tested only for now)

http://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/src/modules/alsa/alsa-sink.c?id=cb55b00ccd25d965b1222e74375aee05427a449b

Do you need a new api snd_pcm_hw_params_can_rewind() for the application to
know pcm device does not support rewind ?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 0/4] better support for bursty DMA usages
  2015-07-15 10:14 ` Raymond Yau
@ 2015-07-15 10:38   ` Alexander E. Patrakov
  2015-07-22 14:44     ` Raymond Yau
  0 siblings, 1 reply; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-15 10:38 UTC (permalink / raw)
  To: Raymond Yau, Pierre-Louis Bossart; +Cc: ALSA Development Mailing List

15.07.2015 15:14, Raymond Yau wrote:
>> TODO:
>> 1. fixes and alsa-lib updates (compile-tested only for now)
>
> http://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/src/modules/alsa/alsa-sink.c?id=cb55b00ccd25d965b1222e74375aee05427a449b
>
> Do you need a new api snd_pcm_hw_params_can_rewind() for the application to
> know pcm device does not support rewind ?
Yes, we do need that.

Please see my LAC 2015 paper: http://lac.linuxaudio.org/2015/papers/10.pdf

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 3/4] ALSA: core: add report of max dma burst
  2015-07-11 18:28       ` Jaroslav Kysela
@ 2015-07-16  2:11         ` Raymond Yau
  0 siblings, 0 replies; 34+ messages in thread
From: Raymond Yau @ 2015-07-16  2:11 UTC (permalink / raw)
  To: Jaroslav Kysela
  Cc: Takashi Iwai, ALSA Development Mailing List,
	Alexander E. Patrakov, Pierre-Louis Bossart

>
>
> >>> +   unsigned int max_dma_burst;     /* R: max in-flight bytes,
indicates buffering before hw_ptr */
>
> I'm not sure if we should name fields like dma in the abstract ioctl
> layer. The words in the comment seems more appropriate or just remove
> dma and keep only max_burst.

Why not using hwptr_granularity if the usage is about the hw_ptr update
interval ?

max_dma_brust is confusing for firewire and usb audio

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 0/4] better support for bursty DMA usages
  2015-07-15 10:38   ` Alexander E. Patrakov
@ 2015-07-22 14:44     ` Raymond Yau
  0 siblings, 0 replies; 34+ messages in thread
From: Raymond Yau @ 2015-07-22 14:44 UTC (permalink / raw)
  To: Alexander E. Patrakov; +Cc: ALSA Development Mailing List, Pierre-Louis Bossart

>>>
>>> TODO:
>>> 1. fixes and alsa-lib updates (compile-tested only for now)
>>
>>
>>
http://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/src/modules/alsa/alsa-sink.c?id=cb55b00ccd25d965b1222e74375aee05427a449b
>>
>> Do you need a new api snd_pcm_hw_params_can_rewind() for the application
to
>> know pcm device does not support rewind ?
>
> Yes, we do need that.
>
> Please see my LAC 2015 paper: http://lac.linuxaudio.org/2015/papers/10.pdf

What kind of driver or plugin will have this flag  ?

snd-aloop,

ioplug : pulse and  jack

Are those usb-audio configure as usb-passthrough inside vm still rewindable
?

How about multi plug when one of the slaves  has this flag ?

file plugin depend on whether slave device support rewind or not

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-11 17:06   ` Alexander E. Patrakov
  2015-07-14  3:32     ` Raymond Yau
@ 2015-07-28 14:19     ` Pierre-Louis Bossart
  2015-07-28 15:43       ` Alexander E. Patrakov
  1 sibling, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-28 14:19 UTC (permalink / raw)
  To: Alexander E. Patrakov, alsa-devel; +Cc: Takashi Iwai

On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:
> 08.07.2015 15:10, Pierre-Louis Bossart wrote:
>> Add new hw_params flag to explicitly tell driver that rewinds
>> will never be used. This can be used by low-level driver to
>> optimize DMA operations and reduce power consumption.
>> Use this flag only when data written in ring buffer will
>> never be invalidated, e.g. any update of appl_ptr is final.
>>
>> Caveat: there is currently no way to query capabilities without
>> opening a pcm stream, so applications might need to serially
>> open all exposed devices, check what they support by looking at
>> hw_params->info and close them (this is what PulseAudio does so
>> might not be an issue)
>
> Thanks for the patch. I was also about to add this flag, for the use in
> userspace. It's good to know that it is also useful in the kernel space.
>
> I usually don't comment on kernel patches, but this one provokes some
> questions and thoughts.
>
> 1. What amount of power-saving are we talking about, for Intel chips?
> (ideally, this should be in the commit message)

The power savings will depend on non-audio parameters, so it's 
impossible to state a value. And it's probably not public information 
anyway.

>
> 2. Shouldn't SND_PCM_APPEND (the mode used by dmix and friends) be also
> made incompatible with this flag, because, when mixing, it essentially
> overwrites incompletely-mixed data, i.e. rewinds without saying the word
> "rewind"? Shouldn't all other kinds of freewheeling be made incompatible
> with this flag, because the card, essentially, is never told about the
> application pointer? Or is this a userspace-only concern?

Good point, I wasn't aware of this flag. willl look into it.

> 3. I have not seen any justification for the drastic measure of making a
> DMA-based device completely unrewindable. Maybe a more polite "please
> make this a batch/blocktransfer card" request, thus disallowing only
> sub-period rewinds, would still be useful for powersaving, without
> killing dmix.
>
> 4. If this "no rewinds" mode is not made the default, then exactly
> nobody will use it. Everyone except sound servers opens the default
> device with the default flags. I understand the potential to break
> existing userspace, especially PulseAudio, but we really need to think
> more here.

Not sure I understand the issue. when a new functionality is added it 
takes time to be adopted. If we can push it in sound servers first then 
it creates a wide pool of users from day1.

>
> So, here is an alternative (or maybe additional - your flag does make
> sense, after all) hackish idea that should be compatible with the
> existing userspace.
>
> Please make it possible to "allow", "put in auto mode" or "disallow" the
> device to DMA up to one period in advance. Please use the following
> logic when in "auto mode":

This is a separate discussion. the patches I wrote are only to make sure 
the hardware can safely fetch the data that is provided by userspace. I 
think the open is more how we configure the DMA burst, something I 
haven't looked at yet.

>
> If the card is batch or blocktransfer "by itself", let it be. We can't
> do anything here, the DMA operations are already "optimized" whether we
> want it or not.
>
> If the card does not support disabling the period wakeup, then FIXME - I
> don't know what to do. For compatibility, I think we can't optimize DMA
> operations here.
>
> If the card supports disabling the period wakeup, and it is disabled,
> then don't allow it to optimize DMA operations.
>
> If the card supports disabling the period wakeup, but it is not
> disabled, then allow it to optimize DMA operations.
>
> As you have probably guessed, the idea here is to detect typical setup
> made by PulseAudio. I suspect that this also affects CRAS.
>
> Also maybe this idea by David Henningsson is good:
>
> """
> E g, if the application sets a low period size
> but also sets the "disable period interrupts" flag, that could be an
> indicator that it wants lots of interrupts just to update the pointer,
> but nothing else. Maybe that's even the behaviour today (haven't checked).
> """
>
> (taken from
> http://mailman.alsa-project.org/pipermail/alsa-devel/2014-September/080885.html
> )
>
>>
>> Signed-off-by: Pierre-Louis Bossart
>> <pierre-louis.bossart@linux.intel.com>
>> ---
>>   include/sound/pcm.h         | 1 +
>>   include/uapi/sound/asound.h | 1 +
>>   sound/core/pcm_native.c     | 8 ++++++++
>>   3 files changed, 10 insertions(+)
>>
>> diff --git a/include/sound/pcm.h b/include/sound/pcm.h
>> index 691e7ee..25310b7 100644
>> --- a/include/sound/pcm.h
>> +++ b/include/sound/pcm.h
>> @@ -370,6 +370,7 @@ struct snd_pcm_runtime {
>>       unsigned int rate_num;
>>       unsigned int rate_den;
>>       unsigned int no_period_wakeup: 1;
>> +    unsigned int no_rewinds:1;
>>
>>       /* -- SW params -- */
>>       int tstamp_mode;        /* mmap timestamp is updated */
>> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
>> index a45be6b..b62b162 100644
>> --- a/include/uapi/sound/asound.h
>> +++ b/include/uapi/sound/asound.h
>> @@ -356,6 +356,7 @@ typedef int snd_pcm_hw_param_t;
>>   #define SNDRV_PCM_HW_PARAMS_NORESAMPLE    (1<<0)    /* avoid rate
>> resampling */
>>   #define SNDRV_PCM_HW_PARAMS_EXPORT_BUFFER    (1<<1)    /* export
>> buffer */
>>   #define SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP    (1<<2)    /* disable
>> period wakeups */
>> +#define SNDRV_PCM_HW_PARAMS_NO_REWINDS            (1<<3)    /*
>> disable rewinds */
>>
>>   struct snd_interval {
>>       unsigned int min, max;
>> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
>> index d126c03..a70e52d 100644
>> --- a/sound/core/pcm_native.c
>> +++ b/sound/core/pcm_native.c
>> @@ -543,6 +543,8 @@ static int snd_pcm_hw_params(struct
>> snd_pcm_substream *substream,
>>       runtime->no_period_wakeup =
>>               (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) &&
>>               (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP);
>> +    runtime->no_rewinds =
>> +            params->flags & SNDRV_PCM_HW_PARAMS_NO_REWINDS;
>>
>>       bits = snd_pcm_format_physical_width(runtime->format);
>>       runtime->sample_bits = bits;
>> @@ -2428,6 +2430,9 @@ static snd_pcm_sframes_t
>> snd_pcm_playback_rewind(struct snd_pcm_substream *subst
>>       if (frames == 0)
>>           return 0;
>>
>> +    if (runtime->no_rewinds)
>> +        return 0;
>> +
>>       snd_pcm_stream_lock_irq(substream);
>>       switch (runtime->status->state) {
>>       case SNDRV_PCM_STATE_PREPARED:
>> @@ -2476,6 +2481,9 @@ static snd_pcm_sframes_t
>> snd_pcm_capture_rewind(struct snd_pcm_substream *substr
>>       if (frames == 0)
>>           return 0;
>>
>> +    if (runtime->no_rewinds)
>> +        return 0;
>> +
>>       snd_pcm_stream_lock_irq(substream);
>>       switch (runtime->status->state) {
>>       case SNDRV_PCM_STATE_PREPARED:
>>
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-28 14:19     ` Pierre-Louis Bossart
@ 2015-07-28 15:43       ` Alexander E. Patrakov
  2015-07-29 17:46         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-28 15:43 UTC (permalink / raw)
  To: Pierre-Louis Bossart, alsa-devel; +Cc: Takashi Iwai

28.07.2015 19:19, Pierre-Louis Bossart wrote:
> On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:

>> 3. I have not seen any justification for the drastic measure of making a
>> DMA-based device completely unrewindable. Maybe a more polite "please
>> make this a batch/blocktransfer card" request, thus disallowing only
>> sub-period rewinds, would still be useful for powersaving, without
>> killing dmix.
>>
>> 4. If this "no rewinds" mode is not made the default, then exactly
>> nobody will use it. Everyone except sound servers opens the default
>> device with the default flags. I understand the potential to break
>> existing userspace, especially PulseAudio, but we really need to think
>> more here.
>
> Not sure I understand the issue. when a new functionality is added it
> takes time to be adopted. If we can push it in sound servers first then
> it creates a wide pool of users from day1.

The issue is that the proposed functionality, in the currently proposed 
"I promise to never rewind" form, is nearly useless by its very nature 
for any sound server that cares about power consumption significantly 
more than dmix does. It is indeed usable by JACK (by its design, it 
doesn't rewind, and uses low latency) and CRAS (which currently doesn't 
rewind, but I am not sure whether this is a bug or a deliberate decision 
based on non-public measurements of extra power savings that "rewinds + 
high latency" would allow).

As I have already explained, dmix, when mixing, writes to the hardware 
buffer multiple times, which is equivalent to rewinding. PulseAudio uses 
rewinds for a very specific purpose - to avoid CPU wakeups in the common 
"nothing unexpected happened" case, i.e. to allow very high average 
latency while keeping the latency of reaction to unexpected events low. 
So, by convincing PulseAudio to never rewind and to tell the driver 
about this fact, you can save some power in the card, but (if the 
proponents of rewinds are right - see my earlier request for non-public 
information) will waste way more power due to the need for much more 
frequent CPU wakeups ("cannot rewind but have to react to unexpected 
events within 20 ms" = "must wake up every 20 ms").

The only cases where this flag can be useful for sound servers are:

1. Sound servers that already, by design, always waste CPU power by 
running at low latency;

2. Sound cards like USB audio where the kernel already contains a 
low-latency thread that copies data to the card in the serialized way.

For PulseAudio on Intel hardware and HDA-compatible card, it would (if 
the situation on my Sony laptop is an exception and not the rule), as I 
have already mentioned, only trade some power consumption improvement in 
the card for bigger losses in the CPU.

And with (2), I am not sure whether this is a win at all: you are likely 
trading a kernel thread that wakes up, say, every 6 ms (with userspace 
waking up very rarely), to a userspace wakeup that happens every 6 or 
maybe 20 ms (as the sound server chooses), but incurs more context 
switches. It does shift the policy decision about the wakeup rate to 
userspace, though.

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-28 15:43       ` Alexander E. Patrakov
@ 2015-07-29 17:46         ` Pierre-Louis Bossart
  2015-07-30  6:11           ` Alexander E. Patrakov
  0 siblings, 1 reply; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-29 17:46 UTC (permalink / raw)
  To: Alexander E. Patrakov, alsa-devel; +Cc: Takashi Iwai

On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:
> 28.07.2015 19:19, Pierre-Louis Bossart wrote:
>> On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:
>
>>> 3. I have not seen any justification for the drastic measure of making a
>>> DMA-based device completely unrewindable. Maybe a more polite "please
>>> make this a batch/blocktransfer card" request, thus disallowing only
>>> sub-period rewinds, would still be useful for powersaving, without
>>> killing dmix.
>>>
>>> 4. If this "no rewinds" mode is not made the default, then exactly
>>> nobody will use it. Everyone except sound servers opens the default
>>> device with the default flags. I understand the potential to break
>>> existing userspace, especially PulseAudio, but we really need to think
>>> more here.
>>
>> Not sure I understand the issue. when a new functionality is added it
>> takes time to be adopted. If we can push it in sound servers first then
>> it creates a wide pool of users from day1.
>
> The issue is that the proposed functionality, in the currently proposed
> "I promise to never rewind" form, is nearly useless by its very nature
> for any sound server that cares about power consumption significantly
> more than dmix does. It is indeed usable by JACK (by its design, it
> doesn't rewind, and uses low latency) and CRAS (which currently doesn't
> rewind, but I am not sure whether this is a bug or a deliberate decision
> based on non-public measurements of extra power savings that "rewinds +
> high latency" would allow).
>
> As I have already explained, dmix, when mixing, writes to the hardware
> buffer multiple times, which is equivalent to rewinding. PulseAudio uses
> rewinds for a very specific purpose - to avoid CPU wakeups in the common
> "nothing unexpected happened" case, i.e. to allow very high average
> latency while keeping the latency of reaction to unexpected events low.
> So, by convincing PulseAudio to never rewind and to tell the driver
> about this fact, you can save some power in the card, but (if the
> proponents of rewinds are right - see my earlier request for non-public
> information) will waste way more power due to the need for much more
> frequent CPU wakeups ("cannot rewind but have to react to unexpected
> events within 20 ms" = "must wake up every 20 ms").
>
> The only cases where this flag can be useful for sound servers are:
>
> 1. Sound servers that already, by design, always waste CPU power by
> running at low latency;

Your classification is not exhaustive enough. You need to take into 
account sound servers that have two outputs per endpoint, one for 
low-latency and one for low-power/deep-buffer with a DSP/hardware mixer.
Power is also no longer directly linked to wakeup rates only but also to 
DDR access patterns.
When a larger on-chip buffer is available, disabling rewinds does let 
the hardware know it can safely fetch data in bigger chunks rather than 
use small data bursts.
This sort of capabilities is becoming prevalent these days, and not just 
on Intel platforms - see e.g. Nexus5/9 devices -, and maybe PulseAudio 
and friends need to evolve to make use of these resources rather than 
stay the course with single output and rewind mechanisms that prevent 
power optimizations on newer platforms.

>
> 2. Sound cards like USB audio where the kernel already contains a
> low-latency thread that copies data to the card in the serialized way.
>
> For PulseAudio on Intel hardware and HDA-compatible card, it would (if
> the situation on my Sony laptop is an exception and not the rule), as I
> have already mentioned, only trade some power consumption improvement in
> the card for bigger losses in the CPU.
>
> And with (2), I am not sure whether this is a win at all: you are likely
> trading a kernel thread that wakes up, say, every 6 ms (with userspace
> waking up very rarely), to a userspace wakeup that happens every 6 or
> maybe 20 ms (as the sound server chooses), but incurs more context
> switches. It does shift the policy decision about the wakeup rate to
> userspace, though.
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-29 17:46         ` Pierre-Louis Bossart
@ 2015-07-30  6:11           ` Alexander E. Patrakov
  2015-07-30 13:46             ` Pierre-Louis Bossart
  0 siblings, 1 reply; 34+ messages in thread
From: Alexander E. Patrakov @ 2015-07-30  6:11 UTC (permalink / raw)
  To: Pierre-Louis Bossart, alsa-devel; +Cc: Takashi Iwai

29.07.2015 22:46, Pierre-Louis Bossart wrote:
> On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:
>> 28.07.2015 19:19, Pierre-Louis Bossart wrote:
>>> On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:
>>
>>>> 3. I have not seen any justification for the drastic measure of
>>>> making a
>>>> DMA-based device completely unrewindable. Maybe a more polite "please
>>>> make this a batch/blocktransfer card" request, thus disallowing only
>>>> sub-period rewinds, would still be useful for powersaving, without
>>>> killing dmix.
>>>>
>>>> 4. If this "no rewinds" mode is not made the default, then exactly
>>>> nobody will use it. Everyone except sound servers opens the default
>>>> device with the default flags. I understand the potential to break
>>>> existing userspace, especially PulseAudio, but we really need to think
>>>> more here.
>>>
>>> Not sure I understand the issue. when a new functionality is added it
>>> takes time to be adopted. If we can push it in sound servers first then
>>> it creates a wide pool of users from day1.
>>
>> The issue is that the proposed functionality, in the currently proposed
>> "I promise to never rewind" form, is nearly useless by its very nature
>> for any sound server that cares about power consumption significantly
>> more than dmix does. It is indeed usable by JACK (by its design, it
>> doesn't rewind, and uses low latency) and CRAS (which currently doesn't
>> rewind, but I am not sure whether this is a bug or a deliberate decision
>> based on non-public measurements of extra power savings that "rewinds +
>> high latency" would allow).
>>
>> As I have already explained, dmix, when mixing, writes to the hardware
>> buffer multiple times, which is equivalent to rewinding. PulseAudio uses
>> rewinds for a very specific purpose - to avoid CPU wakeups in the common
>> "nothing unexpected happened" case, i.e. to allow very high average
>> latency while keeping the latency of reaction to unexpected events low.
>> So, by convincing PulseAudio to never rewind and to tell the driver
>> about this fact, you can save some power in the card, but (if the
>> proponents of rewinds are right - see my earlier request for non-public
>> information) will waste way more power due to the need for much more
>> frequent CPU wakeups ("cannot rewind but have to react to unexpected
>> events within 20 ms" = "must wake up every 20 ms").
>>
>> The only cases where this flag can be useful for sound servers are:
>>
>> 1. Sound servers that already, by design, always waste CPU power by
>> running at low latency;
>
> Your classification is not exhaustive enough. You need to take into
> account sound servers that have two outputs per endpoint, one for
> low-latency and one for low-power/deep-buffer with a DSP/hardware mixer.
> Power is also no longer directly linked to wakeup rates only but also to
> DDR access patterns.
> When a larger on-chip buffer is available, disabling rewinds does let
> the hardware know it can safely fetch data in bigger chunks rather than
> use small data bursts.
> This sort of capabilities is becoming prevalent these days, and not just
> on Intel platforms - see e.g. Nexus5/9 devices -, and maybe PulseAudio
> and friends need to evolve to make use of these resources rather than
> stay the course with single output and rewind mechanisms that prevent
> power optimizations on newer platforms.

OK, I see some new points there, but still no full picture. My point 
still is: deep (>= 30-40 ms) buffer without rewinds or without any other 
means to seamlessly cancel and replace what's there (up to a certain 
unrewindable latency, i.e. the same 30-40 ms, that a human won't object 
to) is absolutely useless to any current or future sound server.

So, in your example, "I promise to never rewind" flag can definitely be 
set for a low-latency endpoint, but not for the other one. That other 
endpoint would benefit from a way to say "I promise to never rewind 
closer than X samples to the hardware pointer" API, which you have 
flagged as a separate discussion. I would also not object to that 
endpoint becoming a batch/blocktransfer PCM.

Well, there is a hackish way to say "I promise to never rewind" on both 
endpoints even with a deep buffer on a low-power endpoint. The way is: 
instead of rewinding, open the low-latency endpoint and play a 
correction signal (i.e. the difference between the actual and the wanted 
contents of the deep buffer) through it, with low latency, and let the 
hardware mix. But I don't like this, for purely subjective reasons and 
for the need for that endpoint to have twice as much output amplitude 
than one would normally need. What do you propose instead?

Still, I think that a more general API that says "I promise to never 
rewind closer than X samples to the hardware pointer" would be more 
useful than the black-and-white "I promise to never rewind" call - 
assuming that it can be expressed to the hardware.

-- 
Alexander E. Patrakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds
  2015-07-30  6:11           ` Alexander E. Patrakov
@ 2015-07-30 13:46             ` Pierre-Louis Bossart
  0 siblings, 0 replies; 34+ messages in thread
From: Pierre-Louis Bossart @ 2015-07-30 13:46 UTC (permalink / raw)
  To: Alexander E. Patrakov, alsa-devel; +Cc: Takashi Iwai

 On 07/30/2015 01:11 AM, Alexander E. Patrakov wrote:
> 29.07.2015 22:46, Pierre-Louis Bossart wrote:
>> On 7/28/15 10:43 AM, Alexander E. Patrakov wrote:
>>> 28.07.2015 19:19, Pierre-Louis Bossart wrote:
>>>> On 7/11/15 12:06 PM, Alexander E. Patrakov wrote:
>>> 
>>>>> 3. I have not seen any justification for the drastic measure
>>>>> of making a DMA-based device completely unrewindable. Maybe a
>>>>> more polite "please make this a batch/blocktransfer card"
>>>>> request, thus disallowing only sub-period rewinds, would
>>>>> still be useful for powersaving, without killing dmix.
>>>>> 
>>>>> 4. If this "no rewinds" mode is not made the default, then
>>>>> exactly nobody will use it. Everyone except sound servers
>>>>> opens the default device with the default flags. I understand
>>>>> the potential to break existing userspace, especially
>>>>> PulseAudio, but we really need to think more here.
>>>> 
>>>> Not sure I understand the issue. when a new functionality is
>>>> added it takes time to be adopted. If we can push it in sound
>>>> servers first then it creates a wide pool of users from day1.
>>> 
>>> The issue is that the proposed functionality, in the currently
>>> proposed "I promise to never rewind" form, is nearly useless by
>>> its very nature for any sound server that cares about power
>>> consumption significantly more than dmix does. It is indeed
>>> usable by JACK (by its design, it doesn't rewind, and uses low
>>> latency) and CRAS (which currently doesn't rewind, but I am not
>>> sure whether this is a bug or a deliberate decision based on
>>> non-public measurements of extra power savings that "rewinds + 
>>> high latency" would allow).
>>> 
>>> As I have already explained, dmix, when mixing, writes to the
>>> hardware buffer multiple times, which is equivalent to rewinding.
>>> PulseAudio uses rewinds for a very specific purpose - to avoid
>>> CPU wakeups in the common "nothing unexpected happened" case,
>>> i.e. to allow very high average latency while keeping the latency
>>> of reaction to unexpected events low. So, by convincing
>>> PulseAudio to never rewind and to tell the driver about this
>>> fact, you can save some power in the card, but (if the proponents
>>> of rewinds are right - see my earlier request for non-public 
>>> information) will waste way more power due to the need for much
>>> more frequent CPU wakeups ("cannot rewind but have to react to
>>> unexpected events within 20 ms" = "must wake up every 20 ms").
>>> 
>>> The only cases where this flag can be useful for sound servers
>>> are:
>>> 
>>> 1. Sound servers that already, by design, always waste CPU power
>>> by running at low latency;
>> 
>> Your classification is not exhaustive enough. You need to take
>> into account sound servers that have two outputs per endpoint, one
>> for low-latency and one for low-power/deep-buffer with a
>> DSP/hardware mixer. Power is also no longer directly linked to
>> wakeup rates only but also to DDR access patterns. When a larger
>> on-chip buffer is available, disabling rewinds does let the
>> hardware know it can safely fetch data in bigger chunks rather
>> than use small data bursts. This sort of capabilities is becoming
>> prevalent these days, and not just on Intel platforms - see e.g.
>> Nexus5/9 devices -, and maybe PulseAudio and friends need to evolve
>> to make use of these resources rather than stay the course with
>> single output and rewind mechanisms that prevent power
>> optimizations on newer platforms.
> 
> OK, I see some new points there, but still no full picture. My point
> still is: deep (>= 30-40 ms) buffer without rewinds or without any
> other means to seamlessly cancel and replace what's there (up to at
> certain unrewindable latency, i.e. the same 30-40 ms, that a human
> won't object to) is absolutely useless to any current or future sound
> server.
> 
> So, in your example, "I promise to never rewind" flag can definitely
> be set for a low-latency endpoint, but not for the other one. That
> other endpoint would benefit from a way to say "I promise to never
> rewind closer than X samples to the hardware pointer" API, which youen
> have flagged as a separate discussion. I would also not object to
> that endpoint becoming a batch/blocktransfer PCM. 

You are using low-latency with two different meanings : one is a fixed low-latency for music applications, where rewinds are indeed useless. But there is also a case for variable latencies when you are sharing the same output between apps having different needs, and that could also be used with very limited buffering.

Also for the deep-buffer solution, as long as you dedicate this output to a single media consumption app (playback, video) then you have no need for rewinds.

In other words, whenever an output is used in a exclusive manner without being shared between apps having different latency needs then you can optimize. For everything else keep using rewinds with the max_burst information to know by how much you can rewind (and accepting that some power optimizations linked to data transfers will not happen)

Also the notion of block transfer is too limiting, the hardware granularity may be smaller than a period and the hardware may 
still be able to provide precise data on the hw_ptr and delays, so the traditional batch/block transfer definition is a tad obsolete.


> Well, there s a hackish way to say "I promisle to never rewind" on
> both endpoints even with a deep buffer on a low-power endpoint. The
> way is: instead of rewinding, open the low-latency endpoint and play
> a correction signal (i.e. the difference between the actual and the
> wanted contents of the deep buffer) through it, with low latency, and
> let the hardware mix. But I don't like this, for purely subjective
> reasons and for the need for that endpoint to have twice as much
> output amplitude than one would normally need. What do you propose
> instead?
> 
> Still, I think that a more general API that says "I promise to never
> rewind closer than X samples to the hardware pointer" would be morep
> useful than the black-and-white "I promise to never rewind" call
> assuming that it can be expressed to the hardware.

the patches provide information on the max burst so you can know by how much to rewind. Setting the max burst based on a negotiation between driver and app is an interesting concept that we looked at but there aren't too many devices that support this capability so this might not be worth the effort.
 

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-07-30 13:47 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-08 10:10 [RFC PATCH 0/4] better support for bursty DMA usages Pierre-Louis Bossart
2015-07-08 10:10 ` [RFC PATCH 1/4] ALSA: core: let low-level driver or userspace disable rewinds Pierre-Louis Bossart
2015-07-08 14:21   ` Takashi Iwai
2015-07-08 16:58     ` Pierre-Louis Bossart
2015-07-11 17:06   ` Alexander E. Patrakov
2015-07-14  3:32     ` Raymond Yau
2015-07-14 17:39       ` Alexander E. Patrakov
2015-07-15  1:23         ` Raymond Yau
2015-07-15  4:51           ` Alexander E. Patrakov
2015-07-28 14:19     ` Pierre-Louis Bossart
2015-07-28 15:43       ` Alexander E. Patrakov
2015-07-29 17:46         ` Pierre-Louis Bossart
2015-07-30  6:11           ` Alexander E. Patrakov
2015-07-30 13:46             ` Pierre-Louis Bossart
2015-07-08 10:10 ` [RFC PATCH 2/4] ALSA: core: add .notify callback for pcm ops Pierre-Louis Bossart
2015-07-08 14:27   ` Takashi Iwai
2015-07-08 17:10     ` Pierre-Louis Bossart
2015-07-09 14:44       ` Takashi Iwai
2015-07-09  7:25     ` Raymond Yau
2015-07-09  7:35       ` Pierre-Louis Bossart
2015-07-08 10:10 ` [RFC PATCH 3/4] ALSA: core: add report of max dma burst Pierre-Louis Bossart
2015-07-08 14:37   ` Takashi Iwai
2015-07-08 17:46     ` Pierre-Louis Bossart
2015-07-10  2:35       ` Raymond Yau
2015-07-10 17:13         ` Lars-Peter Clausen
2015-07-11 17:46     ` Alexander E. Patrakov
2015-07-11 18:28       ` Jaroslav Kysela
2015-07-16  2:11         ` Raymond Yau
2015-07-08 10:10 ` [RFC PATCH 4/4] ALSA: hda: add default value for max_dma_burst Pierre-Louis Bossart
2015-07-08 14:31 ` [RFC PATCH 0/4] better support for bursty DMA usages Takashi Iwai
2015-07-08 17:50   ` Pierre-Louis Bossart
2015-07-15 10:14 ` Raymond Yau
2015-07-15 10:38   ` Alexander E. Patrakov
2015-07-22 14:44     ` Raymond Yau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.