[PATCH 0/3] allow zram to use zbud as underlying allocator

All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-14 13:49 ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:49 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

While using ZRAM on a small RAM footprint devices, together with KSM, I ran into several occasions when moving pages from compressed swap back into the "normal" part of RAM caused significant latencies in system operation. By using zbud I lose in compression ratio but gain in determinism, lower latencies and lower fragmentation, so in the coming patches I tried to generalize what I've done to enable zbud for zram so far.

-- 
Vitaly Wool <vitalywool@gmail.com>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-14 13:49 ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:49 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

While using ZRAM on a small RAM footprint devices, together with KSM, I ran into several occasions when moving pages from compressed swap back into the "normal" part of RAM caused significant latencies in system operation. By using zbud I lose in compression ratio but gain in determinism, lower latencies and lower fragmentation, so in the coming patches I tried to generalize what I've done to enable zbud for zram so far.

-- 
Vitaly Wool <vitalywool@gmail.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-14 13:50   ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:50 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

It makes sense to have control over what compression ratios are
ok to store pages uncompressed and what not. Moreover, if we end
up using zbud allocator for zram, any attempt to allocate a whole
page will fail, so we may want to avoid this as much as possible.

So, let's have max_zpage_size configurable as a module parameter.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/zram_drv.c | 13 +++++++++++++
 drivers/block/zram/zram_drv.h | 16 ----------------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9fa15bb..6d9f1d1 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
 
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
+static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
 
 static inline void deprecated_attr_warn(const char *name)
 {
@@ -1411,6 +1412,16 @@ static int __init zram_init(void)
 		return ret;
 	}
 
+	/*
+	 * max_zpage_size must be less than or equal to:
+	 * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
+	 * always return failure.
+	 */
+	if (max_zpage_size > PAGE_SIZE) {
+		pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
+		return -EINVAL;
+	}
+
 	zram_major = register_blkdev(0, "zram");
 	if (zram_major <= 0) {
 		pr_err("Unable to get major number\n");
@@ -1444,6 +1455,8 @@ module_exit(zram_exit);
 
 module_param(num_devices, uint, 0);
 MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
+module_param(max_zpage_size, ulong, 0);
+MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 8e92339..3a29c33 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -20,22 +20,6 @@
 
 #include "zcomp.h"
 
-/*-- Configurable parameters */
-
-/*
- * Pages that compress to size greater than this are stored
- * uncompressed in memory.
- */
-static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
-
-/*
- * NOTE: max_zpage_size must be less than or equal to:
- *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
- * always return failure.
- */
-
-/*-- End of configurable params */
-
 #define SECTOR_SHIFT		9
 #define SECTORS_PER_PAGE_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
 #define SECTORS_PER_PAGE	(1 << SECTORS_PER_PAGE_SHIFT)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 1/3] zram: make max_zpage_size configurable
@ 2015-09-14 13:50   ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:50 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

It makes sense to have control over what compression ratios are
ok to store pages uncompressed and what not. Moreover, if we end
up using zbud allocator for zram, any attempt to allocate a whole
page will fail, so we may want to avoid this as much as possible.

So, let's have max_zpage_size configurable as a module parameter.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/zram_drv.c | 13 +++++++++++++
 drivers/block/zram/zram_drv.h | 16 ----------------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9fa15bb..6d9f1d1 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
 
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
+static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
 
 static inline void deprecated_attr_warn(const char *name)
 {
@@ -1411,6 +1412,16 @@ static int __init zram_init(void)
 		return ret;
 	}
 
+	/*
+	 * max_zpage_size must be less than or equal to:
+	 * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
+	 * always return failure.
+	 */
+	if (max_zpage_size > PAGE_SIZE) {
+		pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
+		return -EINVAL;
+	}
+
 	zram_major = register_blkdev(0, "zram");
 	if (zram_major <= 0) {
 		pr_err("Unable to get major number\n");
@@ -1444,6 +1455,8 @@ module_exit(zram_exit);
 
 module_param(num_devices, uint, 0);
 MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
+module_param(max_zpage_size, ulong, 0);
+MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 8e92339..3a29c33 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -20,22 +20,6 @@
 
 #include "zcomp.h"
 
-/*-- Configurable parameters */
-
-/*
- * Pages that compress to size greater than this are stored
- * uncompressed in memory.
- */
-static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
-
-/*
- * NOTE: max_zpage_size must be less than or equal to:
- *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
- * always return failure.
- */
-
-/*-- End of configurable params */
-
 #define SECTOR_SHIFT		9
 #define SECTORS_PER_PAGE_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
 #define SECTORS_PER_PAGE	(1 << SECTORS_PER_PAGE_SHIFT)
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-14 13:51   ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:51 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

As a preparation step for zram to be able to use common zpool API,
there has to be some alignment done on it. This patch adds
functions that correspond to zsmalloc-specific API to the common
zpool API and takes care of the callbacks that have to be
introduced, too.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/zram_drv.c |  4 ++--
 include/linux/zpool.h         | 14 ++++++++++++++
 include/linux/zsmalloc.h      |  8 ++------
 mm/zbud.c                     | 12 ++++++++++++
 mm/zpool.c                    | 22 ++++++++++++++++++++++
 mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
 6 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6d9f1d1..49d5a65 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
 	struct zram *zram = dev_to_zram(dev);
-	struct zs_pool_stats pool_stats;
+	struct zpool_stats pool_stats;
 	u64 orig_size, mem_used = 0;
 	long max_used;
 	ssize_t ret;
 
-	memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
+	memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
diff --git a/include/linux/zpool.h b/include/linux/zpool.h
index 42f8ec9..f64cf86 100644
--- a/include/linux/zpool.h
+++ b/include/linux/zpool.h
@@ -17,6 +17,11 @@ struct zpool_ops {
 	int (*evict)(struct zpool *pool, unsigned long handle);
 };
 
+struct zpool_stats {
+	/* How many pages were migrated (freed) */
+	unsigned long pages_compacted;
+};
+
 /*
  * Control how a handle is mapped.  It will be ignored if the
  * implementation does not support it.  Its use is optional.
@@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
 
 void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
 
+int zpool_compact(struct zpool *pool, unsigned long *compacted);
+
+void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
+
 u64 zpool_get_total_size(struct zpool *pool);
 
 
@@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
  * @shrink:	shrink the pool.
  * @map:	map a handle.
  * @unmap:	unmap a handle.
+ * @compact:	try to run compaction for the pool
+ * @stats:	get statistics for the pool
  * @total_size:	get total size of a pool.
  *
  * This is created by a zpool implementation and registered
@@ -98,6 +109,9 @@ struct zpool_driver {
 				enum zpool_mapmode mm);
 	void (*unmap)(void *pool, unsigned long handle);
 
+	int (*compact)(void *pool, unsigned long *compacted);
+	void (*stats)(void *pool, struct zpool_stats *stats);
+
 	u64 (*total_size)(void *pool);
 };
 
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 6398dfa..5aee1c7 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -15,6 +15,7 @@
 #define _ZS_MALLOC_H_
 
 #include <linux/types.h>
+#include <linux/zpool.h>
 
 /*
  * zsmalloc mapping modes
@@ -34,11 +35,6 @@ enum zs_mapmode {
 	 */
 };
 
-struct zs_pool_stats {
-	/* How many pages were migrated (freed) */
-	unsigned long pages_compacted;
-};
-
 struct zs_pool;
 
 struct zs_pool *zs_create_pool(char *name, gfp_t flags);
@@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 unsigned long zs_get_total_pages(struct zs_pool *pool);
 unsigned long zs_compact(struct zs_pool *pool);
 
-void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
+void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
 #endif
diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..0963342 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
 	zbud_unmap(pool, handle);
 }
 
+static int zbud_zpool_compact(void *pool, unsigned long *compacted)
+{
+	return -EINVAL;
+}
+
+static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
+{
+	/* no-op */
+}
+
 static u64 zbud_zpool_total_size(void *pool)
 {
 	return zbud_get_pool_size(pool) * PAGE_SIZE;
@@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
 	.shrink =	zbud_zpool_shrink,
 	.map =		zbud_zpool_map,
 	.unmap =	zbud_zpool_unmap,
+	.compact =	zbud_zpool_compact,
+	.stats =	zbud_zpool_stats,
 	.total_size =	zbud_zpool_total_size,
 };
 
diff --git a/mm/zpool.c b/mm/zpool.c
index 8f670d3..15a171a 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
 }
 
 /**
+ * zpool_compact() - try to run compaction over zpool
+ * @pool	The zpool to compact
+ * @compacted	The number of migrated pages
+ *
+ * Returns: 0 on success, error code otherwise
+ */
+int zpool_compact(struct zpool *zpool, unsigned long *compacted)
+{
+	return zpool->driver->compact(zpool->pool, compacted);
+}
+
+/**
+ * zpool_stats() - obtain zpool statistics
+ * @pool	The zpool to get statistics for
+ * @zstats	stats to fill in
+ */
+void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
+{
+	zpool->driver->stats(zpool->pool, zstats);
+}
+
+/**
  * zpool_get_total_size() - The total size of the pool
  * @pool	The zpool to check
  *
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f135b1b..f002f57 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -245,7 +245,7 @@ struct zs_pool {
 	gfp_t flags;	/* allocation flags used when growing pool */
 	atomic_long_t pages_allocated;
 
-	struct zs_pool_stats stats;
+	struct zpool_stats stats;
 
 	/* Compact classes */
 	struct shrinker shrinker;
@@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
 	zs_unmap_object(pool, handle);
 }
 
+static int zs_zpool_compact(void *pool, unsigned long *compacted)
+{
+	unsigned long c = zs_compact(pool);
+
+	if (compacted)
+		*compacted = c;
+	return 0;
+}
+
+
+static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
+{
+	zs_pool_stats(pool, stats);
+}
+
 static u64 zs_zpool_total_size(void *pool)
 {
 	return zs_get_total_pages(pool) << PAGE_SHIFT;
@@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
 	.shrink =	zs_zpool_shrink,
 	.map =		zs_zpool_map,
 	.unmap =	zs_zpool_unmap,
+	.compact =	zs_zpool_compact,
+	.stats =	zs_zpool_stats,
 	.total_size =	zs_zpool_total_size,
 };
 
@@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_compact);
 
-void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
+void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
 {
-	memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
+	memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
 }
 EXPORT_SYMBOL_GPL(zs_pool_stats);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
@ 2015-09-14 13:51   ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:51 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

As a preparation step for zram to be able to use common zpool API,
there has to be some alignment done on it. This patch adds
functions that correspond to zsmalloc-specific API to the common
zpool API and takes care of the callbacks that have to be
introduced, too.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/zram_drv.c |  4 ++--
 include/linux/zpool.h         | 14 ++++++++++++++
 include/linux/zsmalloc.h      |  8 ++------
 mm/zbud.c                     | 12 ++++++++++++
 mm/zpool.c                    | 22 ++++++++++++++++++++++
 mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
 6 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6d9f1d1..49d5a65 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
 	struct zram *zram = dev_to_zram(dev);
-	struct zs_pool_stats pool_stats;
+	struct zpool_stats pool_stats;
 	u64 orig_size, mem_used = 0;
 	long max_used;
 	ssize_t ret;
 
-	memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
+	memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
diff --git a/include/linux/zpool.h b/include/linux/zpool.h
index 42f8ec9..f64cf86 100644
--- a/include/linux/zpool.h
+++ b/include/linux/zpool.h
@@ -17,6 +17,11 @@ struct zpool_ops {
 	int (*evict)(struct zpool *pool, unsigned long handle);
 };
 
+struct zpool_stats {
+	/* How many pages were migrated (freed) */
+	unsigned long pages_compacted;
+};
+
 /*
  * Control how a handle is mapped.  It will be ignored if the
  * implementation does not support it.  Its use is optional.
@@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
 
 void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
 
+int zpool_compact(struct zpool *pool, unsigned long *compacted);
+
+void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
+
 u64 zpool_get_total_size(struct zpool *pool);
 
 
@@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
  * @shrink:	shrink the pool.
  * @map:	map a handle.
  * @unmap:	unmap a handle.
+ * @compact:	try to run compaction for the pool
+ * @stats:	get statistics for the pool
  * @total_size:	get total size of a pool.
  *
  * This is created by a zpool implementation and registered
@@ -98,6 +109,9 @@ struct zpool_driver {
 				enum zpool_mapmode mm);
 	void (*unmap)(void *pool, unsigned long handle);
 
+	int (*compact)(void *pool, unsigned long *compacted);
+	void (*stats)(void *pool, struct zpool_stats *stats);
+
 	u64 (*total_size)(void *pool);
 };
 
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 6398dfa..5aee1c7 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -15,6 +15,7 @@
 #define _ZS_MALLOC_H_
 
 #include <linux/types.h>
+#include <linux/zpool.h>
 
 /*
  * zsmalloc mapping modes
@@ -34,11 +35,6 @@ enum zs_mapmode {
 	 */
 };
 
-struct zs_pool_stats {
-	/* How many pages were migrated (freed) */
-	unsigned long pages_compacted;
-};
-
 struct zs_pool;
 
 struct zs_pool *zs_create_pool(char *name, gfp_t flags);
@@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
 unsigned long zs_get_total_pages(struct zs_pool *pool);
 unsigned long zs_compact(struct zs_pool *pool);
 
-void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
+void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
 #endif
diff --git a/mm/zbud.c b/mm/zbud.c
index fa48bcdf..0963342 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
 	zbud_unmap(pool, handle);
 }
 
+static int zbud_zpool_compact(void *pool, unsigned long *compacted)
+{
+	return -EINVAL;
+}
+
+static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
+{
+	/* no-op */
+}
+
 static u64 zbud_zpool_total_size(void *pool)
 {
 	return zbud_get_pool_size(pool) * PAGE_SIZE;
@@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
 	.shrink =	zbud_zpool_shrink,
 	.map =		zbud_zpool_map,
 	.unmap =	zbud_zpool_unmap,
+	.compact =	zbud_zpool_compact,
+	.stats =	zbud_zpool_stats,
 	.total_size =	zbud_zpool_total_size,
 };
 
diff --git a/mm/zpool.c b/mm/zpool.c
index 8f670d3..15a171a 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
 }
 
 /**
+ * zpool_compact() - try to run compaction over zpool
+ * @pool	The zpool to compact
+ * @compacted	The number of migrated pages
+ *
+ * Returns: 0 on success, error code otherwise
+ */
+int zpool_compact(struct zpool *zpool, unsigned long *compacted)
+{
+	return zpool->driver->compact(zpool->pool, compacted);
+}
+
+/**
+ * zpool_stats() - obtain zpool statistics
+ * @pool	The zpool to get statistics for
+ * @zstats	stats to fill in
+ */
+void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
+{
+	zpool->driver->stats(zpool->pool, zstats);
+}
+
+/**
  * zpool_get_total_size() - The total size of the pool
  * @pool	The zpool to check
  *
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index f135b1b..f002f57 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -245,7 +245,7 @@ struct zs_pool {
 	gfp_t flags;	/* allocation flags used when growing pool */
 	atomic_long_t pages_allocated;
 
-	struct zs_pool_stats stats;
+	struct zpool_stats stats;
 
 	/* Compact classes */
 	struct shrinker shrinker;
@@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
 	zs_unmap_object(pool, handle);
 }
 
+static int zs_zpool_compact(void *pool, unsigned long *compacted)
+{
+	unsigned long c = zs_compact(pool);
+
+	if (compacted)
+		*compacted = c;
+	return 0;
+}
+
+
+static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
+{
+	zs_pool_stats(pool, stats);
+}
+
 static u64 zs_zpool_total_size(void *pool)
 {
 	return zs_get_total_pages(pool) << PAGE_SHIFT;
@@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
 	.shrink =	zs_zpool_shrink,
 	.map =		zs_zpool_map,
 	.unmap =	zs_zpool_unmap,
+	.compact =	zs_zpool_compact,
+	.stats =	zs_zpool_stats,
 	.total_size =	zs_zpool_total_size,
 };
 
@@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_compact);
 
-void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
+void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
 {
-	memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
+	memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
 }
 EXPORT_SYMBOL_GPL(zs_pool_stats);
 
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 3/3] zram: use common zpool interface
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-14 13:55   ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:55 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

Update zram driver to use common zpool API instead of calling
zsmalloc functions directly. This patch also adds a parameter
that allows for changing underlying compressor storage to zbud.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/Kconfig    |  3 ++-
 drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
 drivers/block/zram/zram_drv.h |  4 ++--
 3 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 386ba3d..4831d0a 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -1,6 +1,7 @@
 config ZRAM
 	tristate "Compressed RAM block device support"
-	depends on BLOCK && SYSFS && ZSMALLOC
+	depends on BLOCK && SYSFS
+	select ZPOOL
 	select LZO_COMPRESS
 	select LZO_DECOMPRESS
 	default n
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 49d5a65..2829c3d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
 static unsigned int num_devices = 1;
 static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
 
+#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
+static char *pool_type = ZRAM_ZPOOL_DEFAULT;
+
 static inline void deprecated_attr_warn(const char *name)
 {
 	pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
@@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
 		struct zram_meta *meta = zram->meta;
-		val = zs_get_total_pages(meta->mem_pool);
+		val = zpool_get_total_size(meta->mem_pool);
 	}
 	up_read(&zram->init_lock);
 
-	return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
 }
 
 static ssize_t mem_limit_show(struct device *dev,
@@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
 	if (init_done(zram)) {
 		struct zram_meta *meta = zram->meta;
 		atomic_long_set(&zram->stats.max_used_pages,
-				zs_get_total_pages(meta->mem_pool));
+			zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
 	}
 	up_read(&zram->init_lock);
 
@@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
 	}
 
 	meta = zram->meta;
-	zs_compact(meta->mem_pool);
+	zpool_compact(meta->mem_pool, NULL);
 	up_read(&zram->init_lock);
 
 	return len;
@@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
-		mem_used = zs_get_total_pages(zram->meta->mem_pool);
-		zs_pool_stats(zram->meta->mem_pool, &pool_stats);
+		mem_used = zpool_get_total_size(zram->meta->mem_pool);
+		zpool_stats(zram->meta->mem_pool, &pool_stats);
 	}
 
 	orig_size = atomic64_read(&zram->stats.pages_stored);
@@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
 			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
 			orig_size << PAGE_SHIFT,
 			(u64)atomic64_read(&zram->stats.compr_data_size),
-			mem_used << PAGE_SHIFT,
+			mem_used,
 			zram->limit_pages << PAGE_SHIFT,
 			max_used << PAGE_SHIFT,
 			(u64)atomic64_read(&zram->stats.zero_pages),
@@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
 		if (!handle)
 			continue;
 
-		zs_free(meta->mem_pool, handle);
+		zpool_free(meta->mem_pool, handle);
 	}
 
-	zs_destroy_pool(meta->mem_pool);
+	zpool_destroy_pool(meta->mem_pool);
 	vfree(meta->table);
 	kfree(meta);
 }
@@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
 		goto out_error;
 	}
 
-	meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
+	meta->mem_pool = zpool_create_pool(pool_type, pool_name,
+			GFP_NOIO | __GFP_HIGHMEM, NULL);
 	if (!meta->mem_pool) {
 		pr_err("Error creating memory pool\n");
 		goto out_error;
@@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
 		return;
 	}
 
-	zs_free(meta->mem_pool, handle);
+	zpool_free(meta->mem_pool, handle);
 
 	atomic64_sub(zram_get_obj_size(meta, index),
 			&zram->stats.compr_data_size);
@@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
 		return 0;
 	}
 
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
+	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
 	if (size == PAGE_SIZE)
 		copy_page(mem, cmem);
 	else
 		ret = zcomp_decompress(zram->comp, cmem, size, mem);
-	zs_unmap_object(meta->mem_pool, handle);
+	zpool_unmap_handle(meta->mem_pool, handle);
 	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
 
 	/* Should NEVER happen. Return bio error if it does. */
@@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 			src = uncmem;
 	}
 
-	handle = zs_malloc(meta->mem_pool, clen);
-	if (!handle) {
+	if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,
+			&handle) != 0) {
 		pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
 			index, clen);
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	alloced_pages = zs_get_total_pages(meta->mem_pool);
+	alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
 	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
-		zs_free(meta->mem_pool, handle);
+		zpool_free(meta->mem_pool, handle);
 		ret = -ENOMEM;
 		goto out;
 	}
 
 	update_used_max(zram, alloced_pages);
 
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
+	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
 
 	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
 		src = kmap_atomic(page);
@@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 
 	zcomp_strm_release(zram->comp, zstrm);
 	zstrm = NULL;
-	zs_unmap_object(meta->mem_pool, handle);
+	zpool_unmap_handle(meta->mem_pool, handle);
 
 	/*
 	 * Free memory associated with this sector
@@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
 MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
 module_param(max_zpage_size, ulong, 0);
 MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
+module_param_named(zpool_type, pool_type, charp, 0444);
+MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 3a29c33..9a64b94 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -16,7 +16,7 @@
 #define _ZRAM_DRV_H_
 
 #include <linux/spinlock.h>
-#include <linux/zsmalloc.h>
+#include <linux/zpool.h>
 
 #include "zcomp.h"
 
@@ -73,7 +73,7 @@ struct zram_stats {
 
 struct zram_meta {
 	struct zram_table_entry *table;
-	struct zs_pool *mem_pool;
+	struct zpool *mem_pool;
 };
 
 struct zram {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH 3/3] zram: use common zpool interface
@ 2015-09-14 13:55   ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 13:55 UTC (permalink / raw)
  To: minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

Update zram driver to use common zpool API instead of calling
zsmalloc functions directly. This patch also adds a parameter
that allows for changing underlying compressor storage to zbud.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
---
 drivers/block/zram/Kconfig    |  3 ++-
 drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
 drivers/block/zram/zram_drv.h |  4 ++--
 3 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 386ba3d..4831d0a 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -1,6 +1,7 @@
 config ZRAM
 	tristate "Compressed RAM block device support"
-	depends on BLOCK && SYSFS && ZSMALLOC
+	depends on BLOCK && SYSFS
+	select ZPOOL
 	select LZO_COMPRESS
 	select LZO_DECOMPRESS
 	default n
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 49d5a65..2829c3d 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
 static unsigned int num_devices = 1;
 static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
 
+#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
+static char *pool_type = ZRAM_ZPOOL_DEFAULT;
+
 static inline void deprecated_attr_warn(const char *name)
 {
 	pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
@@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
 		struct zram_meta *meta = zram->meta;
-		val = zs_get_total_pages(meta->mem_pool);
+		val = zpool_get_total_size(meta->mem_pool);
 	}
 	up_read(&zram->init_lock);
 
-	return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
 }
 
 static ssize_t mem_limit_show(struct device *dev,
@@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
 	if (init_done(zram)) {
 		struct zram_meta *meta = zram->meta;
 		atomic_long_set(&zram->stats.max_used_pages,
-				zs_get_total_pages(meta->mem_pool));
+			zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
 	}
 	up_read(&zram->init_lock);
 
@@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
 	}
 
 	meta = zram->meta;
-	zs_compact(meta->mem_pool);
+	zpool_compact(meta->mem_pool, NULL);
 	up_read(&zram->init_lock);
 
 	return len;
@@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
 
 	down_read(&zram->init_lock);
 	if (init_done(zram)) {
-		mem_used = zs_get_total_pages(zram->meta->mem_pool);
-		zs_pool_stats(zram->meta->mem_pool, &pool_stats);
+		mem_used = zpool_get_total_size(zram->meta->mem_pool);
+		zpool_stats(zram->meta->mem_pool, &pool_stats);
 	}
 
 	orig_size = atomic64_read(&zram->stats.pages_stored);
@@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
 			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
 			orig_size << PAGE_SHIFT,
 			(u64)atomic64_read(&zram->stats.compr_data_size),
-			mem_used << PAGE_SHIFT,
+			mem_used,
 			zram->limit_pages << PAGE_SHIFT,
 			max_used << PAGE_SHIFT,
 			(u64)atomic64_read(&zram->stats.zero_pages),
@@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
 		if (!handle)
 			continue;
 
-		zs_free(meta->mem_pool, handle);
+		zpool_free(meta->mem_pool, handle);
 	}
 
-	zs_destroy_pool(meta->mem_pool);
+	zpool_destroy_pool(meta->mem_pool);
 	vfree(meta->table);
 	kfree(meta);
 }
@@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
 		goto out_error;
 	}
 
-	meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
+	meta->mem_pool = zpool_create_pool(pool_type, pool_name,
+			GFP_NOIO | __GFP_HIGHMEM, NULL);
 	if (!meta->mem_pool) {
 		pr_err("Error creating memory pool\n");
 		goto out_error;
@@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
 		return;
 	}
 
-	zs_free(meta->mem_pool, handle);
+	zpool_free(meta->mem_pool, handle);
 
 	atomic64_sub(zram_get_obj_size(meta, index),
 			&zram->stats.compr_data_size);
@@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
 		return 0;
 	}
 
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
+	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
 	if (size == PAGE_SIZE)
 		copy_page(mem, cmem);
 	else
 		ret = zcomp_decompress(zram->comp, cmem, size, mem);
-	zs_unmap_object(meta->mem_pool, handle);
+	zpool_unmap_handle(meta->mem_pool, handle);
 	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
 
 	/* Should NEVER happen. Return bio error if it does. */
@@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 			src = uncmem;
 	}
 
-	handle = zs_malloc(meta->mem_pool, clen);
-	if (!handle) {
+	if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,
+			&handle) != 0) {
 		pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
 			index, clen);
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	alloced_pages = zs_get_total_pages(meta->mem_pool);
+	alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
 	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
-		zs_free(meta->mem_pool, handle);
+		zpool_free(meta->mem_pool, handle);
 		ret = -ENOMEM;
 		goto out;
 	}
 
 	update_used_max(zram, alloced_pages);
 
-	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
+	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
 
 	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
 		src = kmap_atomic(page);
@@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
 
 	zcomp_strm_release(zram->comp, zstrm);
 	zstrm = NULL;
-	zs_unmap_object(meta->mem_pool, handle);
+	zpool_unmap_handle(meta->mem_pool, handle);
 
 	/*
 	 * Free memory associated with this sector
@@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
 MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
 module_param(max_zpage_size, ulong, 0);
 MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
+module_param_named(zpool_type, pool_type, charp, 0444);
+MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 3a29c33..9a64b94 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -16,7 +16,7 @@
 #define _ZRAM_DRV_H_
 
 #include <linux/spinlock.h>
-#include <linux/zsmalloc.h>
+#include <linux/zpool.h>
 
 #include "zcomp.h"
 
@@ -73,7 +73,7 @@ struct zram_stats {
 
 struct zram_meta {
 	struct zram_table_entry *table;
-	struct zs_pool *mem_pool;
+	struct zpool *mem_pool;
 };
 
 struct zram {
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-14 14:01   ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-14 14:01 UTC (permalink / raw)
  To: Vitaly Wool, minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

On 09/14/2015 03:49 PM, Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with
> KSM,
> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system

I'm sure Minchan will want to hear the details of that :)

> operation. By using zbud I lose in compression ratio but gain in
> determinism, lower latencies and lower fragmentation, so in the coming

I doubt the "lower fragmentation" part given what I've read about the 
design of zbud and zsmalloc?

> patches I tried to generalize what I've done to enable zbud for zram so far.





^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-14 14:01   ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-14 14:01 UTC (permalink / raw)
  To: Vitaly Wool, minchan, sergey.senozhatsky, ddstreet; +Cc: linux-kernel, linux-mm

On 09/14/2015 03:49 PM, Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with
> KSM,
> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system

I'm sure Minchan will want to hear the details of that :)

> operation. By using zbud I lose in compression ratio but gain in
> determinism, lower latencies and lower fragmentation, so in the coming

I doubt the "lower fragmentation" part given what I've read about the 
design of zbud and zsmalloc?

> patches I tried to generalize what I've done to enable zbud for zram so far.




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 14:01   ` Vlastimil Babka
@ 2015-09-14 14:12     ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 14:12 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: minchan, sergey.senozhatsky, ddstreet, LKML, linux-mm

On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>
>> While using ZRAM on a small RAM footprint devices, together with
>> KSM,
>> I ran into several occasions when moving pages from compressed swap back
>> into the "normal" part of RAM caused significant latencies in system
>
>
> I'm sure Minchan will want to hear the details of that :)
>
>> operation. By using zbud I lose in compression ratio but gain in
>> determinism, lower latencies and lower fragmentation, so in the coming
>
>
> I doubt the "lower fragmentation" part given what I've read about the design of zbud and zsmalloc?

As it turns out, I see more cases of compaction kicking in and
significantly more compact_stalls with zsmalloc.

~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-14 14:12     ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-14 14:12 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: minchan, sergey.senozhatsky, ddstreet, LKML, linux-mm

On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>
>> While using ZRAM on a small RAM footprint devices, together with
>> KSM,
>> I ran into several occasions when moving pages from compressed swap back
>> into the "normal" part of RAM caused significant latencies in system
>
>
> I'm sure Minchan will want to hear the details of that :)
>
>> operation. By using zbud I lose in compression ratio but gain in
>> determinism, lower latencies and lower fragmentation, so in the coming
>
>
> I doubt the "lower fragmentation" part given what I've read about the design of zbud and zsmalloc?

As it turns out, I see more cases of compaction kicking in and
significantly more compact_stalls with zsmalloc.

~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 14:12     ` Vitaly Wool
@ 2015-09-14 14:14       ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-14 14:14 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, LKML, linux-mm

On 09/14/2015 04:12 PM, Vitaly Wool wrote:
> On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>>
>>> While using ZRAM on a small RAM footprint devices, together with
>>> KSM,
>>> I ran into several occasions when moving pages from compressed swap back
>>> into the "normal" part of RAM caused significant latencies in system
>>
>>
>> I'm sure Minchan will want to hear the details of that :)
>>
>>> operation. By using zbud I lose in compression ratio but gain in
>>> determinism, lower latencies and lower fragmentation, so in the coming
>>
>>
>> I doubt the "lower fragmentation" part given what I've read about the design of zbud and zsmalloc?
>
> As it turns out, I see more cases of compaction kicking in and
> significantly more compact_stalls with zsmalloc.

Interesting, I thought that zsmalloc doesn't need contiguous high-order 
pages.

> ~vitaly
>


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-14 14:14       ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-14 14:14 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, LKML, linux-mm

On 09/14/2015 04:12 PM, Vitaly Wool wrote:
> On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>>
>>> While using ZRAM on a small RAM footprint devices, together with
>>> KSM,
>>> I ran into several occasions when moving pages from compressed swap back
>>> into the "normal" part of RAM caused significant latencies in system
>>
>>
>> I'm sure Minchan will want to hear the details of that :)
>>
>>> operation. By using zbud I lose in compression ratio but gain in
>>> determinism, lower latencies and lower fragmentation, so in the coming
>>
>>
>> I doubt the "lower fragmentation" part given what I've read about the design of zbud and zsmalloc?
>
> As it turns out, I see more cases of compaction kicking in and
> significantly more compact_stalls with zsmalloc.

Interesting, I thought that zsmalloc doesn't need contiguous high-order 
pages.

> ~vitaly
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-15  0:49   ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  0:49 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:49), Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with KSM,
> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system operation.
> By using zbud I lose in compression ratio but gain in determinism, lower
> latencies and lower fragmentation, so in the coming patches I tried to
> generalize what I've done to enable zbud for zram so far.
> 

do you have CONFIG_PGTABLE_MAPPING enabled or disabled? or
kmap_atomic/memcpy/kunmap_atomic is not the root cause here?
can you provide more details please?

	-ss

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-15  0:49   ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  0:49 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:49), Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with KSM,
> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system operation.
> By using zbud I lose in compression ratio but gain in determinism, lower
> latencies and lower fragmentation, so in the coming patches I tried to
> generalize what I've done to enable zbud for zram so far.
> 

do you have CONFIG_PGTABLE_MAPPING enabled or disabled? or
kmap_atomic/memcpy/kunmap_atomic is not the root cause here?
can you provide more details please?

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-14 13:50   ` Vitaly Wool
@ 2015-09-15  1:00     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:00 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:50), Vitaly Wool wrote:
> It makes sense to have control over what compression ratios are
> ok to store pages uncompressed and what not.

um... I don't want this to be exported. this is very 'zram-internal'.

besides, you remove the exsiting default value
- static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;

so now people must provide this module param in order to make zram
work the way it used to work for years?


> Moreover, if we end up using zbud allocator for zram, any attempt to
> allocate a whole page will fail, so we may want to avoid this as much
> as possible.

so how does it help?

> So, let's have max_zpage_size configurable as a module parameter.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>  drivers/block/zram/zram_drv.h | 16 ----------------
>  2 files changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 9fa15bb..6d9f1d1 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>  
>  /* Module params (documentation at end) */
>  static unsigned int num_devices = 1;
> +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>  
>  static inline void deprecated_attr_warn(const char *name)
>  {
> @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>  		return ret;
>  	}
>  
> +	/*
> +	 * max_zpage_size must be less than or equal to:
> +	 * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> +	 * always return failure.
> +	 */
> +	if (max_zpage_size > PAGE_SIZE) {
> +		pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);

and how do people find out ZS_MAX_ALLOC_SIZE? this error message does not
help.

> +		return -EINVAL;
> +	}
> +
>  	zram_major = register_blkdev(0, "zram");
>  	if (zram_major <= 0) {
>  		pr_err("Unable to get major number\n");
> @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>  
>  module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> +module_param(max_zpage_size, ulong, 0);
> +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");

unclear description.


>  
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 8e92339..3a29c33 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -20,22 +20,6 @@
>  
>  #include "zcomp.h"
>  
> -/*-- Configurable parameters */
> -
> -/*
> - * Pages that compress to size greater than this are stored
> - * uncompressed in memory.
> - */
> -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> -
> -/*
> - * NOTE: max_zpage_size must be less than or equal to:
> - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> - * always return failure.
> - */
> -
> -/*-- End of configurable params */
> -
>  #define SECTOR_SHIFT		9
>  #define SECTORS_PER_PAGE_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
>  #define SECTORS_PER_PAGE	(1 << SECTORS_PER_PAGE_SHIFT)
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
@ 2015-09-15  1:00     ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:00 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:50), Vitaly Wool wrote:
> It makes sense to have control over what compression ratios are
> ok to store pages uncompressed and what not.

um... I don't want this to be exported. this is very 'zram-internal'.

besides, you remove the exsiting default value
- static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;

so now people must provide this module param in order to make zram
work the way it used to work for years?


> Moreover, if we end up using zbud allocator for zram, any attempt to
> allocate a whole page will fail, so we may want to avoid this as much
> as possible.

so how does it help?

> So, let's have max_zpage_size configurable as a module parameter.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>  drivers/block/zram/zram_drv.h | 16 ----------------
>  2 files changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 9fa15bb..6d9f1d1 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>  
>  /* Module params (documentation at end) */
>  static unsigned int num_devices = 1;
> +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>  
>  static inline void deprecated_attr_warn(const char *name)
>  {
> @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>  		return ret;
>  	}
>  
> +	/*
> +	 * max_zpage_size must be less than or equal to:
> +	 * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> +	 * always return failure.
> +	 */
> +	if (max_zpage_size > PAGE_SIZE) {
> +		pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);

and how do people find out ZS_MAX_ALLOC_SIZE? this error message does not
help.

> +		return -EINVAL;
> +	}
> +
>  	zram_major = register_blkdev(0, "zram");
>  	if (zram_major <= 0) {
>  		pr_err("Unable to get major number\n");
> @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>  
>  module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> +module_param(max_zpage_size, ulong, 0);
> +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");

unclear description.


>  
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 8e92339..3a29c33 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -20,22 +20,6 @@
>  
>  #include "zcomp.h"
>  
> -/*-- Configurable parameters */
> -
> -/*
> - * Pages that compress to size greater than this are stored
> - * uncompressed in memory.
> - */
> -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> -
> -/*
> - * NOTE: max_zpage_size must be less than or equal to:
> - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> - * always return failure.
> - */
> -
> -/*-- End of configurable params */
> -
>  #define SECTOR_SHIFT		9
>  #define SECTORS_PER_PAGE_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
>  #define SECTORS_PER_PAGE	(1 << SECTORS_PER_PAGE_SHIFT)
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
  2015-09-14 13:51   ` Vitaly Wool
@ 2015-09-15  1:06     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:06 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:51), Vitaly Wool wrote:
> As a preparation step for zram to be able to use common zpool API,
> there has to be some alignment done on it. This patch adds
> functions that correspond to zsmalloc-specific API to the common
> zpool API and takes care of the callbacks that have to be
> introduced, too.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c |  4 ++--
>  include/linux/zpool.h         | 14 ++++++++++++++
>  include/linux/zsmalloc.h      |  8 ++------
>  mm/zbud.c                     | 12 ++++++++++++
>  mm/zpool.c                    | 22 ++++++++++++++++++++++
>  mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
>  6 files changed, 72 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 6d9f1d1..49d5a65 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
>  {
>  	struct zram *zram = dev_to_zram(dev);
> -	struct zs_pool_stats pool_stats;
> +	struct zpool_stats pool_stats;
>  	u64 orig_size, mem_used = 0;
>  	long max_used;
>  	ssize_t ret;
>  
> -	memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
> +	memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
>  
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> index 42f8ec9..f64cf86 100644
> --- a/include/linux/zpool.h
> +++ b/include/linux/zpool.h
> @@ -17,6 +17,11 @@ struct zpool_ops {
>  	int (*evict)(struct zpool *pool, unsigned long handle);
>  };
>  
> +struct zpool_stats {
> +	/* How many pages were migrated (freed) */
> +	unsigned long pages_compacted;
> +};
> +
>  /*
>   * Control how a handle is mapped.  It will be ignored if the
>   * implementation does not support it.  Its use is optional.
> @@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
>  
>  void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
>  
> +int zpool_compact(struct zpool *pool, unsigned long *compacted);

I see no reason for having this `unsigned long *compacted` param.


> +void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
> +
>  u64 zpool_get_total_size(struct zpool *pool);
>  
>  
> @@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
>   * @shrink:	shrink the pool.
>   * @map:	map a handle.
>   * @unmap:	unmap a handle.
> + * @compact:	try to run compaction for the pool
> + * @stats:	get statistics for the pool
>   * @total_size:	get total size of a pool.
>   *
>   * This is created by a zpool implementation and registered
> @@ -98,6 +109,9 @@ struct zpool_driver {
>  				enum zpool_mapmode mm);
>  	void (*unmap)(void *pool, unsigned long handle);
>  
> +	int (*compact)(void *pool, unsigned long *compacted);
> +	void (*stats)(void *pool, struct zpool_stats *stats);
> +
>  	u64 (*total_size)(void *pool);
>  };
>  
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index 6398dfa..5aee1c7 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -15,6 +15,7 @@
>  #define _ZS_MALLOC_H_
>  
>  #include <linux/types.h>
> +#include <linux/zpool.h>
>  
>  /*
>   * zsmalloc mapping modes
> @@ -34,11 +35,6 @@ enum zs_mapmode {
>  	 */
>  };
>  
> -struct zs_pool_stats {
> -	/* How many pages were migrated (freed) */
> -	unsigned long pages_compacted;
> -};
> -
>  struct zs_pool;
>  
>  struct zs_pool *zs_create_pool(char *name, gfp_t flags);
> @@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>  unsigned long zs_get_total_pages(struct zs_pool *pool);
>  unsigned long zs_compact(struct zs_pool *pool);
>  
> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
>  #endif
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..0963342 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
>  	zbud_unmap(pool, handle);
>  }
>  
> +static int zbud_zpool_compact(void *pool, unsigned long *compacted)
> +{
> +	return -EINVAL;
> +}
> +
> +static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
> +{
> +	/* no-op */
> +}
> +
>  static u64 zbud_zpool_total_size(void *pool)
>  {
>  	return zbud_get_pool_size(pool) * PAGE_SIZE;
> @@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
>  	.shrink =	zbud_zpool_shrink,
>  	.map =		zbud_zpool_map,
>  	.unmap =	zbud_zpool_unmap,
> +	.compact =	zbud_zpool_compact,
> +	.stats =	zbud_zpool_stats,
>  	.total_size =	zbud_zpool_total_size,
>  };
>  
> diff --git a/mm/zpool.c b/mm/zpool.c
> index 8f670d3..15a171a 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
>  }
>  
>  /**
> + * zpool_compact() - try to run compaction over zpool
> + * @pool	The zpool to compact
> + * @compacted	The number of migrated pages
> + *
> + * Returns: 0 on success, error code otherwise
> + */
> +int zpool_compact(struct zpool *zpool, unsigned long *compacted)
> +{
> +	return zpool->driver->compact(zpool->pool, compacted);
> +}
> +
> +/**
> + * zpool_stats() - obtain zpool statistics
> + * @pool	The zpool to get statistics for
> + * @zstats	stats to fill in
> + */
> +void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
> +{
> +	zpool->driver->stats(zpool->pool, zstats);
> +}
> +
> +/**
>   * zpool_get_total_size() - The total size of the pool
>   * @pool	The zpool to check
>   *
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index f135b1b..f002f57 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -245,7 +245,7 @@ struct zs_pool {
>  	gfp_t flags;	/* allocation flags used when growing pool */
>  	atomic_long_t pages_allocated;
>  
> -	struct zs_pool_stats stats;
> +	struct zpool_stats stats;
>  
>  	/* Compact classes */
>  	struct shrinker shrinker;
> @@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
>  	zs_unmap_object(pool, handle);
>  }
>  
> +static int zs_zpool_compact(void *pool, unsigned long *compacted)
> +{
> +	unsigned long c = zs_compact(pool);
> +
> +	if (compacted)
> +		*compacted = c;

why not "return zs_compact(pool)"?


> +	return 0;
> +}
> +
> +
> +static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
> +{
> +	zs_pool_stats(pool, stats);
> +}
> +
>  static u64 zs_zpool_total_size(void *pool)
>  {
>  	return zs_get_total_pages(pool) << PAGE_SHIFT;
> @@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
>  	.shrink =	zs_zpool_shrink,
>  	.map =		zs_zpool_map,
>  	.unmap =	zs_zpool_unmap,
> +	.compact =	zs_zpool_compact,
> +	.stats =	zs_zpool_stats,
>  	.total_size =	zs_zpool_total_size,
>  };
>  
> @@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>  }
>  EXPORT_SYMBOL_GPL(zs_compact);
>  
> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
>  {
> -	memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
> +	memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
>  }
>  EXPORT_SYMBOL_GPL(zs_pool_stats);
>  
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
@ 2015-09-15  1:06     ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:06 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:51), Vitaly Wool wrote:
> As a preparation step for zram to be able to use common zpool API,
> there has to be some alignment done on it. This patch adds
> functions that correspond to zsmalloc-specific API to the common
> zpool API and takes care of the callbacks that have to be
> introduced, too.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c |  4 ++--
>  include/linux/zpool.h         | 14 ++++++++++++++
>  include/linux/zsmalloc.h      |  8 ++------
>  mm/zbud.c                     | 12 ++++++++++++
>  mm/zpool.c                    | 22 ++++++++++++++++++++++
>  mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
>  6 files changed, 72 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 6d9f1d1..49d5a65 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
>  {
>  	struct zram *zram = dev_to_zram(dev);
> -	struct zs_pool_stats pool_stats;
> +	struct zpool_stats pool_stats;
>  	u64 orig_size, mem_used = 0;
>  	long max_used;
>  	ssize_t ret;
>  
> -	memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
> +	memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
>  
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> index 42f8ec9..f64cf86 100644
> --- a/include/linux/zpool.h
> +++ b/include/linux/zpool.h
> @@ -17,6 +17,11 @@ struct zpool_ops {
>  	int (*evict)(struct zpool *pool, unsigned long handle);
>  };
>  
> +struct zpool_stats {
> +	/* How many pages were migrated (freed) */
> +	unsigned long pages_compacted;
> +};
> +
>  /*
>   * Control how a handle is mapped.  It will be ignored if the
>   * implementation does not support it.  Its use is optional.
> @@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
>  
>  void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
>  
> +int zpool_compact(struct zpool *pool, unsigned long *compacted);

I see no reason for having this `unsigned long *compacted` param.


> +void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
> +
>  u64 zpool_get_total_size(struct zpool *pool);
>  
>  
> @@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
>   * @shrink:	shrink the pool.
>   * @map:	map a handle.
>   * @unmap:	unmap a handle.
> + * @compact:	try to run compaction for the pool
> + * @stats:	get statistics for the pool
>   * @total_size:	get total size of a pool.
>   *
>   * This is created by a zpool implementation and registered
> @@ -98,6 +109,9 @@ struct zpool_driver {
>  				enum zpool_mapmode mm);
>  	void (*unmap)(void *pool, unsigned long handle);
>  
> +	int (*compact)(void *pool, unsigned long *compacted);
> +	void (*stats)(void *pool, struct zpool_stats *stats);
> +
>  	u64 (*total_size)(void *pool);
>  };
>  
> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> index 6398dfa..5aee1c7 100644
> --- a/include/linux/zsmalloc.h
> +++ b/include/linux/zsmalloc.h
> @@ -15,6 +15,7 @@
>  #define _ZS_MALLOC_H_
>  
>  #include <linux/types.h>
> +#include <linux/zpool.h>
>  
>  /*
>   * zsmalloc mapping modes
> @@ -34,11 +35,6 @@ enum zs_mapmode {
>  	 */
>  };
>  
> -struct zs_pool_stats {
> -	/* How many pages were migrated (freed) */
> -	unsigned long pages_compacted;
> -};
> -
>  struct zs_pool;
>  
>  struct zs_pool *zs_create_pool(char *name, gfp_t flags);
> @@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>  unsigned long zs_get_total_pages(struct zs_pool *pool);
>  unsigned long zs_compact(struct zs_pool *pool);
>  
> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
>  #endif
> diff --git a/mm/zbud.c b/mm/zbud.c
> index fa48bcdf..0963342 100644
> --- a/mm/zbud.c
> +++ b/mm/zbud.c
> @@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
>  	zbud_unmap(pool, handle);
>  }
>  
> +static int zbud_zpool_compact(void *pool, unsigned long *compacted)
> +{
> +	return -EINVAL;
> +}
> +
> +static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
> +{
> +	/* no-op */
> +}
> +
>  static u64 zbud_zpool_total_size(void *pool)
>  {
>  	return zbud_get_pool_size(pool) * PAGE_SIZE;
> @@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
>  	.shrink =	zbud_zpool_shrink,
>  	.map =		zbud_zpool_map,
>  	.unmap =	zbud_zpool_unmap,
> +	.compact =	zbud_zpool_compact,
> +	.stats =	zbud_zpool_stats,
>  	.total_size =	zbud_zpool_total_size,
>  };
>  
> diff --git a/mm/zpool.c b/mm/zpool.c
> index 8f670d3..15a171a 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
>  }
>  
>  /**
> + * zpool_compact() - try to run compaction over zpool
> + * @pool	The zpool to compact
> + * @compacted	The number of migrated pages
> + *
> + * Returns: 0 on success, error code otherwise
> + */
> +int zpool_compact(struct zpool *zpool, unsigned long *compacted)
> +{
> +	return zpool->driver->compact(zpool->pool, compacted);
> +}
> +
> +/**
> + * zpool_stats() - obtain zpool statistics
> + * @pool	The zpool to get statistics for
> + * @zstats	stats to fill in
> + */
> +void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
> +{
> +	zpool->driver->stats(zpool->pool, zstats);
> +}
> +
> +/**
>   * zpool_get_total_size() - The total size of the pool
>   * @pool	The zpool to check
>   *
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index f135b1b..f002f57 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -245,7 +245,7 @@ struct zs_pool {
>  	gfp_t flags;	/* allocation flags used when growing pool */
>  	atomic_long_t pages_allocated;
>  
> -	struct zs_pool_stats stats;
> +	struct zpool_stats stats;
>  
>  	/* Compact classes */
>  	struct shrinker shrinker;
> @@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
>  	zs_unmap_object(pool, handle);
>  }
>  
> +static int zs_zpool_compact(void *pool, unsigned long *compacted)
> +{
> +	unsigned long c = zs_compact(pool);
> +
> +	if (compacted)
> +		*compacted = c;

why not "return zs_compact(pool)"?


> +	return 0;
> +}
> +
> +
> +static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
> +{
> +	zs_pool_stats(pool, stats);
> +}
> +
>  static u64 zs_zpool_total_size(void *pool)
>  {
>  	return zs_get_total_pages(pool) << PAGE_SHIFT;
> @@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
>  	.shrink =	zs_zpool_shrink,
>  	.map =		zs_zpool_map,
>  	.unmap =	zs_zpool_unmap,
> +	.compact =	zs_zpool_compact,
> +	.stats =	zs_zpool_stats,
>  	.total_size =	zs_zpool_total_size,
>  };
>  
> @@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>  }
>  EXPORT_SYMBOL_GPL(zs_compact);
>  
> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
>  {
> -	memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
> +	memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
>  }
>  EXPORT_SYMBOL_GPL(zs_pool_stats);
>  
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 3/3] zram: use common zpool interface
  2015-09-14 13:55   ` Vitaly Wool
@ 2015-09-15  1:12     ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:12 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:55), Vitaly Wool wrote:
> Update zram driver to use common zpool API instead of calling
> zsmalloc functions directly. This patch also adds a parameter
> that allows for changing underlying compressor storage to zbud.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/Kconfig    |  3 ++-
>  drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
>  drivers/block/zram/zram_drv.h |  4 ++--
>  3 files changed, 29 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
> index 386ba3d..4831d0a 100644
> --- a/drivers/block/zram/Kconfig
> +++ b/drivers/block/zram/Kconfig
> @@ -1,6 +1,7 @@
>  config ZRAM
>  	tristate "Compressed RAM block device support"
> -	depends on BLOCK && SYSFS && ZSMALLOC
> +	depends on BLOCK && SYSFS
> +	select ZPOOL

well, in that case, all `#ifdef CONFIG_ZPOOL' in zsmalloc can be dropped,
because now it's a must have.


>  	select LZO_COMPRESS
>  	select LZO_DECOMPRESS
>  	default n
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 49d5a65..2829c3d 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
>  static unsigned int num_devices = 1;
>  static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>  
> +#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
> +static char *pool_type = ZRAM_ZPOOL_DEFAULT;
> +
>  static inline void deprecated_attr_warn(const char *name)
>  {
>  	pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
> @@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
>  		struct zram_meta *meta = zram->meta;
> -		val = zs_get_total_pages(meta->mem_pool);
> +		val = zpool_get_total_size(meta->mem_pool);
>  	}
>  	up_read(&zram->init_lock);
>  
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>  }
>  
>  static ssize_t mem_limit_show(struct device *dev,
> @@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
>  	if (init_done(zram)) {
>  		struct zram_meta *meta = zram->meta;
>  		atomic_long_set(&zram->stats.max_used_pages,
> -				zs_get_total_pages(meta->mem_pool));
> +			zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
>  	}
>  	up_read(&zram->init_lock);
>  
> @@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
>  	}
>  
>  	meta = zram->meta;
> -	zs_compact(meta->mem_pool);
> +	zpool_compact(meta->mem_pool, NULL);

so you don't use that 'compacted' param.


>  	up_read(&zram->init_lock);
>  
>  	return len;
> @@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
>  
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
> -		mem_used = zs_get_total_pages(zram->meta->mem_pool);
> -		zs_pool_stats(zram->meta->mem_pool, &pool_stats);
> +		mem_used = zpool_get_total_size(zram->meta->mem_pool);
> +		zpool_stats(zram->meta->mem_pool, &pool_stats);
>  	}
>  
>  	orig_size = atomic64_read(&zram->stats.pages_stored);
> @@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
>  			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
>  			orig_size << PAGE_SHIFT,
>  			(u64)atomic64_read(&zram->stats.compr_data_size),
> -			mem_used << PAGE_SHIFT,
> +			mem_used,
>  			zram->limit_pages << PAGE_SHIFT,
>  			max_used << PAGE_SHIFT,
>  			(u64)atomic64_read(&zram->stats.zero_pages),
> @@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>  		if (!handle)
>  			continue;
>  
> -		zs_free(meta->mem_pool, handle);
> +		zpool_free(meta->mem_pool, handle);
>  	}
>  
> -	zs_destroy_pool(meta->mem_pool);
> +	zpool_destroy_pool(meta->mem_pool);
>  	vfree(meta->table);
>  	kfree(meta);
>  }
> @@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
>  		goto out_error;
>  	}
>  
> -	meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
> +	meta->mem_pool = zpool_create_pool(pool_type, pool_name,
> +			GFP_NOIO | __GFP_HIGHMEM, NULL);
>  	if (!meta->mem_pool) {
>  		pr_err("Error creating memory pool\n");
>  		goto out_error;
> @@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
>  		return;
>  	}
>  
> -	zs_free(meta->mem_pool, handle);
> +	zpool_free(meta->mem_pool, handle);
>  
>  	atomic64_sub(zram_get_obj_size(meta, index),
>  			&zram->stats.compr_data_size);
> @@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
>  		return 0;
>  	}
>  
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> +	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
>  	if (size == PAGE_SIZE)
>  		copy_page(mem, cmem);
>  	else
>  		ret = zcomp_decompress(zram->comp, cmem, size, mem);
> -	zs_unmap_object(meta->mem_pool, handle);
> +	zpool_unmap_handle(meta->mem_pool, handle);
>  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>  
>  	/* Should NEVER happen. Return bio error if it does. */
> @@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  			src = uncmem;
>  	}
>  
> -	handle = zs_malloc(meta->mem_pool, clen);
> -	if (!handle) {
> +	if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,

hm, GFP_NOIO and __GFP_IO...

> +			&handle) != 0) {
>  		pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
>  			index, clen);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
>  
> -	alloced_pages = zs_get_total_pages(meta->mem_pool);
> +	alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
>  	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
> -		zs_free(meta->mem_pool, handle);
> +		zpool_free(meta->mem_pool, handle);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
>  
>  	update_used_max(zram, alloced_pages);
>  
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> +	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
>  
>  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
>  		src = kmap_atomic(page);
> @@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  
>  	zcomp_strm_release(zram->comp, zstrm);
>  	zstrm = NULL;
> -	zs_unmap_object(meta->mem_pool, handle);
> +	zpool_unmap_handle(meta->mem_pool, handle);
>  
>  	/*
>  	 * Free memory associated with this sector
> @@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>  module_param(max_zpage_size, ulong, 0);
>  MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
> +module_param_named(zpool_type, pool_type, charp, 0444);
> +MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
>  
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 3a29c33..9a64b94 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -16,7 +16,7 @@
>  #define _ZRAM_DRV_H_
>  
>  #include <linux/spinlock.h>
> -#include <linux/zsmalloc.h>
> +#include <linux/zpool.h>
>  
>  #include "zcomp.h"
>  
> @@ -73,7 +73,7 @@ struct zram_stats {
>  
>  struct zram_meta {
>  	struct zram_table_entry *table;
> -	struct zs_pool *mem_pool;
> +	struct zpool *mem_pool;
>  };
>  
>  struct zram {
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 3/3] zram: use common zpool interface
@ 2015-09-15  1:12     ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  1:12 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: minchan, sergey.senozhatsky, ddstreet, linux-kernel, linux-mm

On (09/14/15 15:55), Vitaly Wool wrote:
> Update zram driver to use common zpool API instead of calling
> zsmalloc functions directly. This patch also adds a parameter
> that allows for changing underlying compressor storage to zbud.
> 
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/Kconfig    |  3 ++-
>  drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
>  drivers/block/zram/zram_drv.h |  4 ++--
>  3 files changed, 29 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
> index 386ba3d..4831d0a 100644
> --- a/drivers/block/zram/Kconfig
> +++ b/drivers/block/zram/Kconfig
> @@ -1,6 +1,7 @@
>  config ZRAM
>  	tristate "Compressed RAM block device support"
> -	depends on BLOCK && SYSFS && ZSMALLOC
> +	depends on BLOCK && SYSFS
> +	select ZPOOL

well, in that case, all `#ifdef CONFIG_ZPOOL' in zsmalloc can be dropped,
because now it's a must have.


>  	select LZO_COMPRESS
>  	select LZO_DECOMPRESS
>  	default n
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 49d5a65..2829c3d 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
>  static unsigned int num_devices = 1;
>  static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>  
> +#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
> +static char *pool_type = ZRAM_ZPOOL_DEFAULT;
> +
>  static inline void deprecated_attr_warn(const char *name)
>  {
>  	pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
> @@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
>  		struct zram_meta *meta = zram->meta;
> -		val = zs_get_total_pages(meta->mem_pool);
> +		val = zpool_get_total_size(meta->mem_pool);
>  	}
>  	up_read(&zram->init_lock);
>  
> -	return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
> +	return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>  }
>  
>  static ssize_t mem_limit_show(struct device *dev,
> @@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
>  	if (init_done(zram)) {
>  		struct zram_meta *meta = zram->meta;
>  		atomic_long_set(&zram->stats.max_used_pages,
> -				zs_get_total_pages(meta->mem_pool));
> +			zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
>  	}
>  	up_read(&zram->init_lock);
>  
> @@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
>  	}
>  
>  	meta = zram->meta;
> -	zs_compact(meta->mem_pool);
> +	zpool_compact(meta->mem_pool, NULL);

so you don't use that 'compacted' param.


>  	up_read(&zram->init_lock);
>  
>  	return len;
> @@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
>  
>  	down_read(&zram->init_lock);
>  	if (init_done(zram)) {
> -		mem_used = zs_get_total_pages(zram->meta->mem_pool);
> -		zs_pool_stats(zram->meta->mem_pool, &pool_stats);
> +		mem_used = zpool_get_total_size(zram->meta->mem_pool);
> +		zpool_stats(zram->meta->mem_pool, &pool_stats);
>  	}
>  
>  	orig_size = atomic64_read(&zram->stats.pages_stored);
> @@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
>  			"%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
>  			orig_size << PAGE_SHIFT,
>  			(u64)atomic64_read(&zram->stats.compr_data_size),
> -			mem_used << PAGE_SHIFT,
> +			mem_used,
>  			zram->limit_pages << PAGE_SHIFT,
>  			max_used << PAGE_SHIFT,
>  			(u64)atomic64_read(&zram->stats.zero_pages),
> @@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>  		if (!handle)
>  			continue;
>  
> -		zs_free(meta->mem_pool, handle);
> +		zpool_free(meta->mem_pool, handle);
>  	}
>  
> -	zs_destroy_pool(meta->mem_pool);
> +	zpool_destroy_pool(meta->mem_pool);
>  	vfree(meta->table);
>  	kfree(meta);
>  }
> @@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
>  		goto out_error;
>  	}
>  
> -	meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
> +	meta->mem_pool = zpool_create_pool(pool_type, pool_name,
> +			GFP_NOIO | __GFP_HIGHMEM, NULL);
>  	if (!meta->mem_pool) {
>  		pr_err("Error creating memory pool\n");
>  		goto out_error;
> @@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
>  		return;
>  	}
>  
> -	zs_free(meta->mem_pool, handle);
> +	zpool_free(meta->mem_pool, handle);
>  
>  	atomic64_sub(zram_get_obj_size(meta, index),
>  			&zram->stats.compr_data_size);
> @@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
>  		return 0;
>  	}
>  
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
> +	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
>  	if (size == PAGE_SIZE)
>  		copy_page(mem, cmem);
>  	else
>  		ret = zcomp_decompress(zram->comp, cmem, size, mem);
> -	zs_unmap_object(meta->mem_pool, handle);
> +	zpool_unmap_handle(meta->mem_pool, handle);
>  	bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>  
>  	/* Should NEVER happen. Return bio error if it does. */
> @@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  			src = uncmem;
>  	}
>  
> -	handle = zs_malloc(meta->mem_pool, clen);
> -	if (!handle) {
> +	if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,

hm, GFP_NOIO and __GFP_IO...

> +			&handle) != 0) {
>  		pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
>  			index, clen);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
>  
> -	alloced_pages = zs_get_total_pages(meta->mem_pool);
> +	alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
>  	if (zram->limit_pages && alloced_pages > zram->limit_pages) {
> -		zs_free(meta->mem_pool, handle);
> +		zpool_free(meta->mem_pool, handle);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
>  
>  	update_used_max(zram, alloced_pages);
>  
> -	cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> +	cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
>  
>  	if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
>  		src = kmap_atomic(page);
> @@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>  
>  	zcomp_strm_release(zram->comp, zstrm);
>  	zstrm = NULL;
> -	zs_unmap_object(meta->mem_pool, handle);
> +	zpool_unmap_handle(meta->mem_pool, handle);
>  
>  	/*
>  	 * Free memory associated with this sector
> @@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>  module_param(max_zpage_size, ulong, 0);
>  MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
> +module_param_named(zpool_type, pool_type, charp, 0444);
> +MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
>  
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 3a29c33..9a64b94 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -16,7 +16,7 @@
>  #define _ZRAM_DRV_H_
>  
>  #include <linux/spinlock.h>
> -#include <linux/zsmalloc.h>
> +#include <linux/zpool.h>
>  
>  #include "zcomp.h"
>  
> @@ -73,7 +73,7 @@ struct zram_stats {
>  
>  struct zram_meta {
>  	struct zram_table_entry *table;
> -	struct zs_pool *mem_pool;
> +	struct zpool *mem_pool;
>  };
>  
>  struct zram {
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 14:14       ` Vlastimil Babka
@ 2015-09-15  4:08         ` Dan Streetman
  -1 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  4:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, LKML, Linux-MM

On Mon, Sep 14, 2015 at 10:14 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/14/2015 04:12 PM, Vitaly Wool wrote:
>>
>> On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>>
>>> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>>>
>>>>
>>>> While using ZRAM on a small RAM footprint devices, together with
>>>> KSM,
>>>> I ran into several occasions when moving pages from compressed swap back
>>>> into the "normal" part of RAM caused significant latencies in system
>>>
>>>
>>>
>>> I'm sure Minchan will want to hear the details of that :)
>>>
>>>> operation. By using zbud I lose in compression ratio but gain in
>>>> determinism, lower latencies and lower fragmentation, so in the coming
>>>
>>>
>>>
>>> I doubt the "lower fragmentation" part given what I've read about the
>>> design of zbud and zsmalloc?
>>
>>
>> As it turns out, I see more cases of compaction kicking in and
>> significantly more compact_stalls with zsmalloc.
>
>
> Interesting, I thought that zsmalloc doesn't need contiguous high-order
> pages.

it doesn't.  but it has a complex (compared to zbud) way of storing
pages - many different classes, which each are made up of zspages,
which contain multiple actual pages to store some number of
specifically sized objects.  So it can get fragmented, with lots of
zspages with empty spaces for objects.  That's what the recently added
zsmalloc compaction addresses, by scanning all the zspages in all the
classes and compacting zspages within each class.

but I haven't followed most of the recent zsmalloc updates too
closely, so I may be totally wrong :-)

zbud is much simpler; since it just uses buddied pairs, it simply
keeps a list of zbud page with only 1 compressed page stored in it.
There is still the possibility of fragmentation, but since it's
simple, it's much smaller.  And there is no compaction implemented in
it, currently.  The downside, as we all know, is worse efficiency in
storing compressed pages - it can't do better than 2:1.

>
>> ~vitaly
>>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-15  4:08         ` Dan Streetman
  0 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  4:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, LKML, Linux-MM

On Mon, Sep 14, 2015 at 10:14 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/14/2015 04:12 PM, Vitaly Wool wrote:
>>
>> On Mon, Sep 14, 2015 at 4:01 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>>
>>> On 09/14/2015 03:49 PM, Vitaly Wool wrote:
>>>>
>>>>
>>>> While using ZRAM on a small RAM footprint devices, together with
>>>> KSM,
>>>> I ran into several occasions when moving pages from compressed swap back
>>>> into the "normal" part of RAM caused significant latencies in system
>>>
>>>
>>>
>>> I'm sure Minchan will want to hear the details of that :)
>>>
>>>> operation. By using zbud I lose in compression ratio but gain in
>>>> determinism, lower latencies and lower fragmentation, so in the coming
>>>
>>>
>>>
>>> I doubt the "lower fragmentation" part given what I've read about the
>>> design of zbud and zsmalloc?
>>
>>
>> As it turns out, I see more cases of compaction kicking in and
>> significantly more compact_stalls with zsmalloc.
>
>
> Interesting, I thought that zsmalloc doesn't need contiguous high-order
> pages.

it doesn't.  but it has a complex (compared to zbud) way of storing
pages - many different classes, which each are made up of zspages,
which contain multiple actual pages to store some number of
specifically sized objects.  So it can get fragmented, with lots of
zspages with empty spaces for objects.  That's what the recently added
zsmalloc compaction addresses, by scanning all the zspages in all the
classes and compacting zspages within each class.

but I haven't followed most of the recent zsmalloc updates too
closely, so I may be totally wrong :-)

zbud is much simpler; since it just uses buddied pairs, it simply
keeps a list of zbud page with only 1 compressed page stored in it.
There is still the possibility of fragmentation, but since it's
simple, it's much smaller.  And there is no compaction implemented in
it, currently.  The downside, as we all know, is worse efficiency in
storing compressed pages - it can't do better than 2:1.

>
>> ~vitaly
>>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-15  4:08         ` Dan Streetman
@ 2015-09-15  4:22           ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  4:22 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Vlastimil Babka, Vitaly Wool, Minchan Kim, Sergey Senozhatsky,
	LKML, Linux-MM

On (09/15/15 00:08), Dan Streetman wrote:
[..]
> 
> it doesn't.  but it has a complex (compared to zbud) way of storing
> pages - many different classes, which each are made up of zspages,
> which contain multiple actual pages to store some number of
> specifically sized objects.  So it can get fragmented, with lots of
> zspages with empty spaces for objects.  That's what the recently added
> zsmalloc compaction addresses, by scanning all the zspages in all the
> classes and compacting zspages within each class.
> 

correct. a bit of internals: we don't scan all the zspages every
time. each class has stats for allocated used objects, allocated
used objects, etc. so we 'compact' only classes that can be
compacted:

 static unsigned long zs_can_compact(struct size_class *class)
 {
         unsigned long obj_wasted;
 
         obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
                 zs_stat_get(class, OBJ_USED);
 
         obj_wasted /= get_maxobj_per_zspage(class->size,
                         class->pages_per_zspage);
 
         return obj_wasted * class->pages_per_zspage;
 }

if we can free any zspages (which is at least one page), then we
attempt to do so.

is compaction the root cause of the symptoms Vitaly observe?


Vitaly, if you disable compaction:

---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 14fc466..d9b5427 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1944,8 +1944,9 @@ struct zs_pool *zs_create_pool(char *name, gfp_t flags)
         * Not critical, we still can use the pool
         * and user can trigger compaction manually.
         */
-       if (zs_register_shrinker(pool) == 0)
+/*     if (zs_register_shrinker(pool) == 0)
                pool->shrinker_enabled = true;
+*/
        return pool;
 
 err:


---

does the 'problem' go away?


> but I haven't followed most of the recent zsmalloc updates too
> closely, so I may be totally wrong :-)
> 
> zbud is much simpler; since it just uses buddied pairs, it simply
> keeps a list of zbud page with only 1 compressed page stored in it.
> There is still the possibility of fragmentation, but since it's
> simple, it's much smaller.  And there is no compaction implemented in
> it, currently.  The downside, as we all know, is worse efficiency in
> storing compressed pages - it can't do better than 2:1.
> 

	-ss

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-15  4:22           ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  4:22 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Vlastimil Babka, Vitaly Wool, Minchan Kim, Sergey Senozhatsky,
	LKML, Linux-MM

On (09/15/15 00:08), Dan Streetman wrote:
[..]
> 
> it doesn't.  but it has a complex (compared to zbud) way of storing
> pages - many different classes, which each are made up of zspages,
> which contain multiple actual pages to store some number of
> specifically sized objects.  So it can get fragmented, with lots of
> zspages with empty spaces for objects.  That's what the recently added
> zsmalloc compaction addresses, by scanning all the zspages in all the
> classes and compacting zspages within each class.
> 

correct. a bit of internals: we don't scan all the zspages every
time. each class has stats for allocated used objects, allocated
used objects, etc. so we 'compact' only classes that can be
compacted:

 static unsigned long zs_can_compact(struct size_class *class)
 {
         unsigned long obj_wasted;
 
         obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
                 zs_stat_get(class, OBJ_USED);
 
         obj_wasted /= get_maxobj_per_zspage(class->size,
                         class->pages_per_zspage);
 
         return obj_wasted * class->pages_per_zspage;
 }

if we can free any zspages (which is at least one page), then we
attempt to do so.

is compaction the root cause of the symptoms Vitaly observe?


Vitaly, if you disable compaction:

---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 14fc466..d9b5427 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1944,8 +1944,9 @@ struct zs_pool *zs_create_pool(char *name, gfp_t flags)
         * Not critical, we still can use the pool
         * and user can trigger compaction manually.
         */
-       if (zs_register_shrinker(pool) == 0)
+/*     if (zs_register_shrinker(pool) == 0)
                pool->shrinker_enabled = true;
+*/
        return pool;
 
 err:


---

does the 'problem' go away?


> but I haven't followed most of the recent zsmalloc updates too
> closely, so I may be totally wrong :-)
> 
> zbud is much simpler; since it just uses buddied pairs, it simply
> keeps a list of zbud page with only 1 compressed page stored in it.
> There is still the possibility of fragmentation, but since it's
> simple, it's much smaller.  And there is no compaction implemented in
> it, currently.  The downside, as we all know, is worse efficiency in
> storing compressed pages - it can't do better than 2:1.
> 

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
  2015-09-15  1:06     ` Sergey Senozhatsky
@ 2015-09-15  5:09       ` Dan Streetman
  -1 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  5:09 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On Mon, Sep 14, 2015 at 9:06 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (09/14/15 15:51), Vitaly Wool wrote:
>> As a preparation step for zram to be able to use common zpool API,
>> there has to be some alignment done on it. This patch adds
>> functions that correspond to zsmalloc-specific API to the common
>> zpool API and takes care of the callbacks that have to be
>> introduced, too.
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  drivers/block/zram/zram_drv.c |  4 ++--
>>  include/linux/zpool.h         | 14 ++++++++++++++
>>  include/linux/zsmalloc.h      |  8 ++------
>>  mm/zbud.c                     | 12 ++++++++++++
>>  mm/zpool.c                    | 22 ++++++++++++++++++++++
>>  mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
>>  6 files changed, 72 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index 6d9f1d1..49d5a65 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
>>               struct device_attribute *attr, char *buf)
>>  {
>>       struct zram *zram = dev_to_zram(dev);
>> -     struct zs_pool_stats pool_stats;
>> +     struct zpool_stats pool_stats;
>>       u64 orig_size, mem_used = 0;
>>       long max_used;
>>       ssize_t ret;
>>
>> -     memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
>> +     memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
>>
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
>> index 42f8ec9..f64cf86 100644
>> --- a/include/linux/zpool.h
>> +++ b/include/linux/zpool.h
>> @@ -17,6 +17,11 @@ struct zpool_ops {
>>       int (*evict)(struct zpool *pool, unsigned long handle);
>>  };
>>
>> +struct zpool_stats {
>> +     /* How many pages were migrated (freed) */
>> +     unsigned long pages_compacted;
>> +};
>> +
>>  /*
>>   * Control how a handle is mapped.  It will be ignored if the
>>   * implementation does not support it.  Its use is optional.
>> @@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
>>
>>  void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
>>
>> +int zpool_compact(struct zpool *pool, unsigned long *compacted);
>
> I see no reason for having this `unsigned long *compacted` param.
>
>
>> +void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
>> +
>>  u64 zpool_get_total_size(struct zpool *pool);
>>
>>
>> @@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
>>   * @shrink:  shrink the pool.
>>   * @map:     map a handle.
>>   * @unmap:   unmap a handle.
>> + * @compact: try to run compaction for the pool
>> + * @stats:   get statistics for the pool
>>   * @total_size:      get total size of a pool.
>>   *
>>   * This is created by a zpool implementation and registered
>> @@ -98,6 +109,9 @@ struct zpool_driver {
>>                               enum zpool_mapmode mm);
>>       void (*unmap)(void *pool, unsigned long handle);
>>
>> +     int (*compact)(void *pool, unsigned long *compacted);
>> +     void (*stats)(void *pool, struct zpool_stats *stats);
>> +
>>       u64 (*total_size)(void *pool);
>>  };
>>
>> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
>> index 6398dfa..5aee1c7 100644
>> --- a/include/linux/zsmalloc.h
>> +++ b/include/linux/zsmalloc.h
>> @@ -15,6 +15,7 @@
>>  #define _ZS_MALLOC_H_
>>
>>  #include <linux/types.h>
>> +#include <linux/zpool.h>
>>
>>  /*
>>   * zsmalloc mapping modes
>> @@ -34,11 +35,6 @@ enum zs_mapmode {
>>        */
>>  };
>>
>> -struct zs_pool_stats {
>> -     /* How many pages were migrated (freed) */
>> -     unsigned long pages_compacted;
>> -};
>> -
>>  struct zs_pool;
>>
>>  struct zs_pool *zs_create_pool(char *name, gfp_t flags);
>> @@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>>  unsigned long zs_get_total_pages(struct zs_pool *pool);
>>  unsigned long zs_compact(struct zs_pool *pool);
>>
>> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
>> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
>>  #endif
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..0963342 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
>>       zbud_unmap(pool, handle);
>>  }
>>
>> +static int zbud_zpool_compact(void *pool, unsigned long *compacted)
>> +{
>> +     return -EINVAL;
>> +}
>> +
>> +static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
>> +{
>> +     /* no-op */
>> +}
>> +
>>  static u64 zbud_zpool_total_size(void *pool)
>>  {
>>       return zbud_get_pool_size(pool) * PAGE_SIZE;
>> @@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
>>       .shrink =       zbud_zpool_shrink,
>>       .map =          zbud_zpool_map,
>>       .unmap =        zbud_zpool_unmap,
>> +     .compact =      zbud_zpool_compact,
>> +     .stats =        zbud_zpool_stats,
>>       .total_size =   zbud_zpool_total_size,
>>  };
>>
>> diff --git a/mm/zpool.c b/mm/zpool.c
>> index 8f670d3..15a171a 100644
>> --- a/mm/zpool.c
>> +++ b/mm/zpool.c
>> @@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
>>  }
>>
>>  /**
>> + * zpool_compact() - try to run compaction over zpool
>> + * @pool     The zpool to compact
>> + * @compacted        The number of migrated pages
>> + *
>> + * Returns: 0 on success, error code otherwise
>> + */
>> +int zpool_compact(struct zpool *zpool, unsigned long *compacted)
>> +{
>> +     return zpool->driver->compact(zpool->pool, compacted);
>> +}
>> +
>> +/**
>> + * zpool_stats() - obtain zpool statistics
>> + * @pool     The zpool to get statistics for
>> + * @zstats   stats to fill in
>> + */
>> +void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
>> +{
>> +     zpool->driver->stats(zpool->pool, zstats);
>> +}
>> +
>> +/**
>>   * zpool_get_total_size() - The total size of the pool
>>   * @pool     The zpool to check
>>   *
>> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
>> index f135b1b..f002f57 100644
>> --- a/mm/zsmalloc.c
>> +++ b/mm/zsmalloc.c
>> @@ -245,7 +245,7 @@ struct zs_pool {
>>       gfp_t flags;    /* allocation flags used when growing pool */
>>       atomic_long_t pages_allocated;
>>
>> -     struct zs_pool_stats stats;
>> +     struct zpool_stats stats;
>>
>>       /* Compact classes */
>>       struct shrinker shrinker;
>> @@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
>>       zs_unmap_object(pool, handle);
>>  }
>>
>> +static int zs_zpool_compact(void *pool, unsigned long *compacted)
>> +{
>> +     unsigned long c = zs_compact(pool);
>> +
>> +     if (compacted)
>> +             *compacted = c;
>
> why not "return zs_compact(pool)"?

I agree it should just return zs_compact, and zbud can just always
return 0.  Any failures in compaction are going to be entirely
internal (besides an invalid pool, but there's bigger problems if that
happens), and there's nothing the caller can do with an error return
code.

Other than that, the patch looks fine to me, you can add (with that change)

Acked-by: Dan Streetman <ddstreet@ieee.org>

>
>
>> +     return 0;
>> +}
>> +
>> +
>> +static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
>> +{
>> +     zs_pool_stats(pool, stats);
>> +}
>> +
>>  static u64 zs_zpool_total_size(void *pool)
>>  {
>>       return zs_get_total_pages(pool) << PAGE_SHIFT;
>> @@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
>>       .shrink =       zs_zpool_shrink,
>>       .map =          zs_zpool_map,
>>       .unmap =        zs_zpool_unmap,
>> +     .compact =      zs_zpool_compact,
>> +     .stats =        zs_zpool_stats,
>>       .total_size =   zs_zpool_total_size,
>>  };
>>
>> @@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>>  }
>>  EXPORT_SYMBOL_GPL(zs_compact);
>>
>> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
>> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
>>  {
>> -     memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
>> +     memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
>>  }
>>  EXPORT_SYMBOL_GPL(zs_pool_stats);
>>
>> --
>> 1.9.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces
@ 2015-09-15  5:09       ` Dan Streetman
  0 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  5:09 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On Mon, Sep 14, 2015 at 9:06 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (09/14/15 15:51), Vitaly Wool wrote:
>> As a preparation step for zram to be able to use common zpool API,
>> there has to be some alignment done on it. This patch adds
>> functions that correspond to zsmalloc-specific API to the common
>> zpool API and takes care of the callbacks that have to be
>> introduced, too.
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  drivers/block/zram/zram_drv.c |  4 ++--
>>  include/linux/zpool.h         | 14 ++++++++++++++
>>  include/linux/zsmalloc.h      |  8 ++------
>>  mm/zbud.c                     | 12 ++++++++++++
>>  mm/zpool.c                    | 22 ++++++++++++++++++++++
>>  mm/zsmalloc.c                 | 23 ++++++++++++++++++++---
>>  6 files changed, 72 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index 6d9f1d1..49d5a65 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -427,12 +427,12 @@ static ssize_t mm_stat_show(struct device *dev,
>>               struct device_attribute *attr, char *buf)
>>  {
>>       struct zram *zram = dev_to_zram(dev);
>> -     struct zs_pool_stats pool_stats;
>> +     struct zpool_stats pool_stats;
>>       u64 orig_size, mem_used = 0;
>>       long max_used;
>>       ssize_t ret;
>>
>> -     memset(&pool_stats, 0x00, sizeof(struct zs_pool_stats));
>> +     memset(&pool_stats, 0x00, sizeof(struct zpool_stats));
>>
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
>> index 42f8ec9..f64cf86 100644
>> --- a/include/linux/zpool.h
>> +++ b/include/linux/zpool.h
>> @@ -17,6 +17,11 @@ struct zpool_ops {
>>       int (*evict)(struct zpool *pool, unsigned long handle);
>>  };
>>
>> +struct zpool_stats {
>> +     /* How many pages were migrated (freed) */
>> +     unsigned long pages_compacted;
>> +};
>> +
>>  /*
>>   * Control how a handle is mapped.  It will be ignored if the
>>   * implementation does not support it.  Its use is optional.
>> @@ -58,6 +63,10 @@ void *zpool_map_handle(struct zpool *pool, unsigned long handle,
>>
>>  void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
>>
>> +int zpool_compact(struct zpool *pool, unsigned long *compacted);
>
> I see no reason for having this `unsigned long *compacted` param.
>
>
>> +void zpool_stats(struct zpool *pool, struct zpool_stats *zstats);
>> +
>>  u64 zpool_get_total_size(struct zpool *pool);
>>
>>
>> @@ -72,6 +81,8 @@ u64 zpool_get_total_size(struct zpool *pool);
>>   * @shrink:  shrink the pool.
>>   * @map:     map a handle.
>>   * @unmap:   unmap a handle.
>> + * @compact: try to run compaction for the pool
>> + * @stats:   get statistics for the pool
>>   * @total_size:      get total size of a pool.
>>   *
>>   * This is created by a zpool implementation and registered
>> @@ -98,6 +109,9 @@ struct zpool_driver {
>>                               enum zpool_mapmode mm);
>>       void (*unmap)(void *pool, unsigned long handle);
>>
>> +     int (*compact)(void *pool, unsigned long *compacted);
>> +     void (*stats)(void *pool, struct zpool_stats *stats);
>> +
>>       u64 (*total_size)(void *pool);
>>  };
>>
>> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
>> index 6398dfa..5aee1c7 100644
>> --- a/include/linux/zsmalloc.h
>> +++ b/include/linux/zsmalloc.h
>> @@ -15,6 +15,7 @@
>>  #define _ZS_MALLOC_H_
>>
>>  #include <linux/types.h>
>> +#include <linux/zpool.h>
>>
>>  /*
>>   * zsmalloc mapping modes
>> @@ -34,11 +35,6 @@ enum zs_mapmode {
>>        */
>>  };
>>
>> -struct zs_pool_stats {
>> -     /* How many pages were migrated (freed) */
>> -     unsigned long pages_compacted;
>> -};
>> -
>>  struct zs_pool;
>>
>>  struct zs_pool *zs_create_pool(char *name, gfp_t flags);
>> @@ -54,5 +50,5 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
>>  unsigned long zs_get_total_pages(struct zs_pool *pool);
>>  unsigned long zs_compact(struct zs_pool *pool);
>>
>> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
>> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats);
>>  #endif
>> diff --git a/mm/zbud.c b/mm/zbud.c
>> index fa48bcdf..0963342 100644
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -195,6 +195,16 @@ static void zbud_zpool_unmap(void *pool, unsigned long handle)
>>       zbud_unmap(pool, handle);
>>  }
>>
>> +static int zbud_zpool_compact(void *pool, unsigned long *compacted)
>> +{
>> +     return -EINVAL;
>> +}
>> +
>> +static void zbud_zpool_stats(void *pool, struct zpool_stats *stats)
>> +{
>> +     /* no-op */
>> +}
>> +
>>  static u64 zbud_zpool_total_size(void *pool)
>>  {
>>       return zbud_get_pool_size(pool) * PAGE_SIZE;
>> @@ -210,6 +220,8 @@ static struct zpool_driver zbud_zpool_driver = {
>>       .shrink =       zbud_zpool_shrink,
>>       .map =          zbud_zpool_map,
>>       .unmap =        zbud_zpool_unmap,
>> +     .compact =      zbud_zpool_compact,
>> +     .stats =        zbud_zpool_stats,
>>       .total_size =   zbud_zpool_total_size,
>>  };
>>
>> diff --git a/mm/zpool.c b/mm/zpool.c
>> index 8f670d3..15a171a 100644
>> --- a/mm/zpool.c
>> +++ b/mm/zpool.c
>> @@ -341,6 +341,28 @@ void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
>>  }
>>
>>  /**
>> + * zpool_compact() - try to run compaction over zpool
>> + * @pool     The zpool to compact
>> + * @compacted        The number of migrated pages
>> + *
>> + * Returns: 0 on success, error code otherwise
>> + */
>> +int zpool_compact(struct zpool *zpool, unsigned long *compacted)
>> +{
>> +     return zpool->driver->compact(zpool->pool, compacted);
>> +}
>> +
>> +/**
>> + * zpool_stats() - obtain zpool statistics
>> + * @pool     The zpool to get statistics for
>> + * @zstats   stats to fill in
>> + */
>> +void zpool_stats(struct zpool *zpool, struct zpool_stats *zstats)
>> +{
>> +     zpool->driver->stats(zpool->pool, zstats);
>> +}
>> +
>> +/**
>>   * zpool_get_total_size() - The total size of the pool
>>   * @pool     The zpool to check
>>   *
>> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
>> index f135b1b..f002f57 100644
>> --- a/mm/zsmalloc.c
>> +++ b/mm/zsmalloc.c
>> @@ -245,7 +245,7 @@ struct zs_pool {
>>       gfp_t flags;    /* allocation flags used when growing pool */
>>       atomic_long_t pages_allocated;
>>
>> -     struct zs_pool_stats stats;
>> +     struct zpool_stats stats;
>>
>>       /* Compact classes */
>>       struct shrinker shrinker;
>> @@ -365,6 +365,21 @@ static void zs_zpool_unmap(void *pool, unsigned long handle)
>>       zs_unmap_object(pool, handle);
>>  }
>>
>> +static int zs_zpool_compact(void *pool, unsigned long *compacted)
>> +{
>> +     unsigned long c = zs_compact(pool);
>> +
>> +     if (compacted)
>> +             *compacted = c;
>
> why not "return zs_compact(pool)"?

I agree it should just return zs_compact, and zbud can just always
return 0.  Any failures in compaction are going to be entirely
internal (besides an invalid pool, but there's bigger problems if that
happens), and there's nothing the caller can do with an error return
code.

Other than that, the patch looks fine to me, you can add (with that change)

Acked-by: Dan Streetman <ddstreet@ieee.org>

>
>
>> +     return 0;
>> +}
>> +
>> +
>> +static void zs_zpool_stats(void *pool, struct zpool_stats *stats)
>> +{
>> +     zs_pool_stats(pool, stats);
>> +}
>> +
>>  static u64 zs_zpool_total_size(void *pool)
>>  {
>>       return zs_get_total_pages(pool) << PAGE_SHIFT;
>> @@ -380,6 +395,8 @@ static struct zpool_driver zs_zpool_driver = {
>>       .shrink =       zs_zpool_shrink,
>>       .map =          zs_zpool_map,
>>       .unmap =        zs_zpool_unmap,
>> +     .compact =      zs_zpool_compact,
>> +     .stats =        zs_zpool_stats,
>>       .total_size =   zs_zpool_total_size,
>>  };
>>
>> @@ -1789,9 +1806,9 @@ unsigned long zs_compact(struct zs_pool *pool)
>>  }
>>  EXPORT_SYMBOL_GPL(zs_compact);
>>
>> -void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats)
>> +void zs_pool_stats(struct zs_pool *pool, struct zpool_stats *stats)
>>  {
>> -     memcpy(stats, &pool->stats, sizeof(struct zs_pool_stats));
>> +     memcpy(stats, &pool->stats, sizeof(struct zpool_stats));
>>  }
>>  EXPORT_SYMBOL_GPL(zs_pool_stats);
>>
>> --
>> 1.9.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-14 13:50   ` Vitaly Wool
@ 2015-09-15  5:42     ` Dan Streetman
  -1 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  5:42 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Minchan Kim, Sergey Senozhatsky, linux-kernel, Linux-MM

On Mon, Sep 14, 2015 at 9:50 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> It makes sense to have control over what compression ratios are
> ok to store pages uncompressed and what not. Moreover, if we end
> up using zbud allocator for zram, any attempt to allocate a whole
> page will fail, so we may want to avoid this as much as possible.

Well, zram explicitly expects to be able to store PAGE_SIZE'd objects:

if (unlikely(clen > max_zpage_size))
    clen = PAGE_SIZE;
handle = zs_malloc(meta->mem_pool, clen);

so the max_zpage_size doesn't prevent zram from trying to store the
page in zsmalloc/zbud/whatever; instead, if the compressed page is
larger than max_zpage_size, it just stores it uncompressed (as a side
note, I'm not quite sure what the benefit of not storing in compressed
form any pages that compress to between 3/4 and 1 page is...I suppose
the decompression time is skipped, but it also wastes space...i would
just make max_zpage_size == PAGE_SIZE).

but zbud can't store a PAGE_SIZE'd object.  so the behavior would
change.  The current behavior is:

compressed page <= max_zpage_size : stored compressed
compressed page > max_zpage_size : stored uncompressed

new behavior:

compressed page <= max_zpage_size : stored compressed
compressed page > max_zpage_size : zram write fails

to do this right, I think you have to change zbud to be able to store
PAGE_SIZE'd objects.  That should be doable, I think you can just the
page->lru to store it in the zbud lru, and use a page flag to indicate
it's uncompressed, full PAGE_SIZE page, or something like that.  But
without the ability to store full pages, zbud won't work well with
zram.

>
> So, let's have max_zpage_size configurable as a module parameter.
>
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>  drivers/block/zram/zram_drv.h | 16 ----------------
>  2 files changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 9fa15bb..6d9f1d1 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>
>  /* Module params (documentation at end) */
>  static unsigned int num_devices = 1;
> +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>
>  static inline void deprecated_attr_warn(const char *name)
>  {
> @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>                 return ret;
>         }
>
> +       /*
> +        * max_zpage_size must be less than or equal to:
> +        * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> +        * always return failure.
> +        */
> +       if (max_zpage_size > PAGE_SIZE) {
> +               pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
> +               return -EINVAL;
> +       }
> +
>         zram_major = register_blkdev(0, "zram");
>         if (zram_major <= 0) {
>                 pr_err("Unable to get major number\n");
> @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>
>  module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> +module_param(max_zpage_size, ulong, 0);
> +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
>
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 8e92339..3a29c33 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -20,22 +20,6 @@
>
>  #include "zcomp.h"
>
> -/*-- Configurable parameters */
> -
> -/*
> - * Pages that compress to size greater than this are stored
> - * uncompressed in memory.
> - */
> -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> -
> -/*
> - * NOTE: max_zpage_size must be less than or equal to:
> - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> - * always return failure.
> - */
> -
> -/*-- End of configurable params */
> -
>  #define SECTOR_SHIFT           9
>  #define SECTORS_PER_PAGE_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
>  #define SECTORS_PER_PAGE       (1 << SECTORS_PER_PAGE_SHIFT)
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
@ 2015-09-15  5:42     ` Dan Streetman
  0 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  5:42 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Minchan Kim, Sergey Senozhatsky, linux-kernel, Linux-MM

On Mon, Sep 14, 2015 at 9:50 AM, Vitaly Wool <vitalywool@gmail.com> wrote:
> It makes sense to have control over what compression ratios are
> ok to store pages uncompressed and what not. Moreover, if we end
> up using zbud allocator for zram, any attempt to allocate a whole
> page will fail, so we may want to avoid this as much as possible.

Well, zram explicitly expects to be able to store PAGE_SIZE'd objects:

if (unlikely(clen > max_zpage_size))
    clen = PAGE_SIZE;
handle = zs_malloc(meta->mem_pool, clen);

so the max_zpage_size doesn't prevent zram from trying to store the
page in zsmalloc/zbud/whatever; instead, if the compressed page is
larger than max_zpage_size, it just stores it uncompressed (as a side
note, I'm not quite sure what the benefit of not storing in compressed
form any pages that compress to between 3/4 and 1 page is...I suppose
the decompression time is skipped, but it also wastes space...i would
just make max_zpage_size == PAGE_SIZE).

but zbud can't store a PAGE_SIZE'd object.  so the behavior would
change.  The current behavior is:

compressed page <= max_zpage_size : stored compressed
compressed page > max_zpage_size : stored uncompressed

new behavior:

compressed page <= max_zpage_size : stored compressed
compressed page > max_zpage_size : zram write fails

to do this right, I think you have to change zbud to be able to store
PAGE_SIZE'd objects.  That should be doable, I think you can just the
page->lru to store it in the zbud lru, and use a page flag to indicate
it's uncompressed, full PAGE_SIZE page, or something like that.  But
without the ability to store full pages, zbud won't work well with
zram.

>
> So, let's have max_zpage_size configurable as a module parameter.
>
> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> ---
>  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>  drivers/block/zram/zram_drv.h | 16 ----------------
>  2 files changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 9fa15bb..6d9f1d1 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>
>  /* Module params (documentation at end) */
>  static unsigned int num_devices = 1;
> +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>
>  static inline void deprecated_attr_warn(const char *name)
>  {
> @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>                 return ret;
>         }
>
> +       /*
> +        * max_zpage_size must be less than or equal to:
> +        * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> +        * always return failure.
> +        */
> +       if (max_zpage_size > PAGE_SIZE) {
> +               pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
> +               return -EINVAL;
> +       }
> +
>         zram_major = register_blkdev(0, "zram");
>         if (zram_major <= 0) {
>                 pr_err("Unable to get major number\n");
> @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>
>  module_param(num_devices, uint, 0);
>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> +module_param(max_zpage_size, ulong, 0);
> +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
>
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 8e92339..3a29c33 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -20,22 +20,6 @@
>
>  #include "zcomp.h"
>
> -/*-- Configurable parameters */
> -
> -/*
> - * Pages that compress to size greater than this are stored
> - * uncompressed in memory.
> - */
> -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> -
> -/*
> - * NOTE: max_zpage_size must be less than or equal to:
> - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> - * always return failure.
> - */
> -
> -/*-- End of configurable params */
> -
>  #define SECTOR_SHIFT           9
>  #define SECTORS_PER_PAGE_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
>  #define SECTORS_PER_PAGE       (1 << SECTORS_PER_PAGE_SHIFT)
> --
> 1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 3/3] zram: use common zpool interface
  2015-09-15  1:12     ` Sergey Senozhatsky
@ 2015-09-15  6:03       ` Dan Streetman
  -1 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  6:03 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On Mon, Sep 14, 2015 at 9:12 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (09/14/15 15:55), Vitaly Wool wrote:
>> Update zram driver to use common zpool API instead of calling
>> zsmalloc functions directly. This patch also adds a parameter
>> that allows for changing underlying compressor storage to zbud.
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  drivers/block/zram/Kconfig    |  3 ++-
>>  drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
>>  drivers/block/zram/zram_drv.h |  4 ++--
>>  3 files changed, 29 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
>> index 386ba3d..4831d0a 100644
>> --- a/drivers/block/zram/Kconfig
>> +++ b/drivers/block/zram/Kconfig
>> @@ -1,6 +1,7 @@
>>  config ZRAM
>>       tristate "Compressed RAM block device support"
>> -     depends on BLOCK && SYSFS && ZSMALLOC
>> +     depends on BLOCK && SYSFS
>> +     select ZPOOL
>
> well, in that case, all `#ifdef CONFIG_ZPOOL' in zsmalloc can be dropped,
> because now it's a must have.

yeah, it could be dropped from zbud as well.

>
>
>>       select LZO_COMPRESS
>>       select LZO_DECOMPRESS
>>       default n
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index 49d5a65..2829c3d 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
>>  static unsigned int num_devices = 1;
>>  static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>>
>> +#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
>> +static char *pool_type = ZRAM_ZPOOL_DEFAULT;
>> +
>>  static inline void deprecated_attr_warn(const char *name)
>>  {
>>       pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
>> @@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>>               struct zram_meta *meta = zram->meta;
>> -             val = zs_get_total_pages(meta->mem_pool);
>> +             val = zpool_get_total_size(meta->mem_pool);
>>       }
>>       up_read(&zram->init_lock);
>>
>> -     return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
>> +     return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>>  }
>>
>>  static ssize_t mem_limit_show(struct device *dev,
>> @@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
>>       if (init_done(zram)) {
>>               struct zram_meta *meta = zram->meta;
>>               atomic_long_set(&zram->stats.max_used_pages,
>> -                             zs_get_total_pages(meta->mem_pool));
>> +                     zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
>>       }
>>       up_read(&zram->init_lock);
>>
>> @@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
>>       }
>>
>>       meta = zram->meta;
>> -     zs_compact(meta->mem_pool);
>> +     zpool_compact(meta->mem_pool, NULL);
>
> so you don't use that 'compacted' param.
>
>
>>       up_read(&zram->init_lock);
>>
>>       return len;
>> @@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
>>
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>> -             mem_used = zs_get_total_pages(zram->meta->mem_pool);
>> -             zs_pool_stats(zram->meta->mem_pool, &pool_stats);
>> +             mem_used = zpool_get_total_size(zram->meta->mem_pool);
>> +             zpool_stats(zram->meta->mem_pool, &pool_stats);
>>       }
>>
>>       orig_size = atomic64_read(&zram->stats.pages_stored);
>> @@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
>>                       "%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
>>                       orig_size << PAGE_SHIFT,
>>                       (u64)atomic64_read(&zram->stats.compr_data_size),
>> -                     mem_used << PAGE_SHIFT,
>> +                     mem_used,
>>                       zram->limit_pages << PAGE_SHIFT,
>>                       max_used << PAGE_SHIFT,
>>                       (u64)atomic64_read(&zram->stats.zero_pages),
>> @@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>>               if (!handle)
>>                       continue;
>>
>> -             zs_free(meta->mem_pool, handle);
>> +             zpool_free(meta->mem_pool, handle);
>>       }
>>
>> -     zs_destroy_pool(meta->mem_pool);
>> +     zpool_destroy_pool(meta->mem_pool);
>>       vfree(meta->table);
>>       kfree(meta);
>>  }
>> @@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
>>               goto out_error;
>>       }
>>
>> -     meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
>> +     meta->mem_pool = zpool_create_pool(pool_type, pool_name,
>> +                     GFP_NOIO | __GFP_HIGHMEM, NULL);
>>       if (!meta->mem_pool) {
>>               pr_err("Error creating memory pool\n");
>>               goto out_error;
>> @@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
>>               return;
>>       }
>>
>> -     zs_free(meta->mem_pool, handle);
>> +     zpool_free(meta->mem_pool, handle);
>>
>>       atomic64_sub(zram_get_obj_size(meta, index),
>>                       &zram->stats.compr_data_size);
>> @@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
>>               return 0;
>>       }
>>
>> -     cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
>> +     cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
>>       if (size == PAGE_SIZE)
>>               copy_page(mem, cmem);
>>       else
>>               ret = zcomp_decompress(zram->comp, cmem, size, mem);
>> -     zs_unmap_object(meta->mem_pool, handle);
>> +     zpool_unmap_handle(meta->mem_pool, handle);
>>       bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>>
>>       /* Should NEVER happen. Return bio error if it does. */
>> @@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>>                       src = uncmem;
>>       }
>>
>> -     handle = zs_malloc(meta->mem_pool, clen);
>> -     if (!handle) {
>> +     if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,
>
> hm, GFP_NOIO and __GFP_IO...

NOWARN isn't needed; zswap uses it but zram doesn't need to, in fact
zram probably does want a nomem warning printed since it does return
the error to the caller who is writing to zram.

with zsmalloc, the gfp flags are what was passed at pool creation
time, so this should be __GFP_NOIO | __GFP_HIGHMEM.  However, zbud
doesn't allow alloc with highmem.  What probably is best here is to
use NOIO and HIGHMEM from zram, and change zbud to just strip out
HIGHMEM if it's set, instead of returning error.  Although if zram
actually requires using highmem, that would be a problem.

>
>> +                     &handle) != 0) {
>>               pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
>>                       index, clen);
>>               ret = -ENOMEM;
>>               goto out;
>>       }
>>
>> -     alloced_pages = zs_get_total_pages(meta->mem_pool);
>> +     alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
>>       if (zram->limit_pages && alloced_pages > zram->limit_pages) {
>> -             zs_free(meta->mem_pool, handle);
>> +             zpool_free(meta->mem_pool, handle);
>>               ret = -ENOMEM;
>>               goto out;
>>       }
>>
>>       update_used_max(zram, alloced_pages);
>>
>> -     cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
>> +     cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
>>
>>       if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
>>               src = kmap_atomic(page);
>> @@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>>
>>       zcomp_strm_release(zram->comp, zstrm);
>>       zstrm = NULL;
>> -     zs_unmap_object(meta->mem_pool, handle);
>> +     zpool_unmap_handle(meta->mem_pool, handle);
>>
>>       /*
>>        * Free memory associated with this sector
>> @@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
>>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>>  module_param(max_zpage_size, ulong, 0);
>>  MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
>> +module_param_named(zpool_type, pool_type, charp, 0444);
>> +MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
>>
>>  MODULE_LICENSE("Dual BSD/GPL");
>>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
>> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>> index 3a29c33..9a64b94 100644
>> --- a/drivers/block/zram/zram_drv.h
>> +++ b/drivers/block/zram/zram_drv.h
>> @@ -16,7 +16,7 @@
>>  #define _ZRAM_DRV_H_
>>
>>  #include <linux/spinlock.h>
>> -#include <linux/zsmalloc.h>
>> +#include <linux/zpool.h>
>>
>>  #include "zcomp.h"
>>
>> @@ -73,7 +73,7 @@ struct zram_stats {
>>
>>  struct zram_meta {
>>       struct zram_table_entry *table;
>> -     struct zs_pool *mem_pool;
>> +     struct zpool *mem_pool;
>>  };
>>
>>  struct zram {
>> --
>> 1.9.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 3/3] zram: use common zpool interface
@ 2015-09-15  6:03       ` Dan Streetman
  0 siblings, 0 replies; 63+ messages in thread
From: Dan Streetman @ 2015-09-15  6:03 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On Mon, Sep 14, 2015 at 9:12 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (09/14/15 15:55), Vitaly Wool wrote:
>> Update zram driver to use common zpool API instead of calling
>> zsmalloc functions directly. This patch also adds a parameter
>> that allows for changing underlying compressor storage to zbud.
>>
>> Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
>> ---
>>  drivers/block/zram/Kconfig    |  3 ++-
>>  drivers/block/zram/zram_drv.c | 44 ++++++++++++++++++++++++-------------------
>>  drivers/block/zram/zram_drv.h |  4 ++--
>>  3 files changed, 29 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
>> index 386ba3d..4831d0a 100644
>> --- a/drivers/block/zram/Kconfig
>> +++ b/drivers/block/zram/Kconfig
>> @@ -1,6 +1,7 @@
>>  config ZRAM
>>       tristate "Compressed RAM block device support"
>> -     depends on BLOCK && SYSFS && ZSMALLOC
>> +     depends on BLOCK && SYSFS
>> +     select ZPOOL
>
> well, in that case, all `#ifdef CONFIG_ZPOOL' in zsmalloc can be dropped,
> because now it's a must have.

yeah, it could be dropped from zbud as well.

>
>
>>       select LZO_COMPRESS
>>       select LZO_DECOMPRESS
>>       default n
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index 49d5a65..2829c3d 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -44,6 +44,9 @@ static const char *default_compressor = "lzo";
>>  static unsigned int num_devices = 1;
>>  static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>>
>> +#define ZRAM_ZPOOL_DEFAULT "zsmalloc"
>> +static char *pool_type = ZRAM_ZPOOL_DEFAULT;
>> +
>>  static inline void deprecated_attr_warn(const char *name)
>>  {
>>       pr_warn_once("%d (%s) Attribute %s (and others) will be removed. %s\n",
>> @@ -229,11 +232,11 @@ static ssize_t mem_used_total_show(struct device *dev,
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>>               struct zram_meta *meta = zram->meta;
>> -             val = zs_get_total_pages(meta->mem_pool);
>> +             val = zpool_get_total_size(meta->mem_pool);
>>       }
>>       up_read(&zram->init_lock);
>>
>> -     return scnprintf(buf, PAGE_SIZE, "%llu\n", val << PAGE_SHIFT);
>> +     return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
>>  }
>>
>>  static ssize_t mem_limit_show(struct device *dev,
>> @@ -298,7 +301,7 @@ static ssize_t mem_used_max_store(struct device *dev,
>>       if (init_done(zram)) {
>>               struct zram_meta *meta = zram->meta;
>>               atomic_long_set(&zram->stats.max_used_pages,
>> -                             zs_get_total_pages(meta->mem_pool));
>> +                     zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT);
>>       }
>>       up_read(&zram->init_lock);
>>
>> @@ -399,7 +402,7 @@ static ssize_t compact_store(struct device *dev,
>>       }
>>
>>       meta = zram->meta;
>> -     zs_compact(meta->mem_pool);
>> +     zpool_compact(meta->mem_pool, NULL);
>
> so you don't use that 'compacted' param.
>
>
>>       up_read(&zram->init_lock);
>>
>>       return len;
>> @@ -436,8 +439,8 @@ static ssize_t mm_stat_show(struct device *dev,
>>
>>       down_read(&zram->init_lock);
>>       if (init_done(zram)) {
>> -             mem_used = zs_get_total_pages(zram->meta->mem_pool);
>> -             zs_pool_stats(zram->meta->mem_pool, &pool_stats);
>> +             mem_used = zpool_get_total_size(zram->meta->mem_pool);
>> +             zpool_stats(zram->meta->mem_pool, &pool_stats);
>>       }
>>
>>       orig_size = atomic64_read(&zram->stats.pages_stored);
>> @@ -447,7 +450,7 @@ static ssize_t mm_stat_show(struct device *dev,
>>                       "%8llu %8llu %8llu %8lu %8ld %8llu %8lu\n",
>>                       orig_size << PAGE_SHIFT,
>>                       (u64)atomic64_read(&zram->stats.compr_data_size),
>> -                     mem_used << PAGE_SHIFT,
>> +                     mem_used,
>>                       zram->limit_pages << PAGE_SHIFT,
>>                       max_used << PAGE_SHIFT,
>>                       (u64)atomic64_read(&zram->stats.zero_pages),
>> @@ -492,10 +495,10 @@ static void zram_meta_free(struct zram_meta *meta, u64 disksize)
>>               if (!handle)
>>                       continue;
>>
>> -             zs_free(meta->mem_pool, handle);
>> +             zpool_free(meta->mem_pool, handle);
>>       }
>>
>> -     zs_destroy_pool(meta->mem_pool);
>> +     zpool_destroy_pool(meta->mem_pool);
>>       vfree(meta->table);
>>       kfree(meta);
>>  }
>> @@ -515,7 +518,8 @@ static struct zram_meta *zram_meta_alloc(char *pool_name, u64 disksize)
>>               goto out_error;
>>       }
>>
>> -     meta->mem_pool = zs_create_pool(pool_name, GFP_NOIO | __GFP_HIGHMEM);
>> +     meta->mem_pool = zpool_create_pool(pool_type, pool_name,
>> +                     GFP_NOIO | __GFP_HIGHMEM, NULL);
>>       if (!meta->mem_pool) {
>>               pr_err("Error creating memory pool\n");
>>               goto out_error;
>> @@ -551,7 +555,7 @@ static void zram_free_page(struct zram *zram, size_t index)
>>               return;
>>       }
>>
>> -     zs_free(meta->mem_pool, handle);
>> +     zpool_free(meta->mem_pool, handle);
>>
>>       atomic64_sub(zram_get_obj_size(meta, index),
>>                       &zram->stats.compr_data_size);
>> @@ -579,12 +583,12 @@ static int zram_decompress_page(struct zram *zram, char *mem, u32 index)
>>               return 0;
>>       }
>>
>> -     cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_RO);
>> +     cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_RO);
>>       if (size == PAGE_SIZE)
>>               copy_page(mem, cmem);
>>       else
>>               ret = zcomp_decompress(zram->comp, cmem, size, mem);
>> -     zs_unmap_object(meta->mem_pool, handle);
>> +     zpool_unmap_handle(meta->mem_pool, handle);
>>       bit_spin_unlock(ZRAM_ACCESS, &meta->table[index].value);
>>
>>       /* Should NEVER happen. Return bio error if it does. */
>> @@ -718,24 +722,24 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>>                       src = uncmem;
>>       }
>>
>> -     handle = zs_malloc(meta->mem_pool, clen);
>> -     if (!handle) {
>> +     if (zpool_malloc(meta->mem_pool, clen, __GFP_IO | __GFP_NOWARN,
>
> hm, GFP_NOIO and __GFP_IO...

NOWARN isn't needed; zswap uses it but zram doesn't need to, in fact
zram probably does want a nomem warning printed since it does return
the error to the caller who is writing to zram.

with zsmalloc, the gfp flags are what was passed at pool creation
time, so this should be __GFP_NOIO | __GFP_HIGHMEM.  However, zbud
doesn't allow alloc with highmem.  What probably is best here is to
use NOIO and HIGHMEM from zram, and change zbud to just strip out
HIGHMEM if it's set, instead of returning error.  Although if zram
actually requires using highmem, that would be a problem.

>
>> +                     &handle) != 0) {
>>               pr_err("Error allocating memory for compressed page: %u, size=%zu\n",
>>                       index, clen);
>>               ret = -ENOMEM;
>>               goto out;
>>       }
>>
>> -     alloced_pages = zs_get_total_pages(meta->mem_pool);
>> +     alloced_pages = zpool_get_total_size(meta->mem_pool) >> PAGE_SHIFT;
>>       if (zram->limit_pages && alloced_pages > zram->limit_pages) {
>> -             zs_free(meta->mem_pool, handle);
>> +             zpool_free(meta->mem_pool, handle);
>>               ret = -ENOMEM;
>>               goto out;
>>       }
>>
>>       update_used_max(zram, alloced_pages);
>>
>> -     cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
>> +     cmem = zpool_map_handle(meta->mem_pool, handle, ZPOOL_MM_WO);
>>
>>       if ((clen == PAGE_SIZE) && !is_partial_io(bvec)) {
>>               src = kmap_atomic(page);
>> @@ -747,7 +751,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
>>
>>       zcomp_strm_release(zram->comp, zstrm);
>>       zstrm = NULL;
>> -     zs_unmap_object(meta->mem_pool, handle);
>> +     zpool_unmap_handle(meta->mem_pool, handle);
>>
>>       /*
>>        * Free memory associated with this sector
>> @@ -1457,6 +1461,8 @@ module_param(num_devices, uint, 0);
>>  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>>  module_param(max_zpage_size, ulong, 0);
>>  MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
>> +module_param_named(zpool_type, pool_type, charp, 0444);
>> +MODULE_PARM_DESC(zpool_type, "zpool implementation selection (zsmalloc vs zbud)");
>>
>>  MODULE_LICENSE("Dual BSD/GPL");
>>  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
>> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>> index 3a29c33..9a64b94 100644
>> --- a/drivers/block/zram/zram_drv.h
>> +++ b/drivers/block/zram/zram_drv.h
>> @@ -16,7 +16,7 @@
>>  #define _ZRAM_DRV_H_
>>
>>  #include <linux/spinlock.h>
>> -#include <linux/zsmalloc.h>
>> +#include <linux/zpool.h>
>>
>>  #include "zcomp.h"
>>
>> @@ -73,7 +73,7 @@ struct zram_stats {
>>
>>  struct zram_meta {
>>       struct zram_table_entry *table;
>> -     struct zs_pool *mem_pool;
>> +     struct zpool *mem_pool;
>>  };
>>
>>  struct zram {
>> --
>> 1.9.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-15  5:42     ` Dan Streetman
@ 2015-09-15  6:08       ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  6:08 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On (09/15/15 01:42), Dan Streetman wrote:
> Well, zram explicitly expects to be able to store PAGE_SIZE'd objects:
> 
> if (unlikely(clen > max_zpage_size))
>     clen = PAGE_SIZE;
> handle = zs_malloc(meta->mem_pool, clen);
> 
> so the max_zpage_size doesn't prevent zram from trying to store the
> page in zsmalloc/zbud/whatever; instead, if the compressed page is
> larger than max_zpage_size, it just stores it uncompressed (as a side
> note, I'm not quite sure what the benefit of not storing in compressed
> form any pages that compress to between 3/4 and 1 page is...I suppose
> the decompression time is skipped, but it also wastes space...i would
> just make max_zpage_size == PAGE_SIZE).

correct, to avoid decompression of something that doesn't really
save any memory. compressed buffer may be very close to PAGE_SIZE,
so it sort of makes sense. I agree, that (3/4, 1 page] looks like
a magically picked range, though.

> 
> but zbud can't store a PAGE_SIZE'd object.  so the behavior would
> change.  The current behavior is:
> 
> compressed page <= max_zpage_size : stored compressed
> compressed page > max_zpage_size : stored uncompressed
> 
> new behavior:
> 
> compressed page <= max_zpage_size : stored compressed
> compressed page > max_zpage_size : zram write fails

yes. and per my observation, `compressed page > max_zpage_size'
happens "quite often".

	-ss

> to do this right, I think you have to change zbud to be able to store
> PAGE_SIZE'd objects.  That should be doable, I think you can just the
> page->lru to store it in the zbud lru, and use a page flag to indicate
> it's uncompressed, full PAGE_SIZE page, or something like that.  But
> without the ability to store full pages, zbud won't work well with
> zram.
> 
> >
> > So, let's have max_zpage_size configurable as a module parameter.
> >
> > Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> > ---
> >  drivers/block/zram/zram_drv.c | 13 +++++++++++++
> >  drivers/block/zram/zram_drv.h | 16 ----------------
> >  2 files changed, 13 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 9fa15bb..6d9f1d1 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
> >
> >  /* Module params (documentation at end) */
> >  static unsigned int num_devices = 1;
> > +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> >
> >  static inline void deprecated_attr_warn(const char *name)
> >  {
> > @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
> >                 return ret;
> >         }
> >
> > +       /*
> > +        * max_zpage_size must be less than or equal to:
> > +        * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > +        * always return failure.
> > +        */
> > +       if (max_zpage_size > PAGE_SIZE) {
> > +               pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
> > +               return -EINVAL;
> > +       }
> > +
> >         zram_major = register_blkdev(0, "zram");
> >         if (zram_major <= 0) {
> >                 pr_err("Unable to get major number\n");
> > @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
> >
> >  module_param(num_devices, uint, 0);
> >  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> > +module_param(max_zpage_size, ulong, 0);
> > +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
> >
> >  MODULE_LICENSE("Dual BSD/GPL");
> >  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index 8e92339..3a29c33 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -20,22 +20,6 @@
> >
> >  #include "zcomp.h"
> >
> > -/*-- Configurable parameters */
> > -
> > -/*
> > - * Pages that compress to size greater than this are stored
> > - * uncompressed in memory.
> > - */
> > -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> > -
> > -/*
> > - * NOTE: max_zpage_size must be less than or equal to:
> > - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > - * always return failure.
> > - */
> > -
> > -/*-- End of configurable params */
> > -
> >  #define SECTOR_SHIFT           9
> >  #define SECTORS_PER_PAGE_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
> >  #define SECTORS_PER_PAGE       (1 << SECTORS_PER_PAGE_SHIFT)
> > --
> > 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
@ 2015-09-15  6:08       ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  6:08 UTC (permalink / raw)
  To: Dan Streetman
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, linux-kernel,
	Linux-MM

On (09/15/15 01:42), Dan Streetman wrote:
> Well, zram explicitly expects to be able to store PAGE_SIZE'd objects:
> 
> if (unlikely(clen > max_zpage_size))
>     clen = PAGE_SIZE;
> handle = zs_malloc(meta->mem_pool, clen);
> 
> so the max_zpage_size doesn't prevent zram from trying to store the
> page in zsmalloc/zbud/whatever; instead, if the compressed page is
> larger than max_zpage_size, it just stores it uncompressed (as a side
> note, I'm not quite sure what the benefit of not storing in compressed
> form any pages that compress to between 3/4 and 1 page is...I suppose
> the decompression time is skipped, but it also wastes space...i would
> just make max_zpage_size == PAGE_SIZE).

correct, to avoid decompression of something that doesn't really
save any memory. compressed buffer may be very close to PAGE_SIZE,
so it sort of makes sense. I agree, that (3/4, 1 page] looks like
a magically picked range, though.

> 
> but zbud can't store a PAGE_SIZE'd object.  so the behavior would
> change.  The current behavior is:
> 
> compressed page <= max_zpage_size : stored compressed
> compressed page > max_zpage_size : stored uncompressed
> 
> new behavior:
> 
> compressed page <= max_zpage_size : stored compressed
> compressed page > max_zpage_size : zram write fails

yes. and per my observation, `compressed page > max_zpage_size'
happens "quite often".

	-ss

> to do this right, I think you have to change zbud to be able to store
> PAGE_SIZE'd objects.  That should be doable, I think you can just the
> page->lru to store it in the zbud lru, and use a page flag to indicate
> it's uncompressed, full PAGE_SIZE page, or something like that.  But
> without the ability to store full pages, zbud won't work well with
> zram.
> 
> >
> > So, let's have max_zpage_size configurable as a module parameter.
> >
> > Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> > ---
> >  drivers/block/zram/zram_drv.c | 13 +++++++++++++
> >  drivers/block/zram/zram_drv.h | 16 ----------------
> >  2 files changed, 13 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index 9fa15bb..6d9f1d1 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
> >
> >  /* Module params (documentation at end) */
> >  static unsigned int num_devices = 1;
> > +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> >
> >  static inline void deprecated_attr_warn(const char *name)
> >  {
> > @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
> >                 return ret;
> >         }
> >
> > +       /*
> > +        * max_zpage_size must be less than or equal to:
> > +        * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > +        * always return failure.
> > +        */
> > +       if (max_zpage_size > PAGE_SIZE) {
> > +               pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
> > +               return -EINVAL;
> > +       }
> > +
> >         zram_major = register_blkdev(0, "zram");
> >         if (zram_major <= 0) {
> >                 pr_err("Unable to get major number\n");
> > @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
> >
> >  module_param(num_devices, uint, 0);
> >  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> > +module_param(max_zpage_size, ulong, 0);
> > +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed pages");
> >
> >  MODULE_LICENSE("Dual BSD/GPL");
> >  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index 8e92339..3a29c33 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -20,22 +20,6 @@
> >
> >  #include "zcomp.h"
> >
> > -/*-- Configurable parameters */
> > -
> > -/*
> > - * Pages that compress to size greater than this are stored
> > - * uncompressed in memory.
> > - */
> > -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> > -
> > -/*
> > - * NOTE: max_zpage_size must be less than or equal to:
> > - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > - * always return failure.
> > - */
> > -
> > -/*-- End of configurable params */
> > -
> >  #define SECTOR_SHIFT           9
> >  #define SECTORS_PER_PAGE_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
> >  #define SECTORS_PER_PAGE       (1 << SECTORS_PER_PAGE_SHIFT)
> > --
> > 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-14 13:49 ` Vitaly Wool
@ 2015-09-15  6:13   ` Minchan Kim
  -1 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-15  6:13 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: sergey.senozhatsky, ddstreet, linux-kernel, linux-mm,
	김준수, Gioh Kim

Hello Vitaly,

It seems you sent a mail with gmail web or something which didn't use
plain-text. ;-). I will add newline manually.
Please send a mail with plain-text in future.

On Mon, Sep 14, 2015 at 03:49:01PM +0200, Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with KSM,

KSM? Is there any reason you mentioned *KSM* in this context?
IOW, if you don't use KSM, you couldn't see a problem?

> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system operation.

What kernel version did you use? Did you enable CMA? ION?
What was major factor for the latencies?

Decompress? zsmalloc-compaction overhead? rmap overheads?
compaction overheads?
There are several potential culprits.
It would be very helpful if you provide some numbers(perf will help you).

> By using zbud I lose in compression ratio but gain in determinism,
> lower latencies and lower fragmentation, so in the coming patches
> I tried to generalize what I've done to enable zbud for zram so far.

Before that, I'd like to know what is root cause.
>From my side, I had an similar experience.
At that time, problem was that *compaction* which triggered to reclaim
lots of page cache pages. The reason compaction triggered a lot was
fragmentation caused by zsmalloc, GPU and high-order allocation
request by SLUB and somethings(ex, ION, fork).

Recently, Joonsoo fixed SLUB side.
http://marc.info/?l=linux-kernel&m=143891343223853&w=2

And we added zram-auto-compaction recently so zram try to compact
objects to reduce memory usage. It might be helpful for fragment
problem as side effect but please keep it mind that it would be opposite.
Currently, zram-auto-compaction is not aware of virtual memory compaction
so as worst case, zsmalloc can spread out pinned object into movable
pageblocks via zsmalloc-compaction.
Gioh and I try to solve the issue with below patches but is pending now
by other urgent works.
https://lwn.net/Articles/650917/
https://lkml.org/lkml/2015/8/10/90

In summary, we need to clarify what's the root cause before diving into
code and hiding it.

Thanks.

> 
> -- 
> Vitaly Wool <vitalywool@gmail.com>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-15  6:13   ` Minchan Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-15  6:13 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: sergey.senozhatsky, ddstreet, linux-kernel, linux-mm,
	김준수, Gioh Kim

Hello Vitaly,

It seems you sent a mail with gmail web or something which didn't use
plain-text. ;-). I will add newline manually.
Please send a mail with plain-text in future.

On Mon, Sep 14, 2015 at 03:49:01PM +0200, Vitaly Wool wrote:
> While using ZRAM on a small RAM footprint devices, together with KSM,

KSM? Is there any reason you mentioned *KSM* in this context?
IOW, if you don't use KSM, you couldn't see a problem?

> I ran into several occasions when moving pages from compressed swap back
> into the "normal" part of RAM caused significant latencies in system operation.

What kernel version did you use? Did you enable CMA? ION?
What was major factor for the latencies?

Decompress? zsmalloc-compaction overhead? rmap overheads?
compaction overheads?
There are several potential culprits.
It would be very helpful if you provide some numbers(perf will help you).

> By using zbud I lose in compression ratio but gain in determinism,
> lower latencies and lower fragmentation, so in the coming patches
> I tried to generalize what I've done to enable zbud for zram so far.

Before that, I'd like to know what is root cause.
>From my side, I had an similar experience.
At that time, problem was that *compaction* which triggered to reclaim
lots of page cache pages. The reason compaction triggered a lot was
fragmentation caused by zsmalloc, GPU and high-order allocation
request by SLUB and somethings(ex, ION, fork).

Recently, Joonsoo fixed SLUB side.
http://marc.info/?l=linux-kernel&m=143891343223853&w=2

And we added zram-auto-compaction recently so zram try to compact
objects to reduce memory usage. It might be helpful for fragment
problem as side effect but please keep it mind that it would be opposite.
Currently, zram-auto-compaction is not aware of virtual memory compaction
so as worst case, zsmalloc can spread out pinned object into movable
pageblocks via zsmalloc-compaction.
Gioh and I try to solve the issue with below patches but is pending now
by other urgent works.
https://lwn.net/Articles/650917/
https://lkml.org/lkml/2015/8/10/90

In summary, we need to clarify what's the root cause before diving into
code and hiding it.

Thanks.

> 
> -- 
> Vitaly Wool <vitalywool@gmail.com>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-15  1:00     ` Sergey Senozhatsky
  (?)
@ 2015-09-15  7:18     ` Vitaly Wool
  2015-09-15  7:38         ` Sergey Senozhatsky
  -1 siblings, 1 reply; 63+ messages in thread
From: Vitaly Wool @ 2015-09-15  7:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: minchan, linux-mm, sergey.senozhatsky, linux-kernel, ddstreet

[-- Attachment #1: Type: text/plain, Size: 3855 bytes --]

On Sep 15, 2015 2:59 AM, "Sergey Senozhatsky" <
sergey.senozhatsky.work@gmail.com> wrote:
>
> On (09/14/15 15:50), Vitaly Wool wrote:
> > It makes sense to have control over what compression ratios are
> > ok to store pages uncompressed and what not.
>
> um... I don't want this to be exported. this is very 'zram-internal'.
>
> besides, you remove the exsiting default value
> - static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>
> so now people must provide this module param in order to make zram
> work the way it used to work for years?
>

I do not remove it, I move it :) May I ask to review the patch again please?

>
> > Moreover, if we end up using zbud allocator for zram, any attempt to
> > allocate a whole page will fail, so we may want to avoid this as much
> > as possible.
>
> so how does it help?
>
> > So, let's have max_zpage_size configurable as a module parameter.
> >
> > Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
> > ---
> >  drivers/block/zram/zram_drv.c | 13 +++++++++++++
> >  drivers/block/zram/zram_drv.h | 16 ----------------
> >  2 files changed, 13 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/block/zram/zram_drv.c
b/drivers/block/zram/zram_drv.c
> > index 9fa15bb..6d9f1d1 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
> >
> >  /* Module params (documentation at end) */
> >  static unsigned int num_devices = 1;
> > +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> >
> >  static inline void deprecated_attr_warn(const char *name)
> >  {
> > @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
> >               return ret;
> >       }
> >
> > +     /*
> > +      * max_zpage_size must be less than or equal to:
> > +      * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > +      * always return failure.
> > +      */
> > +     if (max_zpage_size > PAGE_SIZE) {
> > +             pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
>
> and how do people find out ZS_MAX_ALLOC_SIZE? this error message does not
> help.
>
> > +             return -EINVAL;
> > +     }
> > +
> >       zram_major = register_blkdev(0, "zram");
> >       if (zram_major <= 0) {
> >               pr_err("Unable to get major number\n");
> > @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
> >
> >  module_param(num_devices, uint, 0);
> >  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
> > +module_param(max_zpage_size, ulong, 0);
> > +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed
pages");
>
> unclear description.
>
>
> >
> >  MODULE_LICENSE("Dual BSD/GPL");
> >  MODULE_AUTHOR("Nitin Gupta <ngupta@vflare.org>");
> > diff --git a/drivers/block/zram/zram_drv.h
b/drivers/block/zram/zram_drv.h
> > index 8e92339..3a29c33 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -20,22 +20,6 @@
> >
> >  #include "zcomp.h"
> >
> > -/*-- Configurable parameters */
> > -
> > -/*
> > - * Pages that compress to size greater than this are stored
> > - * uncompressed in memory.
> > - */
> > -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
> > -
> > -/*
> > - * NOTE: max_zpage_size must be less than or equal to:
> > - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
> > - * always return failure.
> > - */
> > -
> > -/*-- End of configurable params */
> > -
> >  #define SECTOR_SHIFT         9
> >  #define SECTORS_PER_PAGE_SHIFT       (PAGE_SHIFT - SECTOR_SHIFT)
> >  #define SECTORS_PER_PAGE     (1 << SECTORS_PER_PAGE_SHIFT)
> > --
> > 1.9.1
> >
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >

[-- Attachment #2: Type: text/html, Size: 5616 bytes --]

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
  2015-09-15  7:18     ` Vitaly Wool
@ 2015-09-15  7:38         ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  7:38 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, minchan, linux-mm, sergey.senozhatsky,
	linux-kernel, ddstreet

On (09/15/15 09:18), Vitaly Wool wrote:
>    > On (09/14/15 15:50), Vitaly Wool wrote:
>    > > It makes sense to have control over what compression ratios are
>    > > ok to store pages uncompressed and what not.
>    >
>    > um... I don't want this to be exported. this is very 'zram-internal'.
>    >
>    > besides, you remove the exsiting default value
>    > - static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    >
>    > so now people must provide this module param in order to make zram
>    > work the way it used to work for years?
>    >
> 
>    I do not remove it, I move it :) May I ask to review the patch again
>    please?

Oh, yep, missed that.

Please read Dan Streetman's reply. I agree with what he said/suggested.


And can you please switch your email client to 'plain text' mode only?
haven't checked, but I suspect that some of vger lists block !plain_text
emails. ... apart from the fact that it's hard to read such emails in
mutt, etc.

	-ss

>    >
>    > > Moreover, if we end up using zbud allocator for zram, any attempt to
>    > > allocate a whole page will fail, so we may want to avoid this as much
>    > > as possible.
>    >
>    > so how does it help?
>    >
>    > > So, let's have max_zpage_size configurable as a module parameter.
>    > >
>    > > Signed-off-by: Vitaly Wool <[2]vitalywool@gmail.com>
>    > > ---
>    > >  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>    > >  drivers/block/zram/zram_drv.h | 16 ----------------
>    > >  2 files changed, 13 insertions(+), 16 deletions(-)
>    > >
>    > > diff --git a/drivers/block/zram/zram_drv.c
>    b/drivers/block/zram/zram_drv.c
>    > > index 9fa15bb..6d9f1d1 100644
>    > > --- a/drivers/block/zram/zram_drv.c
>    > > +++ b/drivers/block/zram/zram_drv.c
>    > > @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>    > >
>    > >  /* Module params (documentation at end) */
>    > >  static unsigned int num_devices = 1;
>    > > +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    > >
>    > >  static inline void deprecated_attr_warn(const char *name)
>    > >  {
>    > > @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>    > >               return ret;
>    > >       }
>    > >
>    > > +     /*
>    > > +      * max_zpage_size must be less than or equal to:
>    > > +      * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
>    > > +      * always return failure.
>    > > +      */
>    > > +     if (max_zpage_size > PAGE_SIZE) {
>    > > +             pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
>    >
>    > and how do people find out ZS_MAX_ALLOC_SIZE? this error message does
>    not
>    > help.
>    >
>    > > +             return -EINVAL;
>    > > +     }
>    > > +
>    > >       zram_major = register_blkdev(0, "zram");
>    > >       if (zram_major <= 0) {
>    > >               pr_err("Unable to get major number\n");
>    > > @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>    > >
>    > >  module_param(num_devices, uint, 0);
>    > >  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>    > > +module_param(max_zpage_size, ulong, 0);
>    > > +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed
>    pages");
>    >
>    > unclear description.
>    >
>    >
>    > >
>    > >  MODULE_LICENSE("Dual BSD/GPL");
>    > >  MODULE_AUTHOR("Nitin Gupta <[3]ngupta@vflare.org>");
>    > > diff --git a/drivers/block/zram/zram_drv.h
>    b/drivers/block/zram/zram_drv.h
>    > > index 8e92339..3a29c33 100644
>    > > --- a/drivers/block/zram/zram_drv.h
>    > > +++ b/drivers/block/zram/zram_drv.h
>    > > @@ -20,22 +20,6 @@
>    > >
>    > >  #include "zcomp.h"
>    > >
>    > > -/*-- Configurable parameters */
>    > > -
>    > > -/*
>    > > - * Pages that compress to size greater than this are stored
>    > > - * uncompressed in memory.
>    > > - */
>    > > -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    > > -
>    > > -/*
>    > > - * NOTE: max_zpage_size must be less than or equal to:
>    > > - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
>    > > - * always return failure.
>    > > - */
>    > > -
>    > > -/*-- End of configurable params */
>    > > -
>    > >  #define SECTOR_SHIFT         9
>    > >  #define SECTORS_PER_PAGE_SHIFT       (PAGE_SHIFT - SECTOR_SHIFT)
>    > >  #define SECTORS_PER_PAGE     (1 << SECTORS_PER_PAGE_SHIFT)
>    > > --
>    > > 1.9.1
>    > >
>    > > --
>    > > To unsubscribe, send a message with 'unsubscribe linux-mm' in
>    > > the body to [4]majordomo@kvack.org.  For more info on Linux MM,
>    > > see: [5]http://www.linux-mm.org/ .
>    > > Don't email: <a href=mailto:"[6]dont@kvack.org"> [7]email@kvack.org
>    </a>
>    > >
> 
> References
> 
>    Visible links
>    1. mailto:sergey.senozhatsky.work@gmail.com
>    2. mailto:vitalywool@gmail.com
>    3. mailto:ngupta@vflare.org
>    4. mailto:majordomo@kvack.org
>    5. http://www.linux-mm.org/
>    6. mailto:dont@kvack.org
>    7. mailto:email@kvack.org

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 1/3] zram: make max_zpage_size configurable
@ 2015-09-15  7:38         ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-15  7:38 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, minchan, linux-mm, sergey.senozhatsky,
	linux-kernel, ddstreet

On (09/15/15 09:18), Vitaly Wool wrote:
>    > On (09/14/15 15:50), Vitaly Wool wrote:
>    > > It makes sense to have control over what compression ratios are
>    > > ok to store pages uncompressed and what not.
>    >
>    > um... I don't want this to be exported. this is very 'zram-internal'.
>    >
>    > besides, you remove the exsiting default value
>    > - static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    >
>    > so now people must provide this module param in order to make zram
>    > work the way it used to work for years?
>    >
> 
>    I do not remove it, I move it :) May I ask to review the patch again
>    please?

Oh, yep, missed that.

Please read Dan Streetman's reply. I agree with what he said/suggested.


And can you please switch your email client to 'plain text' mode only?
haven't checked, but I suspect that some of vger lists block !plain_text
emails. ... apart from the fact that it's hard to read such emails in
mutt, etc.

	-ss

>    >
>    > > Moreover, if we end up using zbud allocator for zram, any attempt to
>    > > allocate a whole page will fail, so we may want to avoid this as much
>    > > as possible.
>    >
>    > so how does it help?
>    >
>    > > So, let's have max_zpage_size configurable as a module parameter.
>    > >
>    > > Signed-off-by: Vitaly Wool <[2]vitalywool@gmail.com>
>    > > ---
>    > >  drivers/block/zram/zram_drv.c | 13 +++++++++++++
>    > >  drivers/block/zram/zram_drv.h | 16 ----------------
>    > >  2 files changed, 13 insertions(+), 16 deletions(-)
>    > >
>    > > diff --git a/drivers/block/zram/zram_drv.c
>    b/drivers/block/zram/zram_drv.c
>    > > index 9fa15bb..6d9f1d1 100644
>    > > --- a/drivers/block/zram/zram_drv.c
>    > > +++ b/drivers/block/zram/zram_drv.c
>    > > @@ -42,6 +42,7 @@ static const char *default_compressor = "lzo";
>    > >
>    > >  /* Module params (documentation at end) */
>    > >  static unsigned int num_devices = 1;
>    > > +static size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    > >
>    > >  static inline void deprecated_attr_warn(const char *name)
>    > >  {
>    > > @@ -1411,6 +1412,16 @@ static int __init zram_init(void)
>    > >               return ret;
>    > >       }
>    > >
>    > > +     /*
>    > > +      * max_zpage_size must be less than or equal to:
>    > > +      * ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
>    > > +      * always return failure.
>    > > +      */
>    > > +     if (max_zpage_size > PAGE_SIZE) {
>    > > +             pr_err("Invalid max_zpage_size %ld\n", max_zpage_size);
>    >
>    > and how do people find out ZS_MAX_ALLOC_SIZE? this error message does
>    not
>    > help.
>    >
>    > > +             return -EINVAL;
>    > > +     }
>    > > +
>    > >       zram_major = register_blkdev(0, "zram");
>    > >       if (zram_major <= 0) {
>    > >               pr_err("Unable to get major number\n");
>    > > @@ -1444,6 +1455,8 @@ module_exit(zram_exit);
>    > >
>    > >  module_param(num_devices, uint, 0);
>    > >  MODULE_PARM_DESC(num_devices, "Number of pre-created zram devices");
>    > > +module_param(max_zpage_size, ulong, 0);
>    > > +MODULE_PARM_DESC(max_zpage_size, "Threshold for storing compressed
>    pages");
>    >
>    > unclear description.
>    >
>    >
>    > >
>    > >  MODULE_LICENSE("Dual BSD/GPL");
>    > >  MODULE_AUTHOR("Nitin Gupta <[3]ngupta@vflare.org>");
>    > > diff --git a/drivers/block/zram/zram_drv.h
>    b/drivers/block/zram/zram_drv.h
>    > > index 8e92339..3a29c33 100644
>    > > --- a/drivers/block/zram/zram_drv.h
>    > > +++ b/drivers/block/zram/zram_drv.h
>    > > @@ -20,22 +20,6 @@
>    > >
>    > >  #include "zcomp.h"
>    > >
>    > > -/*-- Configurable parameters */
>    > > -
>    > > -/*
>    > > - * Pages that compress to size greater than this are stored
>    > > - * uncompressed in memory.
>    > > - */
>    > > -static const size_t max_zpage_size = PAGE_SIZE / 4 * 3;
>    > > -
>    > > -/*
>    > > - * NOTE: max_zpage_size must be less than or equal to:
>    > > - *   ZS_MAX_ALLOC_SIZE. Otherwise, zs_malloc() would
>    > > - * always return failure.
>    > > - */
>    > > -
>    > > -/*-- End of configurable params */
>    > > -
>    > >  #define SECTOR_SHIFT         9
>    > >  #define SECTORS_PER_PAGE_SHIFT       (PAGE_SHIFT - SECTOR_SHIFT)
>    > >  #define SECTORS_PER_PAGE     (1 << SECTORS_PER_PAGE_SHIFT)
>    > > --
>    > > 1.9.1
>    > >
>    > > --
>    > > To unsubscribe, send a message with 'unsubscribe linux-mm' in
>    > > the body to [4]majordomo@kvack.org.  For more info on Linux MM,
>    > > see: [5]http://www.linux-mm.org/ .
>    > > Don't email: <a href=mailto:"[6]dont@kvack.org"> [7]email@kvack.org
>    </a>
>    > >
> 
> References
> 
>    Visible links
>    1. mailto:sergey.senozhatsky.work@gmail.com
>    2. mailto:vitalywool@gmail.com
>    3. mailto:ngupta@vflare.org
>    4. mailto:majordomo@kvack.org
>    5. http://www.linux-mm.org/
>    6. mailto:dont@kvack.org
>    7. mailto:email@kvack.org

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-15  4:22           ` Sergey Senozhatsky
@ 2015-09-17  6:21             ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-17  6:21 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dan Streetman
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, LKML, Linux-MM

On 09/15/2015 06:22 AM, Sergey Senozhatsky wrote:
> On (09/15/15 00:08), Dan Streetman wrote:
> [..]
>
> correct. a bit of internals: we don't scan all the zspages every
> time. each class has stats for allocated used objects, allocated
> used objects, etc. so we 'compact' only classes that can be
> compacted:
>
>   static unsigned long zs_can_compact(struct size_class *class)
>   {
>           unsigned long obj_wasted;
>
>           obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
>                   zs_stat_get(class, OBJ_USED);
>
>           obj_wasted /= get_maxobj_per_zspage(class->size,
>                           class->pages_per_zspage);
>
>           return obj_wasted * class->pages_per_zspage;
>   }
>
> if we can free any zspages (which is at least one page), then we
> attempt to do so.
>
> is compaction the root cause of the symptoms Vitaly observe?

He mentioned the "compact_stalls" counter which in /proc/vmstat is for 
the traditional physical memory compaction, not the zsmalloc-specific 
one. Which would imply high-order allocations. Does zsmalloc try them 
first before falling back to the order-0 zspages linked together manually?


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-17  6:21             ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-17  6:21 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dan Streetman
  Cc: Vitaly Wool, Minchan Kim, Sergey Senozhatsky, LKML, Linux-MM

On 09/15/2015 06:22 AM, Sergey Senozhatsky wrote:
> On (09/15/15 00:08), Dan Streetman wrote:
> [..]
>
> correct. a bit of internals: we don't scan all the zspages every
> time. each class has stats for allocated used objects, allocated
> used objects, etc. so we 'compact' only classes that can be
> compacted:
>
>   static unsigned long zs_can_compact(struct size_class *class)
>   {
>           unsigned long obj_wasted;
>
>           obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
>                   zs_stat_get(class, OBJ_USED);
>
>           obj_wasted /= get_maxobj_per_zspage(class->size,
>                           class->pages_per_zspage);
>
>           return obj_wasted * class->pages_per_zspage;
>   }
>
> if we can free any zspages (which is at least one page), then we
> attempt to do so.
>
> is compaction the root cause of the symptoms Vitaly observe?

He mentioned the "compact_stalls" counter which in /proc/vmstat is for 
the traditional physical memory compaction, not the zsmalloc-specific 
one. Which would imply high-order allocations. Does zsmalloc try them 
first before falling back to the order-0 zspages linked together manually?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-17  6:21             ` Vlastimil Babka
@ 2015-09-17  9:19               ` Sergey Senozhatsky
  -1 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-17  9:19 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Dan Streetman, Vitaly Wool, Minchan Kim,
	Sergey Senozhatsky, LKML, Linux-MM

On (09/17/15 08:21), Vlastimil Babka wrote:
> On 09/15/2015 06:22 AM, Sergey Senozhatsky wrote:
> >On (09/15/15 00:08), Dan Streetman wrote:
> >[..]
> >
> >correct. a bit of internals: we don't scan all the zspages every
> >time. each class has stats for allocated used objects, allocated
> >used objects, etc. so we 'compact' only classes that can be
> >compacted:
> >
> >  static unsigned long zs_can_compact(struct size_class *class)
> >  {
> >          unsigned long obj_wasted;
> >
> >          obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> >                  zs_stat_get(class, OBJ_USED);
> >
> >          obj_wasted /= get_maxobj_per_zspage(class->size,
> >                          class->pages_per_zspage);
> >
> >          return obj_wasted * class->pages_per_zspage;
> >  }
> >
> >if we can free any zspages (which is at least one page), then we
> >attempt to do so.
> >
> >is compaction the root cause of the symptoms Vitaly observe?
> 
> He mentioned the "compact_stalls" counter which in /proc/vmstat is for the
> traditional physical memory compaction, not the zsmalloc-specific one. Which
> would imply high-order allocations. Does zsmalloc try them first before
> falling back to the order-0 zspages linked together manually?

each zspage is a bunch (pages_per_zspage) of alloc_page() calls

        for (i = 0; i < class->pages_per_zspage; i++) {
                struct page *page;

                page = alloc_page(flags);
                if (!page)
                        goto cleanup;

                INIT_LIST_HEAD(&page->lru);
                if (i == 0) {   /* first page */
                        SetPagePrivate(page);
                        set_page_private(page, 0);
                        first_page = page;
                        first_page->inuse = 0;
                }
                if (i == 1)
                        set_page_private(first_page, (unsigned long)page);
                if (i >= 1)
                        set_page_private(page, (unsigned long)first_page);
                if (i >= 2)
                        list_add(&page->lru, &prev_page->lru);
                if (i == class->pages_per_zspage - 1)   /* last page */
                        SetPagePrivate2(page);
                prev_page = page;
        }

	-ss

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-17  9:19               ` Sergey Senozhatsky
  0 siblings, 0 replies; 63+ messages in thread
From: Sergey Senozhatsky @ 2015-09-17  9:19 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Dan Streetman, Vitaly Wool, Minchan Kim,
	Sergey Senozhatsky, LKML, Linux-MM

On (09/17/15 08:21), Vlastimil Babka wrote:
> On 09/15/2015 06:22 AM, Sergey Senozhatsky wrote:
> >On (09/15/15 00:08), Dan Streetman wrote:
> >[..]
> >
> >correct. a bit of internals: we don't scan all the zspages every
> >time. each class has stats for allocated used objects, allocated
> >used objects, etc. so we 'compact' only classes that can be
> >compacted:
> >
> >  static unsigned long zs_can_compact(struct size_class *class)
> >  {
> >          unsigned long obj_wasted;
> >
> >          obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> >                  zs_stat_get(class, OBJ_USED);
> >
> >          obj_wasted /= get_maxobj_per_zspage(class->size,
> >                          class->pages_per_zspage);
> >
> >          return obj_wasted * class->pages_per_zspage;
> >  }
> >
> >if we can free any zspages (which is at least one page), then we
> >attempt to do so.
> >
> >is compaction the root cause of the symptoms Vitaly observe?
> 
> He mentioned the "compact_stalls" counter which in /proc/vmstat is for the
> traditional physical memory compaction, not the zsmalloc-specific one. Which
> would imply high-order allocations. Does zsmalloc try them first before
> falling back to the order-0 zspages linked together manually?

each zspage is a bunch (pages_per_zspage) of alloc_page() calls

        for (i = 0; i < class->pages_per_zspage; i++) {
                struct page *page;

                page = alloc_page(flags);
                if (!page)
                        goto cleanup;

                INIT_LIST_HEAD(&page->lru);
                if (i == 0) {   /* first page */
                        SetPagePrivate(page);
                        set_page_private(page, 0);
                        first_page = page;
                        first_page->inuse = 0;
                }
                if (i == 1)
                        set_page_private(first_page, (unsigned long)page);
                if (i >= 1)
                        set_page_private(page, (unsigned long)first_page);
                if (i >= 2)
                        list_add(&page->lru, &prev_page->lru);
                if (i == class->pages_per_zspage - 1)   /* last page */
                        SetPagePrivate2(page);
                prev_page = page;
        }

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-15  6:13   ` Minchan Kim
@ 2015-09-25  9:54     ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-25  9:54 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

Hello Minchan,

the main use case where I see unacceptably long stalls in UI with
zsmalloc is switching between users in Android.
There is a way to automate user creation and switching between them so
the test I run both to get vmstat statistics and to profile stalls is
to create a user, switch to it and switch back. Each test cycle does
that 10 times, and all the results presented below are averages for 20
runs.

Kernel configurations used for testing:

(1): vanilla
(2): (1) plus "make SLUB atomic" patch [1]
(3): (1) with zbud instead of zsmalloc
(4): (2) with compaction defer logic mostly disabled

> KSM? Is there any reason you mentioned *KSM* in this context?
> IOW, if you don't use KSM, you couldn't see a problem?

If I don't use KSM, latenices get smaller in both cases. Worst case
wise, zbud still gives more deterministic behavior.

>> I ran into several occasions when moving pages from compressed swap back
>> into the "normal" part of RAM caused significant latencies in system operation.
>
> What kernel version did you use? Did you enable CMA? ION?
> What was major factor for the latencies?

CMA and ION are both enabled. The working kernel is 3.18 based with
most of the mm/ stuff backported from 4.2.
The major factors for the latencies was a) fragmentation and b)
compaction deferral. See also below.

> Decompress? zsmalloc-compaction overhead? rmap overheads?
> compaction overheads?
> There are several potential culprits.
> It would be very helpful if you provide some numbers(perf will help you).

The UI is blocked after user switching for, average:
(1) 1.84 seconds
(2) 0.89 seconds
(3) 1.32 seconds
(4) 0.87 seconds

The UI us blocked after user switching for, worst-case:
(1) 2.91
(2) 1.12
(3) 1.79
(4) 1.34

Selected vmstat results, average:
I. allocstall
(1) 7814
(2) 4615
(3) 2004
(4) 2611

II. compact_stall
(1) 1869
(2) 1135
(3) 727
(4) 638

III. compact_fail
(1) 914
(2) 520
(3) 230
(4) 218

IV. compact_success
(1) 876
(2) 535
(3) 419
(4) 443

More data available on request.

>> By using zbud I lose in compression ratio but gain in determinism,
>> lower latencies and lower fragmentation, so in the coming patches
>> I tried to generalize what I've done to enable zbud for zram so far.
>
> Before that, I'd like to know what is root cause.
> From my side, I had an similar experience.
> At that time, problem was that *compaction* which triggered to reclaim
> lots of page cache pages. The reason compaction triggered a lot was
> fragmentation caused by zsmalloc, GPU and high-order allocation
> request by SLUB and somethings(ex, ION, fork).
>
> Recently, Joonsoo fixed SLUB side.
> http://marc.info/?l=linux-kernel&m=143891343223853&w=2

Yes, it makes things better, see above. However, worst case is still
looking not so nice.

> And we added zram-auto-compaction recently so zram try to compact
> objects to reduce memory usage. It might be helpful for fragment
> problem as side effect but please keep it mind that it would be opposite.
> Currently, zram-auto-compaction is not aware of virtual memory compaction
> so as worst case, zsmalloc can spread out pinned object into movable
> pageblocks via zsmalloc-compaction.
> Gioh and I try to solve the issue with below patches but is pending now
> by other urgent works.
> https://lwn.net/Articles/650917/
> https://lkml.org/lkml/2015/8/10/90
>
> In summary, we need to clarify what's the root cause before diving into
> code and hiding it.

I'm not "hiding" anything. This statement is utterly bogus.

Summarizing my test results, I would like to stress that:
* zbud gives better worst-times
* the system's behavior with zbud is way more stable and predictable
* zsmalloc-based zram operation depends very much on kernel memory
management subsystem changes
* zsmalloc operates significantly worse with compaction deferral logic
introduced after ca. 3.18

As a bottom line, zsmalloc operation is substantially more fragile and
far less predictable than zbud's. If that is not a good reason to _at
least_ have *an option* to use zram with the latter, then I don't know
what is.

~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-25  9:54     ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-25  9:54 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

Hello Minchan,

the main use case where I see unacceptably long stalls in UI with
zsmalloc is switching between users in Android.
There is a way to automate user creation and switching between them so
the test I run both to get vmstat statistics and to profile stalls is
to create a user, switch to it and switch back. Each test cycle does
that 10 times, and all the results presented below are averages for 20
runs.

Kernel configurations used for testing:

(1): vanilla
(2): (1) plus "make SLUB atomic" patch [1]
(3): (1) with zbud instead of zsmalloc
(4): (2) with compaction defer logic mostly disabled

> KSM? Is there any reason you mentioned *KSM* in this context?
> IOW, if you don't use KSM, you couldn't see a problem?

If I don't use KSM, latenices get smaller in both cases. Worst case
wise, zbud still gives more deterministic behavior.

>> I ran into several occasions when moving pages from compressed swap back
>> into the "normal" part of RAM caused significant latencies in system operation.
>
> What kernel version did you use? Did you enable CMA? ION?
> What was major factor for the latencies?

CMA and ION are both enabled. The working kernel is 3.18 based with
most of the mm/ stuff backported from 4.2.
The major factors for the latencies was a) fragmentation and b)
compaction deferral. See also below.

> Decompress? zsmalloc-compaction overhead? rmap overheads?
> compaction overheads?
> There are several potential culprits.
> It would be very helpful if you provide some numbers(perf will help you).

The UI is blocked after user switching for, average:
(1) 1.84 seconds
(2) 0.89 seconds
(3) 1.32 seconds
(4) 0.87 seconds

The UI us blocked after user switching for, worst-case:
(1) 2.91
(2) 1.12
(3) 1.79
(4) 1.34

Selected vmstat results, average:
I. allocstall
(1) 7814
(2) 4615
(3) 2004
(4) 2611

II. compact_stall
(1) 1869
(2) 1135
(3) 727
(4) 638

III. compact_fail
(1) 914
(2) 520
(3) 230
(4) 218

IV. compact_success
(1) 876
(2) 535
(3) 419
(4) 443

More data available on request.

>> By using zbud I lose in compression ratio but gain in determinism,
>> lower latencies and lower fragmentation, so in the coming patches
>> I tried to generalize what I've done to enable zbud for zram so far.
>
> Before that, I'd like to know what is root cause.
> From my side, I had an similar experience.
> At that time, problem was that *compaction* which triggered to reclaim
> lots of page cache pages. The reason compaction triggered a lot was
> fragmentation caused by zsmalloc, GPU and high-order allocation
> request by SLUB and somethings(ex, ION, fork).
>
> Recently, Joonsoo fixed SLUB side.
> http://marc.info/?l=linux-kernel&m=143891343223853&w=2

Yes, it makes things better, see above. However, worst case is still
looking not so nice.

> And we added zram-auto-compaction recently so zram try to compact
> objects to reduce memory usage. It might be helpful for fragment
> problem as side effect but please keep it mind that it would be opposite.
> Currently, zram-auto-compaction is not aware of virtual memory compaction
> so as worst case, zsmalloc can spread out pinned object into movable
> pageblocks via zsmalloc-compaction.
> Gioh and I try to solve the issue with below patches but is pending now
> by other urgent works.
> https://lwn.net/Articles/650917/
> https://lkml.org/lkml/2015/8/10/90
>
> In summary, we need to clarify what's the root cause before diving into
> code and hiding it.

I'm not "hiding" anything. This statement is utterly bogus.

Summarizing my test results, I would like to stress that:
* zbud gives better worst-times
* the system's behavior with zbud is way more stable and predictable
* zsmalloc-based zram operation depends very much on kernel memory
management subsystem changes
* zsmalloc operates significantly worse with compaction deferral logic
introduced after ca. 3.18

As a bottom line, zsmalloc operation is substantially more fragile and
far less predictable than zbud's. If that is not a good reason to _at
least_ have *an option* to use zram with the latter, then I don't know
what is.

~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-25  9:54     ` Vitaly Wool
@ 2015-09-30  7:52       ` Minchan Kim
  -1 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-30  7:52 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

Hello Vitaly,

First of all, Thanks for the detail report. I will add comments below.

On Fri, Sep 25, 2015 at 11:54:57AM +0200, Vitaly Wool wrote:
> Hello Minchan,
> 
> the main use case where I see unacceptably long stalls in UI with
> zsmalloc is switching between users in Android.

What is main factor of the workload when user switching happens?
I guess lots of fork and read inode/dentry? so high-order allcation by them?

> There is a way to automate user creation and switching between them so
> the test I run both to get vmstat statistics and to profile stalls is
> to create a user, switch to it and switch back. Each test cycle does
> that 10 times, and all the results presented below are averages for 20
> runs.

Could you share your script?
I will ask our production team to reproduce it.

> 
> Kernel configurations used for testing:
> 
> (1): vanilla
> (2): (1) plus "make SLUB atomic" patch [1]

Does it mean "mm/slub: don't wait for high-order page allocation"?

> (3): (1) with zbud instead of zsmalloc
> (4): (2) with compaction defer logic mostly disabled

How does it disable compaction defer logic?
It would be helpful if you show the code.
Strictly speaking, I want to see your mm/compaction.c

The reason I am asking that you said you saw improvement
if you apply below.

        https://lkml.org/lkml/2015/9/9/313.

In compaction pov, without it, compaction does nothing because
zone_watermark_ok can return true(ie, false-positive) by CMA so
VM relies on reclaiming LRU pages to make high-order free pages
rather than compaction. In such case, you could lose lots of
page caches which I guess it's culprit of slowness you are seeing.

NOTE: Mel is trying to remove watermark check logic for high-order
allocation recently so it would be helpful for your workload.
http://lwn.net/Articles/657967/

> 
> > KSM? Is there any reason you mentioned *KSM* in this context?
> > IOW, if you don't use KSM, you couldn't see a problem?
> 
> If I don't use KSM, latenices get smaller in both cases. Worst case
> wise, zbud still gives more deterministic behavior.

I think KSM consumes rmap meta per page via SLAB request so SLAB
will ask high-order alloc for that. I guess that's one of reason
you saw smaller latency if you disabled KSM.

> 
> >> I ran into several occasions when moving pages from compressed swap back
> >> into the "normal" part of RAM caused significant latencies in system operation.
> >
> > What kernel version did you use? Did you enable CMA? ION?
> > What was major factor for the latencies?
> 
> CMA and ION are both enabled. The working kernel is 3.18 based with
> most of the mm/ stuff backported from 4.2.

3.18 is really old so I guess you did backport a lot of MM patches
from 4.2. Especially, there are lots of enhancement on compaction side
since then so it would be very helpful to say what patches you have
backported about compaction/CMA and zsmalloc/zram.

> The major factors for the latencies was a) fragmentation and b)
> compaction deferral. See also below.
> 
> > Decompress? zsmalloc-compaction overhead? rmap overheads?
> > compaction overheads?
> > There are several potential culprits.
> > It would be very helpful if you provide some numbers(perf will help you).
> 
> The UI is blocked after user switching for, average:
> (1) 1.84 seconds
> (2) 0.89 seconds
> (3) 1.32 seconds
> (4) 0.87 seconds

First of all, above data doesn't reveal how many time system spend in
somewhere. For it, perf record will be your friend.
If you use perf on ARM, please keep it in mind that ARM perf doesn't
support NMI so if your routine disable IRQ, sampling point isn't
correct you so you should take care of it.

I guess most of time will be spent in idle to wait I/O complete
because I am seeing this problem caused by page cache thrasing.

Anyway, could you test below two cases more?

1. vanilla + slub fallback + zbud + compaction defer disabling?
2. vanilla + zbud + compaction defer disabling?

I'd like to know how only compaction defer disabling patch affects
your workload without SLUB fallback patch.

> 
> The UI us blocked after user switching for, worst-case:
> (1) 2.91
> (2) 1.12
> (3) 1.79
> (4) 1.34

As worst-case, 4 is slower than 2 so ignoring compaction defering
unconditionally wouldn't be not option.

> 
> Selected vmstat results, average:
> I. allocstall
> (1) 7814
> (2) 4615
> (3) 2004
> (4) 2611
> 

vanilla + zbud is best for allocstall POV but it was not best
for avg and worst-case so it is not a major factor of your slowness.

> II. compact_stall
> (1) 1869

876 / 1869 * 100 = 46%

> (2) 1135

535 / 1135 * 100 = 47%

> (3) 727

419 / 727 * 100 = 57%

> (4) 638

638 / 443 * 100 = 144%

It seems each of data is selected from various experiement so
4 is higher 100% so even though the data is not consistent,
I guess 4 is much better than others.

>From of it, I guess compaction defer logic has a problem and
it made your problem as I said. There are a few of known problems
in compaction.

Joonsoo and Vlastimil have tried to fix it for a long time so
I hope they could solve it in this chance.

> 
> III. compact_fail
> (1) 914
> (2) 520
> (3) 230
> (4) 218
> 
> IV. compact_success
> (1) 876
> (2) 535
> (3) 419
> (4) 443
> 
> More data available on request.

1. Could you show how many of pages zbud/zsmalloc have been used for your
test and /proc/swaps as well as vmstat?

You could get it by pool_total_size on zswap and mem_used_total on zram.

2. Could you show /proc/vmstat raw data at before and after?

So, we could see more values like pgmajfault, nr_inactive, nr_file and
so on.

3. Perf record will prove where your system spent a lot of time.

> 
> >> By using zbud I lose in compression ratio but gain in determinism,
> >> lower latencies and lower fragmentation, so in the coming patches
> >> I tried to generalize what I've done to enable zbud for zram so far.
> >
> > Before that, I'd like to know what is root cause.
> > From my side, I had an similar experience.
> > At that time, problem was that *compaction* which triggered to reclaim
> > lots of page cache pages. The reason compaction triggered a lot was
> > fragmentation caused by zsmalloc, GPU and high-order allocation
> > request by SLUB and somethings(ex, ION, fork).
> >
> > Recently, Joonsoo fixed SLUB side.
> > http://marc.info/?l=linux-kernel&m=143891343223853&w=2
> 
> Yes, it makes things better, see above. However, worst case is still
> looking not so nice.

Your data says only SLUB fallback is best for worst-case. No?

> 
> > And we added zram-auto-compaction recently so zram try to compact
> > objects to reduce memory usage. It might be helpful for fragment
> > problem as side effect but please keep it mind that it would be opposite.
> > Currently, zram-auto-compaction is not aware of virtual memory compaction
> > so as worst case, zsmalloc can spread out pinned object into movable
> > pageblocks via zsmalloc-compaction.
> > Gioh and I try to solve the issue with below patches but is pending now
> > by other urgent works.
> > https://lwn.net/Articles/650917/
> > https://lkml.org/lkml/2015/8/10/90
> >
> > In summary, we need to clarify what's the root cause before diving into
> > code and hiding it.
> 
> I'm not "hiding" anything. This statement is utterly bogus.

Above I said, we found there are many problem between CMA, compaction
and zram a few month ago and we have approached to solve it generally.

        https://lwn.net/Articles/635446/

In this context, your approach is totally *hiding*.
Again saying, let's investigate fundamental problems.

> 
> Summarizing my test results, I would like to stress that:
> * zbud gives better worst-times

It would be different for workload. If you lose page cache by lack of
free memory due to bad compress ratio by zbud, it will bite you
after a while.

> * the system's behavior with zbud is way more stable and predictable

I agree zbud is more predictable. I said that's why zbud/zswap was born
although there are zram in the kenrel.
please read this.

        https://lwn.net/Articles/549740/

Although I will do best effort for zram/zsmalloc to make deterministic
without losing compress ratio, it couldn't do as well as zbud.

> * zsmalloc-based zram operation depends very much on kernel memory
> management subsystem changes

I couldn't agree this claim. What logic in kernel MM system makes you
think so?

> * zsmalloc operates significantly worse with compaction deferral logic
> introduced after ca. 3.18

I already explained how we can approach external fragmentation problems
you mentioned with generic ways and my concerns for supporing zbud.

Please explain what you are seeing problems in my suggestion technically.

> 
> As a bottom line, zsmalloc operation is substantially more fragile and
> far less predictable than zbud's. If that is not a good reason to _at
> least_ have *an option* to use zram with the latter, then I don't know
> what is.

Please have a look.

There are lots of known problems in CMA, compacation and zsmalloc
and several developers have solved it. Although it's not perfect now,
I think we are approaching right ways.

But you are now insisting "let's just use zbud into zram" with just
having compaction stat of vmstat without detailed analysis.
(I never think just throwing result is technical discussion. I really
want to know what makes such result with data).

> 
> ~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30  7:52       ` Minchan Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-30  7:52 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

Hello Vitaly,

First of all, Thanks for the detail report. I will add comments below.

On Fri, Sep 25, 2015 at 11:54:57AM +0200, Vitaly Wool wrote:
> Hello Minchan,
> 
> the main use case where I see unacceptably long stalls in UI with
> zsmalloc is switching between users in Android.

What is main factor of the workload when user switching happens?
I guess lots of fork and read inode/dentry? so high-order allcation by them?

> There is a way to automate user creation and switching between them so
> the test I run both to get vmstat statistics and to profile stalls is
> to create a user, switch to it and switch back. Each test cycle does
> that 10 times, and all the results presented below are averages for 20
> runs.

Could you share your script?
I will ask our production team to reproduce it.

> 
> Kernel configurations used for testing:
> 
> (1): vanilla
> (2): (1) plus "make SLUB atomic" patch [1]

Does it mean "mm/slub: don't wait for high-order page allocation"?

> (3): (1) with zbud instead of zsmalloc
> (4): (2) with compaction defer logic mostly disabled

How does it disable compaction defer logic?
It would be helpful if you show the code.
Strictly speaking, I want to see your mm/compaction.c

The reason I am asking that you said you saw improvement
if you apply below.

        https://lkml.org/lkml/2015/9/9/313.

In compaction pov, without it, compaction does nothing because
zone_watermark_ok can return true(ie, false-positive) by CMA so
VM relies on reclaiming LRU pages to make high-order free pages
rather than compaction. In such case, you could lose lots of
page caches which I guess it's culprit of slowness you are seeing.

NOTE: Mel is trying to remove watermark check logic for high-order
allocation recently so it would be helpful for your workload.
http://lwn.net/Articles/657967/

> 
> > KSM? Is there any reason you mentioned *KSM* in this context?
> > IOW, if you don't use KSM, you couldn't see a problem?
> 
> If I don't use KSM, latenices get smaller in both cases. Worst case
> wise, zbud still gives more deterministic behavior.

I think KSM consumes rmap meta per page via SLAB request so SLAB
will ask high-order alloc for that. I guess that's one of reason
you saw smaller latency if you disabled KSM.

> 
> >> I ran into several occasions when moving pages from compressed swap back
> >> into the "normal" part of RAM caused significant latencies in system operation.
> >
> > What kernel version did you use? Did you enable CMA? ION?
> > What was major factor for the latencies?
> 
> CMA and ION are both enabled. The working kernel is 3.18 based with
> most of the mm/ stuff backported from 4.2.

3.18 is really old so I guess you did backport a lot of MM patches
from 4.2. Especially, there are lots of enhancement on compaction side
since then so it would be very helpful to say what patches you have
backported about compaction/CMA and zsmalloc/zram.

> The major factors for the latencies was a) fragmentation and b)
> compaction deferral. See also below.
> 
> > Decompress? zsmalloc-compaction overhead? rmap overheads?
> > compaction overheads?
> > There are several potential culprits.
> > It would be very helpful if you provide some numbers(perf will help you).
> 
> The UI is blocked after user switching for, average:
> (1) 1.84 seconds
> (2) 0.89 seconds
> (3) 1.32 seconds
> (4) 0.87 seconds

First of all, above data doesn't reveal how many time system spend in
somewhere. For it, perf record will be your friend.
If you use perf on ARM, please keep it in mind that ARM perf doesn't
support NMI so if your routine disable IRQ, sampling point isn't
correct you so you should take care of it.

I guess most of time will be spent in idle to wait I/O complete
because I am seeing this problem caused by page cache thrasing.

Anyway, could you test below two cases more?

1. vanilla + slub fallback + zbud + compaction defer disabling?
2. vanilla + zbud + compaction defer disabling?

I'd like to know how only compaction defer disabling patch affects
your workload without SLUB fallback patch.

> 
> The UI us blocked after user switching for, worst-case:
> (1) 2.91
> (2) 1.12
> (3) 1.79
> (4) 1.34

As worst-case, 4 is slower than 2 so ignoring compaction defering
unconditionally wouldn't be not option.

> 
> Selected vmstat results, average:
> I. allocstall
> (1) 7814
> (2) 4615
> (3) 2004
> (4) 2611
> 

vanilla + zbud is best for allocstall POV but it was not best
for avg and worst-case so it is not a major factor of your slowness.

> II. compact_stall
> (1) 1869

876 / 1869 * 100 = 46%

> (2) 1135

535 / 1135 * 100 = 47%

> (3) 727

419 / 727 * 100 = 57%

> (4) 638

638 / 443 * 100 = 144%

It seems each of data is selected from various experiement so
4 is higher 100% so even though the data is not consistent,
I guess 4 is much better than others.

>From of it, I guess compaction defer logic has a problem and
it made your problem as I said. There are a few of known problems
in compaction.

Joonsoo and Vlastimil have tried to fix it for a long time so
I hope they could solve it in this chance.

> 
> III. compact_fail
> (1) 914
> (2) 520
> (3) 230
> (4) 218
> 
> IV. compact_success
> (1) 876
> (2) 535
> (3) 419
> (4) 443
> 
> More data available on request.

1. Could you show how many of pages zbud/zsmalloc have been used for your
test and /proc/swaps as well as vmstat?

You could get it by pool_total_size on zswap and mem_used_total on zram.

2. Could you show /proc/vmstat raw data at before and after?

So, we could see more values like pgmajfault, nr_inactive, nr_file and
so on.

3. Perf record will prove where your system spent a lot of time.

> 
> >> By using zbud I lose in compression ratio but gain in determinism,
> >> lower latencies and lower fragmentation, so in the coming patches
> >> I tried to generalize what I've done to enable zbud for zram so far.
> >
> > Before that, I'd like to know what is root cause.
> > From my side, I had an similar experience.
> > At that time, problem was that *compaction* which triggered to reclaim
> > lots of page cache pages. The reason compaction triggered a lot was
> > fragmentation caused by zsmalloc, GPU and high-order allocation
> > request by SLUB and somethings(ex, ION, fork).
> >
> > Recently, Joonsoo fixed SLUB side.
> > http://marc.info/?l=linux-kernel&m=143891343223853&w=2
> 
> Yes, it makes things better, see above. However, worst case is still
> looking not so nice.

Your data says only SLUB fallback is best for worst-case. No?

> 
> > And we added zram-auto-compaction recently so zram try to compact
> > objects to reduce memory usage. It might be helpful for fragment
> > problem as side effect but please keep it mind that it would be opposite.
> > Currently, zram-auto-compaction is not aware of virtual memory compaction
> > so as worst case, zsmalloc can spread out pinned object into movable
> > pageblocks via zsmalloc-compaction.
> > Gioh and I try to solve the issue with below patches but is pending now
> > by other urgent works.
> > https://lwn.net/Articles/650917/
> > https://lkml.org/lkml/2015/8/10/90
> >
> > In summary, we need to clarify what's the root cause before diving into
> > code and hiding it.
> 
> I'm not "hiding" anything. This statement is utterly bogus.

Above I said, we found there are many problem between CMA, compaction
and zram a few month ago and we have approached to solve it generally.

        https://lwn.net/Articles/635446/

In this context, your approach is totally *hiding*.
Again saying, let's investigate fundamental problems.

> 
> Summarizing my test results, I would like to stress that:
> * zbud gives better worst-times

It would be different for workload. If you lose page cache by lack of
free memory due to bad compress ratio by zbud, it will bite you
after a while.

> * the system's behavior with zbud is way more stable and predictable

I agree zbud is more predictable. I said that's why zbud/zswap was born
although there are zram in the kenrel.
please read this.

        https://lwn.net/Articles/549740/

Although I will do best effort for zram/zsmalloc to make deterministic
without losing compress ratio, it couldn't do as well as zbud.

> * zsmalloc-based zram operation depends very much on kernel memory
> management subsystem changes

I couldn't agree this claim. What logic in kernel MM system makes you
think so?

> * zsmalloc operates significantly worse with compaction deferral logic
> introduced after ca. 3.18

I already explained how we can approach external fragmentation problems
you mentioned with generic ways and my concerns for supporing zbud.

Please explain what you are seeing problems in my suggestion technically.

> 
> As a bottom line, zsmalloc operation is substantially more fragile and
> far less predictable than zbud's. If that is not a good reason to _at
> least_ have *an option* to use zram with the latter, then I don't know
> what is.

Please have a look.

There are lots of known problems in CMA, compacation and zsmalloc
and several developers have solved it. Although it's not perfect now,
I think we are approaching right ways.

But you are now insisting "let's just use zbud into zram" with just
having compaction stat of vmstat without detailed analysis.
(I never think just throwing result is technical discussion. I really
want to know what makes such result with data).

> 
> ~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-30  7:52       ` Minchan Kim
@ 2015-09-30  8:01         ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30  8:01 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

> Could you share your script?
> I will ask our production team to reproduce it.

Wait, let me get it right. Your production team?
I take it as you would like me to help your company fix your bugs.
You are pushing the limits here.

~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30  8:01         ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30  8:01 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

> Could you share your script?
> I will ask our production team to reproduce it.

Wait, let me get it right. Your production team?
I take it as you would like me to help your company fix your bugs.
You are pushing the limits here.

~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-30  8:01         ` Vitaly Wool
@ 2015-09-30  8:13           ` Minchan Kim
  -1 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-30  8:13 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

On Wed, Sep 30, 2015 at 10:01:59AM +0200, Vitaly Wool wrote:
> > Could you share your script?
> > I will ask our production team to reproduce it.
> 
> Wait, let me get it right. Your production team?
> I take it as you would like me to help your company fix your bugs.
> You are pushing the limits here.

I'm really sorry if you take it as fixing my bugs.
I never wanted it but just want to help your problem.
Please read LKML. Normally, developers wanted to share test script to
reproduce the problem because it's easier to solve the problem
without consuming much time with ping-pong.

Anyway, I have shared my experience to you and suggest patches and
on-going works. In your concept, I shouldn't do that for fixing
your problems so I shouldn't help you any more? Right?

> 
> ~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30  8:13           ` Minchan Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Minchan Kim @ 2015-09-30  8:13 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

On Wed, Sep 30, 2015 at 10:01:59AM +0200, Vitaly Wool wrote:
> > Could you share your script?
> > I will ask our production team to reproduce it.
> 
> Wait, let me get it right. Your production team?
> I take it as you would like me to help your company fix your bugs.
> You are pushing the limits here.

I'm really sorry if you take it as fixing my bugs.
I never wanted it but just want to help your problem.
Please read LKML. Normally, developers wanted to share test script to
reproduce the problem because it's easier to solve the problem
without consuming much time with ping-pong.

Anyway, I have shared my experience to you and suggest patches and
on-going works. In your concept, I shouldn't do that for fixing
your problems so I shouldn't help you any more? Right?

> 
> ~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-30  8:13           ` Minchan Kim
@ 2015-09-30  8:18             ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30  8:18 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

On Wed, Sep 30, 2015 at 10:13 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Wed, Sep 30, 2015 at 10:01:59AM +0200, Vitaly Wool wrote:
>> > Could you share your script?
>> > I will ask our production team to reproduce it.
>>
>> Wait, let me get it right. Your production team?
>> I take it as you would like me to help your company fix your bugs.
>> You are pushing the limits here.
>
> I'm really sorry if you take it as fixing my bugs.
> I never wanted it but just want to help your problem.
> Please read LKML. Normally, developers wanted to share test script to
> reproduce the problem because it's easier to solve the problem
> without consuming much time with ping-pong.

Normally developers do not have backing up "production teams".

> Anyway, I have shared my experience to you and suggest patches and
> on-going works. In your concept, I shouldn't do that for fixing
> your problems so I shouldn't help you any more? Right?

I never asked you to fix my problems. I have substantial proof that
zsmalloc is fragile enough not to be a good fit for projects I work on
and I want to have the code that allows zram to work with zbud
mainlined. Whatever you want me to do to help you fix zsmalloc issues
*should* be *orthogonal* to the former. You are abusing your
maintainer role here, trying to act in favor of your company rather
than for the benefit of OSS.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30  8:18             ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30  8:18 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim, Vlastimil Babka

On Wed, Sep 30, 2015 at 10:13 AM, Minchan Kim <minchan@kernel.org> wrote:
> On Wed, Sep 30, 2015 at 10:01:59AM +0200, Vitaly Wool wrote:
>> > Could you share your script?
>> > I will ask our production team to reproduce it.
>>
>> Wait, let me get it right. Your production team?
>> I take it as you would like me to help your company fix your bugs.
>> You are pushing the limits here.
>
> I'm really sorry if you take it as fixing my bugs.
> I never wanted it but just want to help your problem.
> Please read LKML. Normally, developers wanted to share test script to
> reproduce the problem because it's easier to solve the problem
> without consuming much time with ping-pong.

Normally developers do not have backing up "production teams".

> Anyway, I have shared my experience to you and suggest patches and
> on-going works. In your concept, I shouldn't do that for fixing
> your problems so I shouldn't help you any more? Right?

I never asked you to fix my problems. I have substantial proof that
zsmalloc is fragile enough not to be a good fit for projects I work on
and I want to have the code that allows zram to work with zbud
mainlined. Whatever you want me to do to help you fix zsmalloc issues
*should* be *orthogonal* to the former. You are abusing your
maintainer role here, trying to act in favor of your company rather
than for the benefit of OSS.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-25  9:54     ` Vitaly Wool
@ 2015-09-30 15:37       ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-30 15:37 UTC (permalink / raw)
  To: Vitaly Wool, Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 09/25/2015 11:54 AM, Vitaly Wool wrote:
> Hello Minchan,
>
> the main use case where I see unacceptably long stalls in UI with
> zsmalloc is switching between users in Android.
> There is a way to automate user creation and switching between them so
> the test I run both to get vmstat statistics and to profile stalls is
> to create a user, switch to it and switch back. Each test cycle does
> that 10 times, and all the results presented below are averages for 20
> runs.
>
> Kernel configurations used for testing:
>
> (1): vanilla
> (2): (1) plus "make SLUB atomic" patch [1]
> (3): (1) with zbud instead of zsmalloc
> (4): (2) with compaction defer logic mostly disabled

Disabling compaction deferring leads to less compaction stalls? That 
indeed looks very weird and counter-intuitive. Also what's "mostly" 
disabled mean?


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30 15:37       ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-09-30 15:37 UTC (permalink / raw)
  To: Vitaly Wool, Minchan Kim
  Cc: Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 09/25/2015 11:54 AM, Vitaly Wool wrote:
> Hello Minchan,
>
> the main use case where I see unacceptably long stalls in UI with
> zsmalloc is switching between users in Android.
> There is a way to automate user creation and switching between them so
> the test I run both to get vmstat statistics and to profile stalls is
> to create a user, switch to it and switch back. Each test cycle does
> that 10 times, and all the results presented below are averages for 20
> runs.
>
> Kernel configurations used for testing:
>
> (1): vanilla
> (2): (1) plus "make SLUB atomic" patch [1]
> (3): (1) with zbud instead of zsmalloc
> (4): (2) with compaction defer logic mostly disabled

Disabling compaction deferring leads to less compaction stalls? That 
indeed looks very weird and counter-intuitive. Also what's "mostly" 
disabled mean?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-30 15:37       ` Vlastimil Babka
@ 2015-09-30 15:46         ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30 15:46 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>
>> Hello Minchan,
>>
>> the main use case where I see unacceptably long stalls in UI with
>> zsmalloc is switching between users in Android.
>> There is a way to automate user creation and switching between them so
>> the test I run both to get vmstat statistics and to profile stalls is
>> to create a user, switch to it and switch back. Each test cycle does
>> that 10 times, and all the results presented below are averages for 20
>> runs.
>>
>> Kernel configurations used for testing:
>>
>> (1): vanilla
>> (2): (1) plus "make SLUB atomic" patch [1]
>> (3): (1) with zbud instead of zsmalloc
>> (4): (2) with compaction defer logic mostly disabled
>
>
> Disabling compaction deferring leads to less compaction stalls? That indeed
> looks very weird and counter-intuitive. Also what's "mostly" disabled mean?

Not that I'm not surprised myself. However, this is how it goes.
Namely, I reverted the following patches:
- mm, compaction: defer each zone individually instead of preferred zone
- mm, compaction: embed migration mode in compact_control
- mm, compaction: add per-zone migration pfn cache for async compaction
- mm: compaction: encapsulate defer reset logic

~vitaly

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-09-30 15:46         ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-09-30 15:46 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>
>> Hello Minchan,
>>
>> the main use case where I see unacceptably long stalls in UI with
>> zsmalloc is switching between users in Android.
>> There is a way to automate user creation and switching between them so
>> the test I run both to get vmstat statistics and to profile stalls is
>> to create a user, switch to it and switch back. Each test cycle does
>> that 10 times, and all the results presented below are averages for 20
>> runs.
>>
>> Kernel configurations used for testing:
>>
>> (1): vanilla
>> (2): (1) plus "make SLUB atomic" patch [1]
>> (3): (1) with zbud instead of zsmalloc
>> (4): (2) with compaction defer logic mostly disabled
>
>
> Disabling compaction deferring leads to less compaction stalls? That indeed
> looks very weird and counter-intuitive. Also what's "mostly" disabled mean?

Not that I'm not surprised myself. However, this is how it goes.
Namely, I reverted the following patches:
- mm, compaction: defer each zone individually instead of preferred zone
- mm, compaction: embed migration mode in compact_control
- mm, compaction: add per-zone migration pfn cache for async compaction
- mm: compaction: encapsulate defer reset logic

~vitaly

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-09-30 15:46         ` Vitaly Wool
@ 2015-10-01  7:52           ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-10-01  7:52 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 09/30/2015 05:46 PM, Vitaly Wool wrote:
> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>
>>> Hello Minchan,
>>>
>>> the main use case where I see unacceptably long stalls in UI with
>>> zsmalloc is switching between users in Android.
>>> There is a way to automate user creation and switching between them so
>>> the test I run both to get vmstat statistics and to profile stalls is
>>> to create a user, switch to it and switch back. Each test cycle does
>>> that 10 times, and all the results presented below are averages for 20
>>> runs.
>>>
>>> Kernel configurations used for testing:
>>>
>>> (1): vanilla
>>> (2): (1) plus "make SLUB atomic" patch [1]
>>> (3): (1) with zbud instead of zsmalloc
>>> (4): (2) with compaction defer logic mostly disabled
>>
>>
>> Disabling compaction deferring leads to less compaction stalls? That indeed
>> looks very weird and counter-intuitive. Also what's "mostly" disabled mean?
>
> Not that I'm not surprised myself. However, this is how it goes.
> Namely, I reverted the following patches:
> - mm, compaction: defer each zone individually instead of preferred zone

Oh, I see. Then you didn't disable compaction defer logic, but made it 
coarse again instead of per-zone. Which means that an allocation that 
can be satisfied from Normal zone will use the Normal zone's deferred 
state to decide whether to compact also DMA and DMA32 zones *within the 
same allocation attempt*. So by reverting the patch you might indeed get 
less compact_stall (and success+failure) counts, but each stall will try 
to compact all three zones. With individual defer, some stall might be 
just for DMA32, some just for Normal, and the total number might be 
higher, but the compaction overhead should be better distributed among 
all the attempts. Looking at your latencies, looks like that's working fine:

>
> The UI is blocked after user switching for, average:
> (1) 1.84 seconds
> (2) 0.89 seconds
> (3) 1.32 seconds
> (4) 0.87 seconds

Average for (2) vs (4) is roughly the same, I would guess within noise.

> The UI us blocked after user switching for, worst-case:
> (1) 2.91
> (2) 1.12
> (3) 1.79
> (4) 1.34

The worst case is actually worse without individual defer, because you 
end up compacting all zones in each single stall. With individual defer, 
there's a low probability of that happening.

> - mm, compaction: embed migration mode in compact_control

This probably affects just THPs.

> - mm, compaction: add per-zone migration pfn cache for async compaction

Hard to say what's the effect of this.

> - mm: compaction: encapsulate defer reset logic

This is just code consolidation.

> ~vitaly
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-10-01  7:52           ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-10-01  7:52 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 09/30/2015 05:46 PM, Vitaly Wool wrote:
> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>
>>> Hello Minchan,
>>>
>>> the main use case where I see unacceptably long stalls in UI with
>>> zsmalloc is switching between users in Android.
>>> There is a way to automate user creation and switching between them so
>>> the test I run both to get vmstat statistics and to profile stalls is
>>> to create a user, switch to it and switch back. Each test cycle does
>>> that 10 times, and all the results presented below are averages for 20
>>> runs.
>>>
>>> Kernel configurations used for testing:
>>>
>>> (1): vanilla
>>> (2): (1) plus "make SLUB atomic" patch [1]
>>> (3): (1) with zbud instead of zsmalloc
>>> (4): (2) with compaction defer logic mostly disabled
>>
>>
>> Disabling compaction deferring leads to less compaction stalls? That indeed
>> looks very weird and counter-intuitive. Also what's "mostly" disabled mean?
>
> Not that I'm not surprised myself. However, this is how it goes.
> Namely, I reverted the following patches:
> - mm, compaction: defer each zone individually instead of preferred zone

Oh, I see. Then you didn't disable compaction defer logic, but made it 
coarse again instead of per-zone. Which means that an allocation that 
can be satisfied from Normal zone will use the Normal zone's deferred 
state to decide whether to compact also DMA and DMA32 zones *within the 
same allocation attempt*. So by reverting the patch you might indeed get 
less compact_stall (and success+failure) counts, but each stall will try 
to compact all three zones. With individual defer, some stall might be 
just for DMA32, some just for Normal, and the total number might be 
higher, but the compaction overhead should be better distributed among 
all the attempts. Looking at your latencies, looks like that's working fine:

>
> The UI is blocked after user switching for, average:
> (1) 1.84 seconds
> (2) 0.89 seconds
> (3) 1.32 seconds
> (4) 0.87 seconds

Average for (2) vs (4) is roughly the same, I would guess within noise.

> The UI us blocked after user switching for, worst-case:
> (1) 2.91
> (2) 1.12
> (3) 1.79
> (4) 1.34

The worst case is actually worse without individual defer, because you 
end up compacting all zones in each single stall. With individual defer, 
there's a low probability of that happening.

> - mm, compaction: embed migration mode in compact_control

This probably affects just THPs.

> - mm, compaction: add per-zone migration pfn cache for async compaction

Hard to say what's the effect of this.

> - i? 1/4 mm: compaction: encapsulate defer reset logic

This is just code consolidation.

> ~vitaly
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-10-01  7:52           ` Vlastimil Babka
@ 2015-10-10  9:33             ` Vitaly Wool
  -1 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-10-10  9:33 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>
>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>
>>>>
>>>> Hello Minchan,
>>>>
>>>> the main use case where I see unacceptably long stalls in UI with
>>>> zsmalloc is switching between users in Android.
>>>> There is a way to automate user creation and switching between them so
>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>> to create a user, switch to it and switch back. Each test cycle does
>>>> that 10 times, and all the results presented below are averages for 20
>>>> runs.
>>>>
>>>> Kernel configurations used for testing:
>>>>
>>>> (1): vanilla
>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>> (3): (1) with zbud instead of zsmalloc
>>>> (4): (2) with compaction defer logic mostly disabled
>>>
>>>
>>>
>>> Disabling compaction deferring leads to less compaction stalls? That
>>> indeed
>>> looks very weird and counter-intuitive. Also what's "mostly" disabled
>>> mean?
>>
>>
>> Not that I'm not surprised myself. However, this is how it goes.
>> Namely, I reverted the following patches:
>> - mm, compaction: defer each zone individually instead of preferred zone
>
>
> Oh, I see. Then you didn't disable compaction defer logic, but made it
> coarse again instead of per-zone. Which means that an allocation that can be
> satisfied from Normal zone will use the Normal zone's deferred state to
> decide whether to compact also DMA and DMA32 zones *within the same
> allocation attempt*. So by reverting the patch you might indeed get less
> compact_stall (and success+failure) counts, but each stall will try to
> compact all three zones. With individual defer, some stall might be just for
> DMA32, some just for Normal, and the total number might be higher, but the
> compaction overhead should be better distributed among all the attempts.

The thing is, this happens on an ARM64 and I only have one zone there.

> Looking at your latencies, looks like that's working fine:
>
>>
>> The UI is blocked after user switching for, average:
>> (1) 1.84 seconds
>> (2) 0.89 seconds
>> (3) 1.32 seconds
>> (4) 0.87 seconds
>
>
> Average for (2) vs (4) is roughly the same, I would guess within noise.

That I surely won't argue with :)

>> The UI us blocked after user switching for, worst-case:
>> (1) 2.91
>> (2) 1.12
>> (3) 1.79
>> (4) 1.34
>
>
> The worst case is actually worse without individual defer, because you end
> up compacting all zones in each single stall. With individual defer, there's
> a low probability of that happening.

Okay, but in case of a single zone, isn't this more fine-grained logic
resulting in more defers and less async compactions?

>> - mm, compaction: embed migration mode in compact_control
>
>
> This probably affects just THPs.
>
>> - mm, compaction: add per-zone migration pfn cache for async compaction
>
>
> Hard to say what's the effect of this.
>
>> - mm: compaction: encapsulate defer reset logic
>
>
> This is just code consolidation.
>
>> ~vitaly
>>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-10-10  9:33             ` Vitaly Wool
  0 siblings, 0 replies; 63+ messages in thread
From: Vitaly Wool @ 2015-10-10  9:33 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>
>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>
>>>>
>>>> Hello Minchan,
>>>>
>>>> the main use case where I see unacceptably long stalls in UI with
>>>> zsmalloc is switching between users in Android.
>>>> There is a way to automate user creation and switching between them so
>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>> to create a user, switch to it and switch back. Each test cycle does
>>>> that 10 times, and all the results presented below are averages for 20
>>>> runs.
>>>>
>>>> Kernel configurations used for testing:
>>>>
>>>> (1): vanilla
>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>> (3): (1) with zbud instead of zsmalloc
>>>> (4): (2) with compaction defer logic mostly disabled
>>>
>>>
>>>
>>> Disabling compaction deferring leads to less compaction stalls? That
>>> indeed
>>> looks very weird and counter-intuitive. Also what's "mostly" disabled
>>> mean?
>>
>>
>> Not that I'm not surprised myself. However, this is how it goes.
>> Namely, I reverted the following patches:
>> - mm, compaction: defer each zone individually instead of preferred zone
>
>
> Oh, I see. Then you didn't disable compaction defer logic, but made it
> coarse again instead of per-zone. Which means that an allocation that can be
> satisfied from Normal zone will use the Normal zone's deferred state to
> decide whether to compact also DMA and DMA32 zones *within the same
> allocation attempt*. So by reverting the patch you might indeed get less
> compact_stall (and success+failure) counts, but each stall will try to
> compact all three zones. With individual defer, some stall might be just for
> DMA32, some just for Normal, and the total number might be higher, but the
> compaction overhead should be better distributed among all the attempts.

The thing is, this happens on an ARM64 and I only have one zone there.

> Looking at your latencies, looks like that's working fine:
>
>>
>> The UI is blocked after user switching for, average:
>> (1) 1.84 seconds
>> (2) 0.89 seconds
>> (3) 1.32 seconds
>> (4) 0.87 seconds
>
>
> Average for (2) vs (4) is roughly the same, I would guess within noise.

That I surely won't argue with :)

>> The UI us blocked after user switching for, worst-case:
>> (1) 2.91
>> (2) 1.12
>> (3) 1.79
>> (4) 1.34
>
>
> The worst case is actually worse without individual defer, because you end
> up compacting all zones in each single stall. With individual defer, there's
> a low probability of that happening.

Okay, but in case of a single zone, isn't this more fine-grained logic
resulting in more defers and less async compactions?

>> - mm, compaction: embed migration mode in compact_control
>
>
> This probably affects just THPs.
>
>> - mm, compaction: add per-zone migration pfn cache for async compaction
>
>
> Hard to say what's the effect of this.
>
>> - mm: compaction: encapsulate defer reset logic
>
>
> This is just code consolidation.
>
>> ~vitaly
>>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
  2015-10-10  9:33             ` Vitaly Wool
@ 2015-10-14 13:28               ` Vlastimil Babka
  -1 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-10-14 13:28 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 10/10/2015 11:33 AM, Vitaly Wool wrote:
> On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>>
>>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>>
>>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>>
>>>>>
>>>>> Hello Minchan,
>>>>>
>>>>> the main use case where I see unacceptably long stalls in UI with
>>>>> zsmalloc is switching between users in Android.
>>>>> There is a way to automate user creation and switching between them so
>>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>>> to create a user, switch to it and switch back. Each test cycle does
>>>>> that 10 times, and all the results presented below are averages for 20
>>>>> runs.
>>>>>
>>>>> Kernel configurations used for testing:
>>>>>
>>>>> (1): vanilla
>>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>>> (3): (1) with zbud instead of zsmalloc
>>>>> (4): (2) with compaction defer logic mostly disabled
>>>>
>>>>
>>>>
>>>> Disabling compaction deferring leads to less compaction stalls? That
>>>> indeed
>>>> looks very weird and counter-intuitive. Also what's "mostly" disabled
>>>> mean?
>>>
>>>
>>> Not that I'm not surprised myself. However, this is how it goes.
>>> Namely, I reverted the following patches:
>>> - mm, compaction: defer each zone individually instead of preferred zone
>>
>>
>> Oh, I see. Then you didn't disable compaction defer logic, but made it
>> coarse again instead of per-zone. Which means that an allocation that can be
>> satisfied from Normal zone will use the Normal zone's deferred state to
>> decide whether to compact also DMA and DMA32 zones *within the same
>> allocation attempt*. So by reverting the patch you might indeed get less
>> compact_stall (and success+failure) counts, but each stall will try to
>> compact all three zones. With individual defer, some stall might be just for
>> DMA32, some just for Normal, and the total number might be higher, but the
>> compaction overhead should be better distributed among all the attempts.
> 
> The thing is, this happens on an ARM64 and I only have one zone there.

Hmm, then it shouldn't make a difference... unless there's a bug.

>> Looking at your latencies, looks like that's working fine:
>>
>>>
>>> The UI is blocked after user switching for, average:
>>> (1) 1.84 seconds
>>> (2) 0.89 seconds
>>> (3) 1.32 seconds
>>> (4) 0.87 seconds
>>
>>
>> Average for (2) vs (4) is roughly the same, I would guess within noise.
> 
> That I surely won't argue with :)
> 
>>> The UI us blocked after user switching for, worst-case:
>>> (1) 2.91
>>> (2) 1.12
>>> (3) 1.79
>>> (4) 1.34
>>
>>
>> The worst case is actually worse without individual defer, because you end
>> up compacting all zones in each single stall. With individual defer, there's
>> a low probability of that happening.
> 
> Okay, but in case of a single zone, isn't this more fine-grained logic
> resulting in more defers and less async compactions?

In case of single zone, it has only the single zone to consider with or without
the patch, so the result should be the same.

>>> - mm, compaction: embed migration mode in compact_control
>>
>>
>> This probably affects just THPs.
>>
>>> - mm, compaction: add per-zone migration pfn cache for async compaction
>>
>>
>> Hard to say what's the effect of this.
>>
>>> - mm: compaction: encapsulate defer reset logic
>>
>>
>> This is just code consolidation.
>>
>>> ~vitaly
>>>
>>
> 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH 0/3] allow zram to use zbud as underlying allocator
@ 2015-10-14 13:28               ` Vlastimil Babka
  0 siblings, 0 replies; 63+ messages in thread
From: Vlastimil Babka @ 2015-10-14 13:28 UTC (permalink / raw)
  To: Vitaly Wool
  Cc: Minchan Kim, Sergey Senozhatsky, Dan Streetman, LKML, Linux-MM,
	김준수, Gioh Kim

On 10/10/2015 11:33 AM, Vitaly Wool wrote:
> On Thu, Oct 1, 2015 at 9:52 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 09/30/2015 05:46 PM, Vitaly Wool wrote:
>>>
>>> On Wed, Sep 30, 2015 at 5:37 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>>
>>>> On 09/25/2015 11:54 AM, Vitaly Wool wrote:
>>>>>
>>>>>
>>>>> Hello Minchan,
>>>>>
>>>>> the main use case where I see unacceptably long stalls in UI with
>>>>> zsmalloc is switching between users in Android.
>>>>> There is a way to automate user creation and switching between them so
>>>>> the test I run both to get vmstat statistics and to profile stalls is
>>>>> to create a user, switch to it and switch back. Each test cycle does
>>>>> that 10 times, and all the results presented below are averages for 20
>>>>> runs.
>>>>>
>>>>> Kernel configurations used for testing:
>>>>>
>>>>> (1): vanilla
>>>>> (2): (1) plus "make SLUB atomic" patch [1]
>>>>> (3): (1) with zbud instead of zsmalloc
>>>>> (4): (2) with compaction defer logic mostly disabled
>>>>
>>>>
>>>>
>>>> Disabling compaction deferring leads to less compaction stalls? That
>>>> indeed
>>>> looks very weird and counter-intuitive. Also what's "mostly" disabled
>>>> mean?
>>>
>>>
>>> Not that I'm not surprised myself. However, this is how it goes.
>>> Namely, I reverted the following patches:
>>> - mm, compaction: defer each zone individually instead of preferred zone
>>
>>
>> Oh, I see. Then you didn't disable compaction defer logic, but made it
>> coarse again instead of per-zone. Which means that an allocation that can be
>> satisfied from Normal zone will use the Normal zone's deferred state to
>> decide whether to compact also DMA and DMA32 zones *within the same
>> allocation attempt*. So by reverting the patch you might indeed get less
>> compact_stall (and success+failure) counts, but each stall will try to
>> compact all three zones. With individual defer, some stall might be just for
>> DMA32, some just for Normal, and the total number might be higher, but the
>> compaction overhead should be better distributed among all the attempts.
> 
> The thing is, this happens on an ARM64 and I only have one zone there.

Hmm, then it shouldn't make a difference... unless there's a bug.

>> Looking at your latencies, looks like that's working fine:
>>
>>>
>>> The UI is blocked after user switching for, average:
>>> (1) 1.84 seconds
>>> (2) 0.89 seconds
>>> (3) 1.32 seconds
>>> (4) 0.87 seconds
>>
>>
>> Average for (2) vs (4) is roughly the same, I would guess within noise.
> 
> That I surely won't argue with :)
> 
>>> The UI us blocked after user switching for, worst-case:
>>> (1) 2.91
>>> (2) 1.12
>>> (3) 1.79
>>> (4) 1.34
>>
>>
>> The worst case is actually worse without individual defer, because you end
>> up compacting all zones in each single stall. With individual defer, there's
>> a low probability of that happening.
> 
> Okay, but in case of a single zone, isn't this more fine-grained logic
> resulting in more defers and less async compactions?

In case of single zone, it has only the single zone to consider with or without
the patch, so the result should be the same.

>>> - mm, compaction: embed migration mode in compact_control
>>
>>
>> This probably affects just THPs.
>>
>>> - mm, compaction: add per-zone migration pfn cache for async compaction
>>
>>
>> Hard to say what's the effect of this.
>>
>>> - i? 1/4 mm: compaction: encapsulate defer reset logic
>>
>>
>> This is just code consolidation.
>>
>>> ~vitaly
>>>
>>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2015-10-14 13:28 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-14 13:49 [PATCH 0/3] allow zram to use zbud as underlying allocator Vitaly Wool
2015-09-14 13:49 ` Vitaly Wool
2015-09-14 13:50 ` [PATCH 1/3] zram: make max_zpage_size configurable Vitaly Wool
2015-09-14 13:50   ` Vitaly Wool
2015-09-15  1:00   ` Sergey Senozhatsky
2015-09-15  1:00     ` Sergey Senozhatsky
2015-09-15  7:18     ` Vitaly Wool
2015-09-15  7:38       ` Sergey Senozhatsky
2015-09-15  7:38         ` Sergey Senozhatsky
2015-09-15  5:42   ` Dan Streetman
2015-09-15  5:42     ` Dan Streetman
2015-09-15  6:08     ` Sergey Senozhatsky
2015-09-15  6:08       ` Sergey Senozhatsky
2015-09-14 13:51 ` [PATCH 2/3] zpool/zsmalloc/zbud: align on interfaces Vitaly Wool
2015-09-14 13:51   ` Vitaly Wool
2015-09-15  1:06   ` Sergey Senozhatsky
2015-09-15  1:06     ` Sergey Senozhatsky
2015-09-15  5:09     ` Dan Streetman
2015-09-15  5:09       ` Dan Streetman
2015-09-14 13:55 ` [PATCH 3/3] zram: use common zpool interface Vitaly Wool
2015-09-14 13:55   ` Vitaly Wool
2015-09-15  1:12   ` Sergey Senozhatsky
2015-09-15  1:12     ` Sergey Senozhatsky
2015-09-15  6:03     ` Dan Streetman
2015-09-15  6:03       ` Dan Streetman
2015-09-14 14:01 ` [PATCH 0/3] allow zram to use zbud as underlying allocator Vlastimil Babka
2015-09-14 14:01   ` Vlastimil Babka
2015-09-14 14:12   ` Vitaly Wool
2015-09-14 14:12     ` Vitaly Wool
2015-09-14 14:14     ` Vlastimil Babka
2015-09-14 14:14       ` Vlastimil Babka
2015-09-15  4:08       ` Dan Streetman
2015-09-15  4:08         ` Dan Streetman
2015-09-15  4:22         ` Sergey Senozhatsky
2015-09-15  4:22           ` Sergey Senozhatsky
2015-09-17  6:21           ` Vlastimil Babka
2015-09-17  6:21             ` Vlastimil Babka
2015-09-17  9:19             ` Sergey Senozhatsky
2015-09-17  9:19               ` Sergey Senozhatsky
2015-09-15  0:49 ` Sergey Senozhatsky
2015-09-15  0:49   ` Sergey Senozhatsky
2015-09-15  6:13 ` Minchan Kim
2015-09-15  6:13   ` Minchan Kim
2015-09-25  9:54   ` Vitaly Wool
2015-09-25  9:54     ` Vitaly Wool
2015-09-30  7:52     ` Minchan Kim
2015-09-30  7:52       ` Minchan Kim
2015-09-30  8:01       ` Vitaly Wool
2015-09-30  8:01         ` Vitaly Wool
2015-09-30  8:13         ` Minchan Kim
2015-09-30  8:13           ` Minchan Kim
2015-09-30  8:18           ` Vitaly Wool
2015-09-30  8:18             ` Vitaly Wool
2015-09-30 15:37     ` Vlastimil Babka
2015-09-30 15:37       ` Vlastimil Babka
2015-09-30 15:46       ` Vitaly Wool
2015-09-30 15:46         ` Vitaly Wool
2015-10-01  7:52         ` Vlastimil Babka
2015-10-01  7:52           ` Vlastimil Babka
2015-10-10  9:33           ` Vitaly Wool
2015-10-10  9:33             ` Vitaly Wool
2015-10-14 13:28             ` Vlastimil Babka
2015-10-14 13:28               ` Vlastimil Babka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.