From: "Ananyev, Konstantin" <konstantin.ananyev-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: "De Lara Guarch,
	Pablo"
	<pablo.de.lara.guarch-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"dev-VfR2kkLFssw@public.gmane.org"
	<dev-VfR2kkLFssw@public.gmane.org>
Subject: Re: [PATCH v3 3/6] hash: update jhash function with the latest available
Date: Wed, 6 May 2015 16:11:23 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772582142512E@irsmsx105.ger.corp.intel.com>
In-Reply-To: <E115CCD9D858EF4F90C690B0DCB4D8972729A7B7-kPTMFJFq+rEMvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>

Hi Pablo,

> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Wednesday, May 06, 2015 10:36 AM
> To: Ananyev, Konstantin; dev-VfR2kkLFssw@public.gmane.org
> Subject: RE: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the latest available
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Wednesday, May 06, 2015 1:36 AM
> > To: De Lara Guarch, Pablo; dev-VfR2kkLFssw@public.gmane.org
> > Subject: RE: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the latest available
> >
> >
> > Hi Pablo,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] On Behalf Of Pablo de Lara
> > > Sent: Tuesday, May 05, 2015 3:44 PM
> > > To: dev-VfR2kkLFssw@public.gmane.org
> > > Subject: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the latest available
> > >
> > > Jenkins hash function was developed originally in 1996,
> > > and was integrated in first versions of DPDK.
> > > The function has been improved in 2006,
> > > achieving up to 60% better performance, compared to the original one.
> > >
> > > This patch integrates that code into the rte_jhash library.
> > >
> > > Signed-off-by: Pablo de Lara <pablo.de.lara.guarch-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > ---
> > >  lib/librte_hash/rte_jhash.h |  261 +++++++++++++++++++++++++++++++------------
> > >  1 files changed, 188 insertions(+), 73 deletions(-)
> > >
> > > diff --git a/lib/librte_hash/rte_jhash.h b/lib/librte_hash/rte_jhash.h
> > > index a4bf5a1..0e96b7c 100644
> > > --- a/lib/librte_hash/rte_jhash.h
> > > +++ b/lib/librte_hash/rte_jhash.h
> > > @@ -1,7 +1,7 @@
> > >  /*-
> > >   *   BSD LICENSE
> > >   *
> > > - *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > + *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> > >   *   All rights reserved.
> > >   *
> > >   *   Redistribution and use in source and binary forms, with or without
> > > @@ -45,38 +45,68 @@ extern "C" {
> > >  #endif
> > >
> > >  #include <stdint.h>
> > > +#include <string.h>
> > > +#include <rte_byteorder.h>
> > >
> > >  /* jhash.h: Jenkins hash support.
> > >   *
> > > - * Copyright (C) 1996 Bob Jenkins (bob_jenkins-+9kZLMxy4yAbbCauIUEb2A@public.gmane.org)
> > > + * Copyright (C) 2006 Bob Jenkins (bob_jenkins-+9kZLMxy4yAbbCauIUEb2A@public.gmane.org)
> > >   *
> > >   * http://burtleburtle.net/bob/hash/
> > >   *
> > >   * These are the credits from Bob's sources:
> > >   *
> > > - * lookup2.c, by Bob Jenkins, December 1996, Public Domain.
> > > - * hash(), hash2(), hash3, and mix() are externally useful functions.
> > > - * Routines to test the hash are included if SELF_TEST is defined.
> > > - * You can use this free for any purpose.  It has no warranty.
> > > + * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
> > > + *
> > > + * These are functions for producing 32-bit hashes for hash table lookup.
> > > + * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
> > > + * are externally useful functions.  Routines to test the hash are included
> > > + * if SELF_TEST is defined.  You can use this free for any purpose.  It's in
> > > + * the public domain.  It has no warranty.
> > >   *
> > >   * $FreeBSD$
> > >   */
> > >
> > > +#define rot(x, k) (((x) << (k)) | ((x) >> (32-(k))))
> > > +
> > >  /** @internal Internal function. NOTE: Arguments are modified. */
> > >  #define __rte_jhash_mix(a, b, c) do { \
> > > -	a -= b; a -= c; a ^= (c>>13); \
> > > -	b -= c; b -= a; b ^= (a<<8); \
> > > -	c -= a; c -= b; c ^= (b>>13); \
> > > -	a -= b; a -= c; a ^= (c>>12); \
> > > -	b -= c; b -= a; b ^= (a<<16); \
> > > -	c -= a; c -= b; c ^= (b>>5); \
> > > -	a -= b; a -= c; a ^= (c>>3); \
> > > -	b -= c; b -= a; b ^= (a<<10); \
> > > -	c -= a; c -= b; c ^= (b>>15); \
> > > +	a -= c; a ^= rot(c, 4); c += b; \
> > > +	b -= a; b ^= rot(a, 6); a += c; \
> > > +	c -= b; c ^= rot(b, 8); b += a; \
> > > +	a -= c; a ^= rot(c, 16); c += b; \
> > > +	b -= a; b ^= rot(a, 19); a += c; \
> > > +	c -= b; c ^= rot(b, 4); b += a; \
> > > +} while (0)
> > > +
> > > +#define __rte_jhash_final(a, b, c) do { \
> > > +	c ^= b; c -= rot(b, 14); \
> > > +	a ^= c; a -= rot(c, 11); \
> > > +	b ^= a; b -= rot(a, 25); \
> > > +	c ^= b; c -= rot(b, 16); \
> > > +	a ^= c; a -= rot(c, 4);  \
> > > +	b ^= a; b -= rot(a, 14); \
> > > +	c ^= b; c -= rot(b, 24); \
> > >  } while (0)
> > >
> > >  /** The golden ratio: an arbitrary value. */
> > > -#define RTE_JHASH_GOLDEN_RATIO      0x9e3779b9
> > > +#define RTE_JHASH_GOLDEN_RATIO      0xdeadbeef
> > > +
> > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > +#define RTE_JHASH_BYTE0_SHIFT 0
> > > +#define RTE_JHASH_BYTE1_SHIFT 8
> > > +#define RTE_JHASH_BYTE2_SHIFT 16
> > > +#define RTE_JHASH_BYTE3_SHIFT 24
> > > +#else
> > > +#define RTE_JHASH_BYTE0_SHIFT 24
> > > +#define RTE_JHASH_BYTE1_SHIFT 16
> > > +#define RTE_JHASH_BYTE2_SHIFT 8
> > > +#define RTE_JHASH_BYTE3_SHIFT 0
> > > +#endif
> > > +
> > > +#define LOWER8b_MASK rte_le_to_cpu_32(0xff)
> > > +#define LOWER16b_MASK rte_le_to_cpu_32(0xffff)
> > > +#define LOWER24b_MASK rte_le_to_cpu_32(0xffffff)
> > >
> > >  /**
> > >   * The most generic version, hashes an arbitrary sequence
> > > @@ -95,42 +125,119 @@ extern "C" {
> > >  static inline uint32_t
> > >  rte_jhash(const void *key, uint32_t length, uint32_t initval)
> > >  {
> > > -	uint32_t a, b, c, len;
> > > -	const uint8_t *k = (const uint8_t *)key;
> > > -	const uint32_t *k32 = (const uint32_t *)key;
> > > +	uint32_t a, b, c;
> > > +	union {
> > > +		const void *ptr;
> > > +		size_t i;
> > > +	} u;
> > >
> > > -	len = length;
> > > -	a = b = RTE_JHASH_GOLDEN_RATIO;
> > > -	c = initval;
> > > +	/* Set up the internal state */
> > > +	a = b = c = RTE_JHASH_GOLDEN_RATIO + ((uint32_t)length) + initval;
> > >
> > > -	while (len >= 12) {
> > > -		a += k32[0];
> > > -		b += k32[1];
> > > -		c += k32[2];
> > > +	u.ptr = key;
> > >
> > > -		__rte_jhash_mix(a,b,c);
> > > +	/* Check key alignment. For x86 architecture, first case is always optimal */
> > > +	if (!strcmp(RTE_ARCH,"x86_64") || !strcmp(RTE_ARCH,"i686") || (u.i & 0x3) == 0) {
> >
> > Wonder why strcmp(), why not something like: 'if defined(RTE_ARCH_I686)
> > || defined(RTE_ARCH_X86_64)' as in all other places?
> > Another question what would be in case of RTE_ARCH="x86_x32"?
> > Konstantin
> 
> Functionally it is the same, and using this method I can integrate all conditions in one line, so it takes less code.
> I also checked the assembly code, and the compiler removes the check on Intel architecture, so performance remains the same.

Well, yes, I think most modern compilers treat strcmp() as a builtin function and are able to optimise these strcmp() calls away in that case.
But we probably can't guarantee that this will always be the case for every compiler/libc combination.
Also, for some reason a user might need to use the '-fno-builtin' flag while building their code.
So I would use preprocessor macros here; that is more predictable.
It is also consistent with other places.
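
To illustrate the preprocessor approach, here is a minimal sketch. RTE_ARCH_I686 and RTE_ARCH_X86_64 are the macros named in the quoted comment above; JHASH_X86_TARGET and jhash_can_read_words() are hypothetical names used only for this example and are not part of the patch.

```c
#include <stdint.h>

/*
 * The architecture test is resolved by the preprocessor, so the result
 * does not depend on the compiler folding strcmp() calls (for example
 * when building with -fno-builtin).
 * JHASH_X86_TARGET is an illustrative name, not an existing DPDK macro.
 */
#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
#define JHASH_X86_TARGET 1
#else
#define JHASH_X86_TARGET 0
#endif

/* Hypothetical helper: non-zero if the word-wise path may be used. */
static inline int
jhash_can_read_words(const void *key)
{
	/* On x86 the word-wise path is always taken; on other
	 * targets only when the key is 4-byte aligned. */
	return JHASH_X86_TARGET || ((uintptr_t)key & 0x3) == 0;
}
```

When neither macro is defined, the helper reduces to the plain alignment check.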
 
Actually, I wonder whether you really need that sort of split between the aligned and non-aligned cases.
Wouldn't something like this work for you:

#ifndef RTE_ARCH_X86
        /* round the key pointer down to a 4-byte boundary, keep the bit offset */
        const uint32_t *k = (uint32_t *)((uintptr_t)key & (uintptr_t)~3);
        const uint32_t s = ((uintptr_t)key & 3) * CHAR_BIT;
#else /*X86*/
        const uint32_t *k = key;
        const uint32_t s = 0;
#endif

        while (length > 12) {
                a += k[0] >> s | (uint64_t)k[1] << (32 - s);
                b += k[1] >> s | (uint64_t)k[2] << (32 - s);
                c += k[2] >> s | (uint64_t)k[3] << (32 - s);
                k += 3;
                length -= 12;
        }

        switch (length) {
        case 12:
                a += k[0] >> s | (uint64_t)k[1] << (32 - s);
                b += k[1] >> s | (uint64_t)k[2] << (32 - s);
                c += k[2] >> s | (uint64_t)k[3] << (32 - s);
                break;
        case 11:
                a += k[0] >> s | (uint64_t)k[1] << (32 - s);
                b += k[1] >> s | (uint64_t)k[2] << (32 - s);
                c += (k[2] >> s | (uint64_t)k[3] << (32 - s)) & LOWER24b_MASK;
                break;
        ...
        case 1:
                a += (k[0] >> s | (uint64_t)k[1] << (32 - s)) & LOWER8b_MASK;
                break;
        ...

That way, even for the non-aligned case you don't need to do 4B reads.
For x86, the compiler would do its optimisation work and strip off '>> s | (uint64_t)k[..] << (32 - s);'.
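
The shift-and-merge read used in the snippet above can be checked in isolation. The following is a sketch only, assuming a little-endian host; read_word_shifted() and read_word_memcpy() are hypothetical helper names, and the caller is assumed to guarantee that the word after the aligned base is readable.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fetch the 32-bit value that starts `off` bytes into `words`, using
 * only aligned 32-bit loads. Assumes a little-endian host and that
 * words[off / 4 + 1] may be read.
 */
static inline uint32_t
read_word_shifted(const uint32_t *words, size_t off)
{
	assert(words != NULL);

	const uint32_t *k = words + off / 4;               /* aligned base word */
	const uint32_t s = (uint32_t)(off % 4) * CHAR_BIT; /* 0, 8, 16 or 24 */

	/* The uint64_t cast keeps the left shift well defined when s == 0
	 * (a shift by 32); the result is truncated back to 32 bits. */
	return (uint32_t)(k[0] >> s | (uint64_t)k[1] << (32 - s));
}

/* Reference implementation: the same value fetched with a byte copy. */
static inline uint32_t
read_word_memcpy(const uint32_t *words, size_t off)
{
	uint32_t v;

	memcpy(&v, (const uint8_t *)words + off, sizeof(v));
	return v;
}
```

On a little-endian machine both helpers should return the same value for every byte offset, which is what lets the aligned-only path stand in for an unaligned load. With s fixed to 0, as in the x86 branch, the expression collapses to k[0].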

> 
> Re x86_x32, you are right, probably I need to include it. Although, I just realized that it is not used in any other place.
> Wonder if we should include it somewhere else? E.g. rte_hash_crc.h

Yep, that's true, we are not doing it for hash_crc either...
It would probably be good to have some sort of 'RTE_ARCH_X86' that would be defined for all x86 targets, and to use it wherever applicable.
But I suppose that's a subject for another patch.
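
For illustration only, such an umbrella macro could look roughly like the sketch below. RTE_ARCH_X86_X32 is an assumed name for an x32 target macro, and arch_is_x86() is a hypothetical helper; neither is taken from the existing code.

```c
/*
 * Define one umbrella macro for every x86 flavour, so that callers test
 * a single name. RTE_ARCH_X86_X32 is an assumed macro name for the x32
 * ABI target; the other two are the ones mentioned in this thread.
 */
#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64) || \
	defined(RTE_ARCH_X86_X32)
#define RTE_ARCH_X86 1
#endif

/* Hypothetical helper showing how code would consume the macro. */
static inline int
arch_is_x86(void)
{
#ifdef RTE_ARCH_X86
	return 1;
#else
	return 0;
#endif
}
```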

Konstantin
