* [PATCH 0/1] buildstats.bbclass: add functionality to collect
From: Sakib Sajal @ 2020-10-23 16:56 UTC
  To: openembedded-core

This functionality allows users to log host stats at a regular
interval and/or upon task failure.

Initial implementation design was to predefine a list of commands
to collect host system stats.

Obstacles:
The TaskFailed event handler runs in a recipe-specific environment,
i.e., the PATH variable does not point to host tools. Moreover, a
predefined list of tools is rigid, and many of the tools may not be
available on the host system.

Solution:
Allow users to specify each tool by its absolute path, along with the
desired options. The onus is on users to make sure the tools exist and
that each command runs to completion and exits.
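
As a rough illustration of the approach described above, the following
Python sketch splits a ';'-delimited command string and checks that each
entry names an absolute, executable file. The function names
parse_host_stat_cmds and is_runnable are hypothetical. Using shlex.split
is also an assumption, made so that quoted options survive; the patch
itself uses a plain str.split().

```python
import os
import shlex


def parse_host_stat_cmds(raw):
    """Split a ';'-delimited command string into argv lists.

    Empty and whitespace-only entries are skipped, and a missing
    (None) value is treated as "no commands configured".
    """
    commands = []
    for chunk in (raw or "").split(";"):
        argv = shlex.split(chunk)
        if argv:
            commands.append(argv)
    return commands


def is_runnable(argv):
    """Return True if argv[0] is an absolute path to an executable file."""
    exe = argv[0]
    return os.path.isabs(exe) and os.path.isfile(exe) and os.access(exe, os.X_OK)


if __name__ == "__main__":
    # Hypothetical value, in the shape the cover letter describes.
    for argv in parse_host_stat_cmds("/usr/bin/top -b -n 1 ; ; /bin/ps aux ;"):
        print(argv, "runnable" if is_runnable(argv) else "not runnable")
```

Skipping whitespace-only entries matters because a value such as
"a ; ; b" would otherwise yield an empty argument list.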

Building core-image-minimal with an interval of 10s produced a log file
of about 15 MB and increased build time by 1-2%. The core-image-minimal
tmp-glibc directory is ~20 GB, so the extra log size is trivial by
comparison.

Grepping the log file for specific tokens gives a useful indication of
how the system resources are being used.
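
A minimal sketch of that kind of search, assuming the log layout written
by the patch (an "Event Time: <float>" line starting each sample). The
function name grep_host_stats and the sample lines below are invented
purely for illustration and are not real tool output.

```python
def grep_host_stats(lines, token):
    """Return (event_time, line) pairs for every line containing token.

    event_time is taken from the most recent "Event Time: " header,
    or None if no header has been seen yet.
    """
    prefix = "Event Time: "
    hits = []
    event_time = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(prefix):
            event_time = float(line[len(prefix):])
            continue
        if token in line:
            hits.append((event_time, line))
    return hits


# Invented sample in the shape the patch writes; not real output.
sample = [
    "Event Time: 1603472160.000000\n",
    "Date: 2020-10-23 16:56:00.000000\n",
    "/path/to/tool --opt\n",
    "MemFree: 100\n",
    "Event Time: 1603472170.000000\n",
    "Date: 2020-10-23 16:56:10.000000\n",
    "/path/to/tool --opt\n",
    "MemFree: 80\n",
]

if __name__ == "__main__":
    for when, text in grep_host_stats(sample, "MemFree"):
        print(when, text)
```

To scan a real log, the same function can be given an open file object
in place of the sample list.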

To Do:
1) Do selftests need to be added for this functionality?
2) Documentation about the usage.

Sakib Sajal (1):
  buildstats.bbclass: add functionality to collect build system stats

 meta/classes/buildstats.bbclass | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

-- 
2.26.2



* [PATCH 1/1] buildstats.bbclass: add functionality to collect build system stats
From: Sakib Sajal @ 2020-10-23 16:56 UTC
  To: openembedded-core

There are a number of timeout and hang defects where
it would be useful to collect statistics about what
is running on a build host when that condition occurs.

This adds functionality to collect build system stats
on a regular interval and/or on task failure. Both
features are disabled by default.

To enable logging on a regular interval, set:
BB_HEARTBEAT_EVENT = "<interval>"
Logs are stored in ${BUILDSTATS_BASE}/<build_name>/host_stats

To enable logging on a task failure, set:
BB_LOG_HOST_STAT_ON_FAILURE = "1"
Logs are stored in ${BUILDSTATS_BASE}/<build_name>/build_stats

The list of commands, along with the desired options, needs
to be specified in the BB_LOG_HOST_STAT_CMDS variable,
delimited by ';', like so:
BB_LOG_HOST_STAT_CMDS = "/<absolute>/<path>/<executable> <options> ; ... ;"

Signed-off-by: Sakib Sajal <sakib.sajal@windriver.com>
---
 meta/classes/buildstats.bbclass | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
index 6f87187233..c68d7bb8a2 100644
--- a/meta/classes/buildstats.bbclass
+++ b/meta/classes/buildstats.bbclass
@@ -104,14 +104,38 @@ def write_task_data(status, logfile, e, d):
             f.write("Status: FAILED \n")
         f.write("Ended: %0.2f \n" % e.time)
 
+def write_host_data(logfile, e, d):
+    import subprocess, os, datetime
+    cmds = d.getVar('BB_LOG_HOST_STAT_CMDS').split(";")
+    with open(logfile, "a") as f:
+        f.write("Event Time: %f\nDate: %s\n" % (e.time, datetime.datetime.now()))
+        for cmd in cmds:
+            if len(cmd) == 0:
+                continue
+            c = cmd.split()
+            if os.path.isfile(c[0]) and os.access(c[0], os.X_OK):
+                try:
+                    output = subprocess.check_output(c, stderr=subprocess.STDOUT).decode('utf-8')
+                except subprocess.CalledProcessError as err:
+                    output = "Error running command: %s\n%s" % (cmd, err)
+                f.write("%s\n%s\n" % (cmd, output))
+            else:
+                f.write("Error running command: '%s': %s is not an executable.\n" % (cmd, c[0]))
+
 python run_buildstats () {
     import bb.build
     import bb.event
     import time, subprocess, platform
 
     bn = d.getVar('BUILDNAME')
-    bsdir = os.path.join(d.getVar('BUILDSTATS_BASE'), bn)
-    taskdir = os.path.join(bsdir, d.getVar('PF'))
+    # bitbake fires HeartbeatEvent even before a build has been
+    # triggered, causing BUILDNAME to be None
+    if bn is not None:
+        bsdir = os.path.join(d.getVar('BUILDSTATS_BASE'), bn)
+        taskdir = os.path.join(bsdir, d.getVar('PF'))
+        if isinstance(e, bb.event.HeartbeatEvent):
+            bb.utils.mkdirhier(bsdir)
+            write_host_data(os.path.join(bsdir, "host_stats"), e, d)
 
     if isinstance(e, bb.event.BuildStarted):
         ########################################################################
@@ -186,10 +210,12 @@ python run_buildstats () {
         build_status = os.path.join(bsdir, "build_stats")
         with open(build_status, "a") as f:
             f.write(d.expand("Failed at: ${PF} at task: %s \n" % e.task))
+            if bb.utils.to_boolean(d.getVar("BB_LOG_HOST_STAT_ON_FAILURE")):
+                write_host_data(build_status, e, d)
 }
 
 addhandler run_buildstats
-run_buildstats[eventmask] = "bb.event.BuildStarted bb.event.BuildCompleted bb.build.TaskStarted bb.build.TaskSucceeded bb.build.TaskFailed"
+run_buildstats[eventmask] = "bb.event.BuildStarted bb.event.BuildCompleted bb.event.HeartbeatEvent bb.build.TaskStarted bb.build.TaskSucceeded bb.build.TaskFailed"
 
 python runqueue_stats () {
     import buildstats
-- 
2.27.0



* Re: [OE-core] [PATCH 1/1] buildstats.bbclass: add functionality to collect build system stats
From: Christopher Larson @ 2020-10-23 17:25 UTC
  To: Sakib Sajal; +Cc: Patches and discussions about the oe-core layer


Testing for isfile() and +x means that commands in that variable cannot
be found via the PATH; only full absolute paths will work. Is that really
the intention? It's inconsistent with every other such variable, and it
also means you can't define bitbake functions to be run instead.

-- 
Christopher Larson
kergoth at gmail dot com
Founder - BitBake, OpenEmbedded, OpenZaurus
Senior Software Engineer, Mentor Graphics


* Re: [OE-core] [PATCH 1/1] buildstats.bbclass: add functionality to collect build system stats
From: Sakib Sajal @ 2020-10-23 18:58 UTC
  To: Christopher Larson; +Cc: Patches and discussions about the oe-core layer


On 2020-10-23 1:25 p.m., Christopher Larson wrote:
> Testing for isfile() and +x will mean you can't run commands via the PATH
> via that commands variable, only full absolute paths. is that really the
> intention? It's inconsistent with every other such variable, and also means
> you can't define bitbake functions to be run instead.

Hi Christopher,

That is the intended usage. We need to break out of the
recipe-specific sysroot, and I expect the tools being run would not
typically be built as -native. See:

https://lists.openembedded.org/g/openembedded-core/message/143717

This is still a work in progress, and any feedback or suggestions are
welcome!

Sakib


* Re: [OE-core] [PATCH 1/1] buildstats.bbclass: add functionality to collect build system stats
From: Richard Purdie @ 2020-10-28 14:27 UTC
  To: Sakib Sajal, openembedded-core

On Fri, 2020-10-23 at 12:56 -0400, Sakib Sajal wrote:
> There are a number of timeout and hang defects where
> it would be useful to collect statistics about what
> is running on a build host when that condition occurs.
> 
> This adds functionality to collect build system stats
> on a regular interval and/or on task failure. Both
> features are disabled by default.
> 
> To enable logging on a regular interval, set:
> BB_HEARTBEAT_EVENT = "<interval>"
> Logs are stored in ${BUILDSTATS_BASE}/<build_name>/host_stats
> 
> To enable logging on a task failure, set:
> BB_LOG_HOST_STAT_ON_FAILURE = "1"
> Logs are stored in ${BUILDSTATS_BASE}/<build_name>/build_stats
> 
> The list of commands, along with the desired options, need
> to be specified in the BB_LOG_HOST_STAT_CMDS variable
> delimited by ; as such:
> BB_LOG_HOST_STAT_CMDS = "/<absolute>/<path>/<executable> <options> ; ... ;"
> 
> Signed-off-by: Sakib Sajal <sakib.sajal@windriver.com>
> ---
>  meta/classes/buildstats.bbclass | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
> index 6f87187233..c68d7bb8a2 100644
> --- a/meta/classes/buildstats.bbclass
> +++ b/meta/classes/buildstats.bbclass
> @@ -104,14 +104,38 @@ def write_task_data(status, logfile, e, d):
>              f.write("Status: FAILED \n")
>          f.write("Ended: %0.2f \n" % e.time)
>  
> +def write_host_data(logfile, e, d):
> +    import subprocess, os, datetime
> +    cmds = d.getVar('BB_LOG_HOST_STAT_CMDS').split(";")
> +    with open(logfile, "a") as f:
> +        f.write("Event Time: %f\nDate: %s\n" % (e.time, datetime.datetime.now()))
> +        for cmd in cmds:
> +            if len(cmd) == 0:
> +                continue
> +            c = cmd.split()
> +            if os.path.isfile(c[0]) and os.access(c[0], os.X_OK):
> +                try:
> +                    output = subprocess.check_output(c, stderr=subprocess.STDOUT).decode('utf-8')
> +                except subprocess.CalledProcessError as err:
> +                    output = "Error running command: %s\n%s" % (cmd, err)
> +                f.write("%s\n%s\n" % (cmd, output))
> +            else:
> +                f.write("Error running command: '%s': %s is not an executable.\n" % (cmd, c[0]))
> +


I am a little worried about this for some of the reasons Chris
mentions. I worry that not all distros will have a standard location
for some of the tools we want to run.

One trick you could try is to use something like: 

path = d.getVar("PATH") + ":" + d.getVar("BB_ORIGENV", False).getVar("PATH")

which means we'd add back in the original search PATH for the tools as
well as our own directories.
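
A small sketch of how such a combined search path might be used to look
up a tool by name, written as standalone Python rather than bbclass code.
The function name resolve_tool is hypothetical, and its build_path and
orig_path arguments are plain strings standing in for the two PATH values
fetched from the datastore in the line above.

```python
import os
import shutil


def resolve_tool(name, build_path, orig_path):
    """Look up name on build_path first, then on the original host PATH.

    Returns the full path of the first executable match, or None if
    the tool is not found in either search path.
    """
    search_path = build_path + os.pathsep + orig_path
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Hypothetical build-time PATH; the host PATH comes from the
    # environment this script runs in.
    host_path = os.environ.get("PATH", "")
    print(resolve_tool("ps", "/hypothetical/build/bin", host_path))
```

Because the build directories come first in the combined string, a tool
present there would take precedence over the host copy.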

Cheers,

Richard


