Discussion:
Parallel building of Linux Kernel is broken
Masahiro Yamada
2018-07-12 01:26:48 UTC
Hello GNU Make folks,


I am a Linux kernel developer.



I think the following commit broke
parallel builds of the Linux kernel.


commit 2b8e3bb23f96c2458818f011593557d3353dade3
Author: Paul Smith <***@gnu.org>
Date: Mon Jan 2 14:08:54 2017 -0500

Clean up close-on-exec, particularly with jobserver pipes.





How to reproduce the problem
----------------------------


You can get the Linux kernel source code by

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git checkout v4.17


At the top of the Linux kernel source tree, run

$ make -j8 defconfig all



This worked fine with GNU Make prior to that commit.



If you use GNU Make with that commit, the warning
'jobserver unavailable: using -j1. Add '+' to parent make rule'
is displayed, and then the whole build process of the Linux kernel
is serialized.




$ make -j8 defconfig all
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
YACC scripts/kconfig/zconf.tab.c
LEX scripts/kconfig/zconf.lex.c
HOSTCC scripts/kconfig/zconf.tab.o
HOSTLD scripts/kconfig/conf
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
scripts/kconfig/conf --syncconfig Kconfig
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
SYSTBL arch/x86/include/generated/asm/syscalls_32.h



The commit subject says 'Clean up', but
I think this is accidental breakage.

Any clue?
--
Best Regards
Masahiro Yamada
Paul Smith
2018-07-12 04:37:27 UTC
Post by Masahiro Yamada
$ make -j8 defconfig all
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
YACC scripts/kconfig/zconf.tab.c
LEX scripts/kconfig/zconf.lex.c
HOSTCC scripts/kconfig/zconf.tab.o
HOSTLD scripts/kconfig/conf
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
scripts/kconfig/conf --syncconfig Kconfig
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
Someone will need to investigate this further as the information we
need has been elided. Is there a way to invoke the Linux kernel
makefile with a "verbose" mode that doesn't use these shorthand outputs
for commands but instead actually shows the commands that recipes ask
make to invoke? That's what we need to see; the actual command that
was used to recursively invoke make.

This warning means that a sub-make has been invoked but that the
makefile invoking it didn't realize it was a sub-make: that was hidden
from make by not using the $(MAKE) variable in the recipe.

In that case, make recommends that the makefile rule be prefixed with
the '+' character to tell make that the rule is a recursive invocation.

Unless make realizes that the recipe it runs will invoke a sub-make, it
can't properly prepare the environment for that sub-make.
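
The check behind this warning can be sketched in C. This is a minimal illustration, not GNU make's own code: the function name jobserver_fds_usable is hypothetical, and it assumes the jobserver is advertised as a descriptor pair in the form "--jobserver-auth=R,W" inside MAKEFLAGS, as seen in this thread. A sub-make that finds the option but cannot use the descriptors (because the parent closed them) would fall back to -j1.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not GNU make's own code.
 * Looks for "--jobserver-auth=R,W" in a MAKEFLAGS-style string.
 * Returns  1 if both descriptors are open in this process,
 *          0 if the option is present but the descriptors are unusable,
 *         -1 if the option is absent. */
int jobserver_fds_usable(const char *makeflags, int *rfd, int *wfd)
{
    static const char opt[] = "--jobserver-auth=";
    const char *p = makeflags ? strstr(makeflags, opt) : NULL;

    if (p == NULL)
        return -1;

    /* Only the "R,W" descriptor-pair form is handled in this sketch. */
    if (sscanf(p + strlen(opt), "%d,%d", rfd, wfd) != 2)
        return 0;

    /* fcntl(F_GETFD) fails (EBADF) when the descriptor is not open. */
    if (fcntl(*rfd, F_GETFD) == -1 || fcntl(*wfd, F_GETFD) == -1)
        return 0;

    return 1;
}
```

A caller would typically pass getenv("MAKEFLAGS"); a result of 0 corresponds to the "jobserver unavailable" situation described above.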

I'm somewhat surprised that the build was not seeing this error before,
because it definitely was previously emitted: that error message has
been around for a long time.

As mentioned I'd need to see the actual command line invocation in
order to understand why this might have happened.
Masahiro Yamada
2018-07-12 04:55:51 UTC
Hi.
Post by Paul Smith
Post by Masahiro Yamada
$ make -j8 defconfig all
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
YACC scripts/kconfig/zconf.tab.c
LEX scripts/kconfig/zconf.lex.c
HOSTCC scripts/kconfig/zconf.tab.o
HOSTLD scripts/kconfig/conf
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
scripts/kconfig/conf --syncconfig Kconfig
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
Someone will need to investigate this further as the information we
need has been elided. Is there a way to invoke the Linux kernel
makefile with a "verbose" mode that doesn't use these shorthand outputs
for commands but instead actually shows the commands that recipes ask
make to invoke?
Please add V=1 to the build command, like this:

$ make -j8 V=1 defconfig all


It will emit an _almost_ raw log.
Post by Paul Smith
That's what we need to see; the actual command that
was used to recursively invoke make.
The Linux Makefile hides "Entering directory ..." messages.

If you want to see them,
please comment out the following line in the top-level Makefile.

MAKEFLAGS += --no-print-directory


Thanks,
--
Best Regards
Masahiro Yamada
Paul Smith
2018-07-12 05:13:19 UTC
Post by Masahiro Yamada
$ make -j8 V=1 defconfig all
It will emit _almost_ raw log.
At this time I don't have the resources to do this. It will take me
some time if you're going to wait for me.
Post by Masahiro Yamada
Post by Paul Smith
That's what we need to see; the actual command that
was used to recursively invoke make.
Linux Makefile hides "Entering directory ..." messages.
If you want to see them,
please comment out the following line of the top Makefile.
MAKEFLAGS += --no-print-directory
I'm not that interested in seeing these but it wouldn't hurt. Mainly I
need to see the manner in which the recursive make is invoked and the
recipes used to do so.
Masahiro Yamada
2018-07-12 05:51:02 UTC
Hi.
Post by Paul Smith
Post by Masahiro Yamada
$ make -j8 V=1 defconfig all
It will emit _almost_ raw log.
At this time I don't have the resources to do this. It will take me
some time if you're going to wait for me.
No problem. No rush here.

The Linux build system is pretty complex.


I attached information that might be helpful.
Please take it FWIW.
Post by Paul Smith
Post by Masahiro Yamada
Post by Paul Smith
That's what we need to see; the actual command that
was used to recursively invoke make.
Linux Makefile hides "Entering directory ..." messages.
If you want to see them,
please comment out the following line of the top Makefile.
MAKEFLAGS += --no-print-directory
I'm not that interested in seeing these but it wouldn't hurt. Mainly I
need to see the manner in which the recursive make is invoked and the
recipes used to do so.
OK, then V=1 is good enough
to show what you want.






How the Linux Makefile works
----------------------------


Linux needs the configuration before building any objects.


"make defconfig" configures the build setting
suitable for general usage.

"make all" actually builds the kernel.



"make defconfig all" is a useful shorthand
to do the two in a row.


[1] The user runs "make -j8 defconfig all"

[2] The configuration must finish before the Makefile
starts building any objects.

The following code divides "make defconfig all"
into two sub-makes, "make defconfig" and "make all".
These two are run one after the other.

__build_one_by_one:
	$(Q)set -e; \
	for i in $(MAKECMDGOALS); do \
		$(MAKE) -f $(srctree)/Makefile $$i; \
	done



[3] Sub-make "make defconfig" will find a recipe here

%config: scripts_basic outputmakefile FORCE
	$(Q)$(MAKE) $(build)=scripts/kconfig $@


[4] Sub-make "make all" starts, but top Makefile
needs to load include/config/auto.conf.

ifeq ($(dot-config),1)
-include include/config/auto.conf
endif


[5] A recipe to generate include/config/auto.conf
is found here.

include/config/%.conf: $(KCONFIG_CONFIG) include/config/auto.conf.cmd
	$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig


[6] include/config/auto.conf has been generated,
so "make all" restarts.


I think the latest GNU Make emits
'jobserver unavailable: using -j1. Add '+' to parent make rule'
just before "make all" restarts.



[7] '-j8 --jobserver-auth' disappears from MAKEFLAGS,
and the rest of the build process is serialized.
--
Best Regards
Masahiro Yamada
Paul Smith
2018-07-12 11:26:40 UTC
Post by Masahiro Yamada
I attached information that might be helpful.
Please take it FWIW.
The content you quote looks correct to me so if that's what's really in
the makefiles then the problem is a deeper mystery.

Can you clarify what version of GNU make you're using and how you
obtained it? I believe that the change you mention is not available in
any released version of GNU make, yet.
Masahiro Yamada
2018-07-12 11:55:48 UTC
Post by Paul Smith
Post by Masahiro Yamada
I attached information that might be helpful.
Please take it FWIW.
The content you quote looks correct to me so if that's what's really in
the makefiles then the problem is a deeper mystery.
Can you clarify what version of GNU make you're using and how you
obtained it? I believe that the change you mention is not available in
any released version of GNU make, yet.
The latest release is GNU Make 4.2.1,
which is fine with me.


The problem is in the state-of-the-art version in git.

I built Make from the git tree, like this:

$ git checkout 2b8e3bb23f96c2458818f011593557d3353dade3
$ autoreconf -i
$ ./configure
$ make update
$ make
$ make install
--
Best Regards
Masahiro Yamada
Masahiro Yamada
2018-08-09 02:47:50 UTC
Hello.
Post by Masahiro Yamada
Post by Paul Smith
Post by Masahiro Yamada
I attached information that might be helpful.
Please take it FWIW.
The content you quote looks correct to me so if that's what's really in
the makefiles then the problem is a deeper mystery.
Can you clarify what version of GNU make you're using and how you
obtained it? I believe that the change you mention is not available in
any released version of GNU make, yet.
Any news about this?


I tested the latest git version:

commit a1bb739165a944769cbb4a6e4f027ac9c2587122
Author: Paul Smith <***@gnu.org>
Date: Sat Aug 4 19:20:58 2018 -0400

* NEWS: Update for the latest changes.




I still see the same problem when building Linux kernel
with -j option.


Thanks.
--
Best Regards
Masahiro Yamada
Masahiro Yamada
2018-09-10 08:16:32 UTC
Hello.


It seems there has been no further feedback on this regression report.

OK, the Linux kernel build system is too complicated.
So, I have come back with a much simpler test case.

Here, the test case is only 2 makefiles, fewer than 50 lines.


Please take a look at this problem.

As I already reported, the git-bisect points to

commit 2b8e3bb23f96c2458818f011593557d3353dade3
Author: Paul Smith <***@gnu.org>
Date: Mon Jan 2 14:08:54 2017 -0500

Clean up close-on-exec, particularly with jobserver pipes.




I attached the test case below.

For convenience, this test-case is available from my GitHub repository as well:
https://github.com/masahir0y/make-testcase



[Test Case]

----------------------(Makefile)-------------------------------

# If MAKECMDGOALS contains two or more targets, handle them one by one.
ifneq ($(word 2,$(MAKECMDGOALS)),)
PHONY += $(MAKECMDGOALS) __build_one_by_one

$(filter-out __build_one_by_one, $(MAKECMDGOALS)): __build_one_by_one
	@:

__build_one_by_one:
	set -e; \
	for i in $(MAKECMDGOALS); do \
		$(MAKE) -f Makefile $$i; \
	done

else

ifeq ($(MAKECMDGOALS),config)

config: FORCE
	touch .config

else

include auto.conf

PHONY += all
all:
	echo all

auto.conf: .config
	$(MAKE) -f Makefile.config syncconfig

endif
endif

PHONY += FORCE
FORCE:

.PHONY: $(PHONY)
----------------------(Makefile END)---------------------------

----------------------(Makefile.config)---------------------------
syncconfig:
	touch auto.conf
----------------------(Makefile.config END)---------------------------



[How to run the test case?]

$ make -j8 config all



Thanks.
--
Best Regards
Masahiro Yamada
Mike Shal
2018-09-10 16:24:21 UTC
It looks like the patch in question changed the default state of the
jobserver tokens to be not inherited. Since this Makefile generates an
included file (auto.conf), the 'make all' invocation creates that file and
then re-invokes itself. However, the re-invoking of make happens in main.c
instead of posixos.c, and is not wrapped in jobserver_pre_child/post_child
which are responsible for updating the jobserver fd inheritance. I'd guess
main.c also needs to call fd_inherit on the jobserver tokens (or use
jobserver_pre_child?) before the re-invocation call. The following hack
seems to fix the issue:

diff --git a/main.c b/main.c
index 5dd539b..83f30f8 100644
--- a/main.c
+++ b/main.c
@@ -2446,7 +2446,9 @@ main (int argc, char **argv, char **envp)
           if (stack_limit.rlim_cur)
             setrlimit (RLIMIT_STACK, &stack_limit);
 #endif
+          jobserver_pre_child(1);
           exec_command ((char **)nargv, environ);
+          jobserver_post_child(1);
 #endif
           free (aargv);
           break;
-Mike
Masahiro Yamada
2018-09-11 01:39:57 UTC
Hello Mike,
Post by Mike Shal
It looks like the patch in question changed the default state of the
jobserver tokens to be not inherited. Since this Makefile generates an
included file (auto.conf), the 'make all' invocation creates that file and
then re-invokes itself. However, the re-invoking of make happens in main.c
instead of posixos.c, and is not wrapped in jobserver_pre_child/post_child
which are responsible for updating the jobserver fd inheritance. I'd guess
main.c also needs to call fd_inherit on the jobserver tokens (or use
jobserver_pre_child?) before the re-invocation call. The following hack
diff --git a/main.c b/main.c
index 5dd539b..83f30f8 100644
--- a/main.c
+++ b/main.c
@@ -2446,7 +2446,9 @@ main (int argc, char **argv, char **envp)
if (stack_limit.rlim_cur)
setrlimit (RLIMIT_STACK, &stack_limit);
#endif
+ jobserver_pre_child(1);
exec_command ((char **)nargv, environ);
+ jobserver_post_child(1);
#endif
free (aargv);
break;
-Mike
Yes, with this change,
I can build Linux kernel in parallel now.

Thank you!
--
Best Regards
Masahiro Yamada
Paul Smith
2018-09-15 20:20:39 UTC
Post by Masahiro Yamada
Yes, with this change, I can build Linux kernel in parallel now.
I've pushed a similar change along with a regression test to the Git
repository.

Thanks for your efforts tracking this down! It was certainly an
obscure situation.
