The Bazel build part of tensorflow builds can really use as many cores
as are available. To avoid blocking a small machine for literally hours
while it churns away, mark the build as big-parallel so we can schedule
it to run quickly on a machine with lots of cores.
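As a rough illustration, marking a derivation this way could look like the fragment below. requiredSystemFeatures is the standard derivation attribute; the package name and the trivial install step are placeholders, not the actual tensorflow expression.

```nix
# Minimal sketch, not the real tensorflow derivation.
{ stdenv }:

stdenv.mkDerivation {
  pname = "example-bazel-build"; # hypothetical name
  version = "0.0.0";
  dontUnpack = true;

  # Only schedule this build on machines that advertise big-parallel
  requiredSystemFeatures = [ "big-parallel" ];

  installPhase = "touch $out";
}
```

A builder only picks this up if big-parallel is listed among its supported system features.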
Use the attribute mpi to provide a system-wide default MPI
implementation. The default is openmpi (as before).
This now allows overriding the MPI implementation by using
the overlay mechanism. Building all packages with mpich instead
of the default openmpi can now be achieved like this:
self: super:
{
  mpi = super.mpich;
}
All derivations that have been using "mpi ? null" to provide optional
building with MPI have been changed in the following way to allow for
optional builds with MPI:
{ ...
, mpi
, useMpi ? false
}
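A package following this pattern might use the two arguments roughly as sketched here. The package name is hypothetical and the derivation is reduced to the bare minimum; only the argument names mpi and useMpi come from the change described above.

```nix
# Sketch of a package that opts in to MPI; "example-solver" is hypothetical.
{ lib, stdenv, mpi, useMpi ? false }:

stdenv.mkDerivation {
  pname = "example-solver";
  version = "0.0.0";
  dontUnpack = true;

  # mpi is always passed in, but only becomes a dependency when useMpi is true
  buildInputs = lib.optional useMpi mpi;

  installPhase = "touch $out";
}
```

With this shape, a user would enable MPI per package with something like example-solver.override { useMpi = true; }, while the overlay decides which implementation mpi refers to.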
Also the following related changes:
* Removed Python 2 support because TF and related packages have not supported it for a long time.
* Upgraded tensorboard and estimator packages to the required versions.
* Added extra plugins for tensorboard to support profiling.
* In the previous derivation versions, TF_SYSTEM_LIBS didn't have any effect because it was reset at repo fetching stage, so TF always used its own dependencies. Made TF_SYSTEM_LIBS actually work and fixed the errors caused by enabling it.
* Enabled tensorboard by default (but still keeping an option to disable it if needed).
Also:
- patch to remove scipy requirement
- add cuda to RPATH
- don’t include nvidia_x11 (This isn’t needed, we can get it from
/run/opengl-driver being in the RPATH.)
Co-authored-by: Arnout Engelen <arnout@bzzt.net>
Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
This is done by default by the go/rust/bazel builders and allows scripts/tools/users
to inspect the dependencies; since tensorflow is wrapped as a python package, we
should pass this through for consistency.
Flat hashes can be substituted through hashed-mirrors, while recursive
hashes can’t. This is especially important for Bazel since the bazel
fetch dependencies can come from multiple different methods (git,
http, ftp, etc.). To do this, we create tar archives from the
output/external directory, which are then extracted for the build. All
of the Bazel hashes are updated.
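The idea can be sketched as a fixed-output derivation whose output is a single archive. This is an illustration under assumptions, not the real Bazel builder: the derivation name and the placeholder fetch step are made up, and the hash is a dummy value.

```nix
# Sketch of a flat-hashed dependency archive; not the actual builder code.
{ lib, stdenv }:

stdenv.mkDerivation {
  name = "example-bazel-deps.tar.gz"; # hypothetical name
  dontUnpack = true;
  dontFixup = true;

  buildPhase = ''
    # Stand-in for the fetch step that would populate output/external
    mkdir -p output/external
    echo placeholder > output/external/dep.txt
  '';

  installPhase = ''
    # Fix ordering, timestamps and ownership so the archive is reproducible
    tar czf $out --sort=name --mtime='@1' --owner=0 --group=0 --numeric-owner \
      -C output external
  '';

  # "flat" hashes the single output file rather than a NAR of a directory
  outputHashMode = "flat";
  outputHashAlgo = "sha256";
  outputHash = lib.fakeSha256; # dummy value, replace with the real hash
}
```

Because the output is one file with a flat hash, it can be looked up by hash on a mirror, which is not possible for a recursively hashed directory.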
Major breaking change from 1.x, so keeping both versions for now.
(adapted from 33f11be707e39acf96423f97f3baa80d8f11a0cb)
(adapted from 9e8dea7986dbdde850a58c7704182776642d8919)
This allows us to get rid of the compatibility hacks that we had to add
(tf-1.15-bazel-1.0.patch) and also fixes #77626.
(cherry picked from commit c7adb4ee7282672c330b2f8b37ac5f6d74e1a523)