This seems to have worked in 15f105d41f (5
months ago) but broke somewhere in the meantime.
The current module seems to be underdocumented and might need a
serious refactor. It requires quite a few hacks to get it to work (see
https://github.com/NixOS/nixpkgs/issues/86305#issuecomment-621129942),
or how the ldap.nix test used systemd.services.openldap.preStart and
made quite a few assumptions about internals.
Mic92 agreed to be added as a maintainer for the module, as he uses
it a lot and can likely fix future breakages. For the most basic
startup breakages, the remaining openldap.nix test might suffice.
`doas` is a lighter alternative to `sudo` that "provide[s] 95% of the
features of `sudo` with a fraction of the codebase" [1]. I prefer it to
`sudo`, so I figured I would add a NixOS module in order for it to be
easier to use. The module is based on the existing `sudo` module.
[1] https://github.com/Duncaen/OpenDoas
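For reference, enabling the new module boils down to something like this
(a rough sketch, assuming the `sudo`-style rule options the module is
based on; not taken verbatim from the module):
```
security.doas.enable = true;
security.doas.extraRules = [
  # allow members of the wheel group, asking for a password and
  # keeping the environment, similar to a default sudo setup
  { groups = [ "wheel" ]; noPass = false; keepEnv = true; }
];
```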
This is a follow-up to PR #82026, containing the promised tests.
This test checks whether we can properly propagate prefixes received
via DHCPv6 PD with the networkd options in our module system.
The comments in the test should be sufficient to follow the idea and
what is going on.
Setting up an XMPP chat server is a pretty deep rabbit hole to jump
into when you're not familiar with this whole universe. Your experience
with this environment will greatly depend on whether or not your
server implements the right set of XEPs.
To tackle this problem, the XMPP community came up with the idea of
creating a meta-XEP in charge of listing the desirable XEPs to comply
with. This meta-XEP is issued every year under a new XEP number; the
2020 edition is XEP-0423[1].
This Prosody NixOS module refactoring makes complying with XEP-0423
easier. All the necessary extensions are enabled by default. For some
extensions (MUC and HTTP_UPLOAD), we need some input from the user and
cannot provide a sensible default nixpkgs-wide. For those, we guide
the user using a couple of assertions explaining the remaining manual
steps to perform.
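For example, the remaining site-specific input looks roughly like this
(a sketch; the domains are placeholders and TLS setup is omitted):
```
services.prosody = {
  enable = true;
  admins = [ "admin@example.org" ];
  virtualHosts."example.org" = {
    domain = "example.org";
    enabled = true;
  };
  # the two submodules introduced by this refactoring
  muc = [ { domain = "conference.example.org"; } ];
  uploadHttp.domain = "upload.example.org";
};
```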
We took advantage of this substantial refactoring to refresh the
associated nixos test.
Changelog:
- Update the prosody package to provide the necessary community
modules in order to comply with XEP-0423. This is a tradeoff, as
depending on their configuration, the user might end up not using them
and wasting some disk space. That being said, adding those will
allow XEP-0423 users, which I expect to be the majority of
users, to better leverage the binary cache.
- Add a muc submodule populated with the prosody muc defaults.
- Add an http_upload submodule in charge of setting up a basic HTTP
server handling user uploads. This submodule spins up an
HTTP(S) server in charge of receiving and serving the users'
attachments.
- Advertise both the MUCs and the http_upload endpoints using mod disco.
- Use the slixmpp library in place of the now defunct sleekxmpp for
the prosody NixOS test.
- Update the nixos test to setup and test the MUC and http upload
features.
- Add a couple of assertions triggered if the setup is not XEP-0423
compliant.
[1] https://xmpp.org/extensions/xep-0423.html
When testing WireGuard updates, I usually run the VM-tests with
different kernels to make sure we're not introducing accidental
regressions for e.g. older kernels.
I figured that we should automate this process to ensure continuously
that WireGuard works fine on several kernels.
For now I decided to test the latest LTS version (5.4) and
the latest kernel (currently 5.6). We can add more kernels in the
future; however, this seems to significantly slow down evaluation and
overall test time.
The list can be customized by running a command like this:
nix-build nixos/tests/wireguard --arg kernelVersionsToTest '["4.19"]'
The `kernelPackages` argument in the tests is null by default to make
sure that it's still possible to invoke the test-files directly. In that
case the default kernel of NixOS (currently 5.4) is used.
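A simplified sketch of how that null default is applied inside the test
(hypothetical, not the literal test expression):
```
{ kernelPackages ? null, ... }:
{
  machine = { lib, ... }: {
    # fall back to the NixOS default kernel when no version is given
    boot.kernelPackages = lib.mkIf (kernelPackages != null) kernelPackages;
  };
}
```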
The elasticsearch-curator was not deleting indices because the indices
had ILM policies associated with them. This is now fixed by
configuring the elasticsearch-curator with `allow_ilm_indices: true`.
Also see: https://github.com/elastic/curator/issues/1490
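In module terms the fix ends up in the generated action file, roughly
like this (a sketch assuming the module's `actionYAML` string option;
the filter is a placeholder):
```
services.elasticsearch-curator.actionYAML = ''
  actions:
    1:
      action: delete_indices
      options:
        allow_ilm_indices: True
        ignore_empty_list: True
      filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
'';
```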
Enables multi-site configurations.
This breaks compatibility with prior configurations that expect options
for a single dokuwiki instance in `services.dokuwiki`.
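Migrating roughly means moving the old options under a per-site
attribute (a sketch; the attribute name is a placeholder and the exact
layout is the new multi-site one introduced here):
```
services.dokuwiki."wiki.example.org" = {
  # options that previously lived directly under services.dokuwiki
  # now go into a per-site attribute set like this one
};
```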
Shimming out the Let's Encrypt domain name to reuse client configuration
doesn't work properly (Pebble uses different endpoint URL formats), is
recommended against by upstream,[1] and is unnecessary now that the ACME
module supports specifying an ACME server. This commit changes the tests
to use the domain name acme.test instead, and renames the letsencrypt
node to acme to reflect that it has nothing to do with the ACME server
that Let's Encrypt runs. The imports are renamed for clarity:
* nixos/tests/common/{letsencrypt => acme}/{common.nix => client}
* nixos/tests/common/{letsencrypt => acme}/{default.nix => server}
The test's other domain names are also adjusted to use *.test for
consistency (and to avoid misuse of non-reserved domain names such
as standalone.com).
[1] https://github.com/letsencrypt/pebble/issues/283#issuecomment-545123242
Co-authored-by: Yegor Timoshenko <yegortimoshenko@riseup.net>
This was added in aade4e577b, but the
implementation of the ACME module has been entirely rewritten since
then, and the test seems to run fine on AArch64.
linux-hardened sets kernel.unprivileged_userns_clone=0 by default; see
anthraxx/linux-hardened@104f44058f.
This allows the Nix sandbox to function while reducing the attack
surface posed by user namespaces, which allow unprivileged code to
exercise lots of root-only code paths and have led to privilege
escalation vulnerabilities in the past.
We can safely leave user namespaces on for privileged users, as root
already has root privileges, but if you're not running builds on your
machine and really want to minimize the kernel attack surface then you
can set security.allowUserNamespaces to false.
Note that Chrome's sandbox requires either unprivileged CLONE_NEWUSER or
setuid, and Firefox's silently reduces the security level if it isn't
allowed (see about:support), so desktop users may want to set:
boot.kernel.sysctl."kernel.unprivileged_userns_clone" = true;
* nixos/k3s: simplify config expression
* nixos/k3s: add config assertions and trim unneeded bits
* nixos/k3s: add a test that k3s works; minor module improvements
This is a single-node test. Eventually we should also have a multi-node
test to verify the agent bit works, but that one's more involved.
* nixos/k3s: add option description
* nixos/k3s: add defaults for token/serveraddr
Now that the assertion enforces their presence, we don't need to use the type system for it (see the sketch after this list).
* nixos/k3s: remove unneeded sudo in test
* nixos/k3s: add to test list
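The minimal single-node server setup exercised by the test looks
roughly like this (a sketch):
```
services.k3s = {
  enable = true;
  role = "server";
  # token and serverAddr are only needed when joining an existing
  # cluster as an agent; the new assertions enforce exactly that.
};
```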
For reasons yet unknown, the vxlan backend doesn't work (at least inside
the qemu networking), so this is moved to the udp backend.
Note that changing the backend apparently also changes the interface
name: it's now `flannel0`, not `flannel.1`.
fixes #74941
This was whitespace-sensitive, kept fighting with my editor and broke
the tests easily. To fix this, let python convert the output to
individual lines, and strip whitespace from them before comparing.
Also removed `pkgs.hydra-flakes` since flake-support has been merged
into master[1]. Because of that, `pkgs.hydra-unstable` is now compiled
against `pkgs.nixFlakes` and currently requires a patch since Hydra's
master doesn't compile[2] atm.
[1] https://github.com/NixOS/hydra/pull/730
[2] https://github.com/NixOS/hydra/pull/732
Upgrades Hydra to the latest master/flake branch. To perform this
upgrade, a non-trivial db migration is needed, which provides a
massive performance improvement[1].
The basic ideas behind multi-step upgrades of services between NixOS versions
have been gathered already[2]. For further context it's recommended to
read this first.
Basically, the following steps are needed:
* Upgrade to a non-breaking version of Hydra with the db-changes
(columns are still nullable here). If `system.stateVersion` is set to
something older than 20.03, the package will be selected
automatically, otherwise `pkgs.hydra-migration` needs to be used.
* Run `hydra-backfill-ids` on the server.
* Deploy either `pkgs.hydra-unstable` (for Hydra master) or
`pkgs.hydra-flakes` (for flakes-support) to activate the optimization.
The steps are also documented in the release-notes and in the module
using `warnings`.
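Concretely, the intermediate step on a host with a newer state version
is just pinning the migration package first (a sketch):
```
services.hydra = {
  enable = true;
  # step 1: deploy this release, then run `hydra-backfill-ids`
  package = pkgs.hydra-migration;
  # step 2: afterwards switch to pkgs.hydra-unstable (or pkgs.hydra-flakes)
};
```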
`pkgs.hydra` has been removed as the latest Hydra doesn't compile with
`pkgs.nixStable` and to ensure a graceful migration using the newly
introduced packages.
To verify the approach, a simple vm-test has been added which verifies
the migration steps.
[1] https://github.com/NixOS/hydra/pull/711
[2] https://github.com/NixOS/nixpkgs/pull/82353#issuecomment-598269471
While our ETag patch works pretty well when it comes to serving data
off store paths, it unfortunately broke something that might be a bit
more common, namely using regexes to extract path components in
location directives, for example.
Recently, @devhell has reported a bug with a nginx location directive
like this:
location ~^/\~([a-z0-9_]+)(/.*)?$ {
alias /home/$1/public_html$2;
}
While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:
*2 http script copy: "/home/"
*2 http script capture: "foo"
*2 http script copy: "/public_html/"
*2 http script capture: "bar.txt"
In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:
lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)
During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.
Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently ignore the return value if
realpath(3) has failed.
However, since we're using the unaltered root from the config, we have
another issue; consider this root:
/nix/store/...-abcde/$1
Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.
While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.
I added a small NixOS VM test, which uses the example above as a
regression test.
It seems that my memory is failing these days: apparently I already
*knew* about this issue, because while digging for existing issues in
nixpkgs, I found this similar pull request, which I even reviewed:
https://github.com/NixOS/nixpkgs/pull/66532
However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.
[1]: https://github.com/NixOS/nixpkgs/pull/48337
Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: https://github.com/NixOS/nixpkgs/pull/80671
Fixes: https://github.com/NixOS/nixpkgs/pull/66532
Dropbear lags behind OpenSSH significantly, both in support for modern
key formats like `ssh-ed25519` (let alone the recently-introduced
U2F/FIDO2-based `sk-ssh-ed25519@openssh.com`, as I found when I switched
my `authorizedKeys` over to it and promptly locked myself out of my
server's initrd SSH, breaking reboots) and in security features
like multiprocess isolation. Using the same SSH daemon for stage-1 and
the main system ensures key formats will always remain compatible, and
more conveniently allows sharing configuration and host keys.
The main reason to use Dropbear over OpenSSH would be initrd space
concerns, but NixOS initrds are already large (17 MiB currently on my
server), and the size difference between the two isn't huge (the test's
initrd goes from 9.7 MiB to 12 MiB with this change). If the size is
still a problem, then it would be easy to shrink sshd down to a few
hundred kilobytes by using an initrd-specific build that uses musl and
disables things like Kerberos support.
This passes the test and works on my server, but more rigorous testing
and review from people who use initrd SSH would be appreciated!
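Configuration-wise this stays close to what the Dropbear-based module
looked like, e.g. (a sketch; key paths and the authorized key are
placeholders):
```
boot.initrd.network.ssh = {
  enable = true;
  port = 2222;
  hostKeys = [ "/etc/secrets/initrd/ssh_host_ed25519_key" ];
  authorizedKeys = [ "ssh-ed25519 AAAA... user@host" ];
};
```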
This mirrors the behaviour of systemd - It's udev that parses `.link`
files, not `systemd-networkd`.
This was originally applied in 36ef112a47,
but was reverted due to 1115959a8d causing
evaluation errors on hydra.
...even when networkd is disabled
This reverts commit ce78f3ac70, reversing
changes made to dc34da0755.
I'm sorry; Hydra has been unable to evaluate, always returning
> error: unexpected EOF reading a line
and I've been unable to reproduce the problem locally. Bisecting
pointed to this merge, but I still can't see what exactly was wrong.
- Fix misspelled option. mkRenamedOptionModule is not used because the
option hasn't really worked before.
- Add missing cfg.telemetryPath arg to ExecStart.
- Fix mkdir invocation in test.
Add a cage module to NixOS. This can be used to make kiosk-style
systems that boot directly to a single application. The user (demo by
default) is automatically logged in by this service and the
program (xterm by default) is automatically started.
This is useful for some embedded, single-user systems where we want
automatic booting. To keep the system secure, the user should have
limited privileges.
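A minimal kiosk configuration with the new module looks roughly like
this (a sketch showing the defaults mentioned above):
```
services.cage = {
  enable = true;
  user = "demo";                        # automatically logged in
  program = "${pkgs.xterm}/bin/xterm";  # the application to run
};
```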
Based on the service provided in the Cage wiki here:
https://github.com/Hjdskes/cage/wiki/Starting-Cage-on-boot-with-systemd
Co-Authored-By: Florian Klink <flokli@flokli.de>
* prometheus-nginx-exporter: 0.5.0 -> 0.6.0
* nixos/prometheus-nginx-exporter: update for 0.6.0
Added new option constLabels and updated virtualHost name in the
exporter's test.
This is useful when buildLayeredImage is called in a generic way
that should allow simple (base) images to be built, which may not
reference any store paths.
I am not sure how this ever passed on hydra but 30s is barely enough to
pass the configure phase of opensmtpd. It is likely the package was
built as part of another jobset. Whenever it is built as part of the
test execution the timeout propagates and 30s is clearly not enough for
that.
The subtest was mainly written to demonstrate the VRF-issues with a
5.x-kernel. However this breaks the entire test now as we have 5.4 as
default kernel. Disabling the test for now, I still need to find some
time to investigate.
Depending on the network management backend being used, if the interface
configuration in stage 1 is not cleared, there might still be some old
addresses or routes from stage 1 present in stage 2 after network
configuration has finished.
This makes predictable interfaces names available as soon as possible
with udev by adding the default network link units to initrd which are read
by udev. It also adds some udev rules that are needed but would
normally be loaded from the udev store path, which is not included in
the initrd.
The `invalid` test plug-in was introduced in 297d1598ef
and is disabled in the shipped daemon.conf.
I forgot to reflect that in the module, which caused the daemon to print the following on start-up:
FuEngine invalid has incorrect built version invalid
and the command to warn:
WARNING: The daemon has loaded 3rd party code and is no longer supported by the upstream developers!
To reduce the chance of this happening in the future, I moved the list of default disabled plug-ins to the package expression.
I also set the value of the NixOS module option in the config section of the module instead of in the option's default value as previously,
which will allow users to not have to care about these plug-ins.
In 87a19e9048 I merged staging-next into master using the GitHub gui as intended.
In ac241fb7a5 I merged master into staging-next for the next staging cycle, however, I accidentally pushed it to master.
Thinking this may cause trouble, I reverted it in 0be87c7979. This was however wrong, as it "removed" master.
This reverts commit 0be87c7979.
I merged master into staging-next but accidentally pushed it to master.
This should get us back to 87a19e9048.
This reverts commit ac241fb7a5, reversing
changes made to 76a439239e.
The old Quake3 NixOS test, which served as a nice demo to showcase what
NixOS tests are capable of, was removed in
50ea99cbc1. This commit adds the same
functionality to run real openarena clients.
This module allows root autoLogin, so we would break that for users, but
they shouldn't be using it anyway. This gives the impression that auto
is some special display manager, when it's just lightdm and special PAM
rules to allow root autoLogin. It was created for NixOS's testing
so I believe this is where it belongs.
- the `imageFile` option allows loading an image from a derivation
- the `dependsOn` option can be used to specify dependencies between container systemd units.
Co-authored-by: Christian Höppner <mkaito@users.noreply.github.com>
* nixos/buildkite: drop user option
This reverts 8c6b1c3eaa.
Turns out, buildkite-agent has logic to write .ssh/known_hosts files and
only really works when $HOME and the user homedir are in sync.
On top of that, we provision ssh keys in /var/lib/buildkite-agent, which
doesn't work if that other user's homedir points elsewhere (we can cheat
by setting $HOME, but then getent and $HOME provide conflicting
results).
So after all, it's better to run the system-wide buildkite agent only as
the "buildkite-agent" user - if one wants to run buildkite as
different users, systemd user services might be a better fit.
* nixosTests.buildkite-agent: add node with separate user and no ssh key
Current nixpkgs uses elasticsearch-curator 5.8.1. As of version 5.7.0,
elasticsearch-curator supports elasticsearch 7, thus this change enables tests
with ES 7.
Updates required:
- Use vpc image format (new default, supported by Amazon)
- Pass full image filename to makeEc2Test
- Increase memory allocation for nixos-rebuild
- Set a networking.hostName for services.httpd
- Add appropriate escaping in literal userdata
While I'm here, try to make it fail fast.
This fixes the patch for nginx to clear the Last-Modified header if a
static file is served from the Nix store.
So far we only used the ETag from the store path, but if the
Last-Modified header is always set to "Thu, 01 Jan 1970 00:00:01 GMT",
Firefox and Chrome/Chromium seem to ignore the ETag and simply use the
cached content instead of revalidating.
Alongside the fix, this also adds a dedicated NixOS VM test, which uses
WebDriver and Firefox to check whether the content is actually served
from the browser's cache and to have a more real-world test case.
This is what I've suspected a while ago[1]:
> Heads-up everyone: After testing this in a few production instances,
> it seems that some browsers still get cache hits for new store paths
> (and changed contents) for some reason. I highly suspect that it might
> be due to the last-modified header (as mentioned in [2]).
>
> Going to test this with last-modified disabled for a little while and
> if this is the case I think we should improve that patch by disabling
> last-modified if serving from a store path.
Much earlier[2] when I reviewed the patch, I wrote this:
> Other than that, it looks good to me.
>
> However, I'm not sure what we should do with Last-Modified header.
> From RFC 2616, section 13.3.4:
>
> - If both an entity tag and a Last-Modified value have been
> provided by the origin server, SHOULD use both validators in
> cache-conditional requests. This allows both HTTP/1.0 and
> HTTP/1.1 caches to respond appropriately.
>
> I'm a bit nervous about the SHOULD here, as user agents in the wild
> could possibly just use Last-Modified and use the cached content
> instead.
Unfortunately, I didn't pursue this any further back then because
@pbogdan noted[3] the following:
> Hmm, could they (assuming they are conforming):
>
> * If an entity tag has been provided by the origin server, MUST
> use that entity tag in any cache-conditional request (using If-
> Match or If-None-Match).
Since running with this patch in some deployments, I found that both
Firefox and Chrome/Chromium do NOT re-validate against the ETag if the
Last-Modified header is still the same.
So I wrote a small NixOS VM test with Geckodriver to have a test case
which is closer to the real world and I indeed was able to reproduce
this.
Whether this is actually a bug in Chrome or Firefox is an entirely
different issue and even IF it is the fault of the browsers and it is
fixed at some point, we'd still need to handle this for older browser
versions.
Apart from clearing the header, I also recreated the patch by using a
plain "git diff" with a small description on top. This should make it
easier for future authors to work on that patch.
[1]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-495072764
[2]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-451644084
[3]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-451646135
Signed-off-by: aszlig <aszlig@nix.build>
When using format-strings, curly brackets need to be escaped using `{{`
to avoid errors from python.
And apparently, Perl's `==` is used to compare substrings[1] which is why
the translation to `assert http_code == "304"` failed as the string
contains several headers from curl.
[1] Just check `perl <(echo 'die "alarm" if "foo\n304" == 304')`
We've rewritten it to use GDM, and we can now autologin
to the X11 session because of the accountsservice preStart
script for autologin. It should work similar to how the wayland
test works, just with a few nuanced differences for xorg.
The upstream session files display managers use have no concept of sessions being composed from
desktop manager and window manager. To be able to set upstream session files as default
session, we need a single option. Having two different ways to set default session would be confusing,
though, so we decided to deprecate the old method.
We also created a separate script for each session, just like we already had a separate desktop
file for each one, and started using displayManager.sessionPackages mechanism to make the
session handling more uniform.
There are two ways of providing graphical sessions now:
- `displayManager.session` via `desktopManager.session` and
`windowManager.session`
- `displayManager.sessionPackages`
`sessionPackages` doesn't make a distinction between desktop and window
managers. This makes selecting a session provided by a package using
`desktopManager.default` nonsensical.
We therefore introduce `displayManager.defaultSession` which can select a session
from either `displayManager.session` or `displayManager.sessionPackages`.
It will default to `desktopManager.default + windowManager.default` as before.
If the dm default is "none" it will select the first provided session from
`sessionPackages`.
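For example (a sketch):
```
# a session provided via displayManager.sessionPackages:
services.xserver.displayManager.defaultSession = "gnome";
# or a session composed from desktopManager + windowManager:
# services.xserver.displayManager.defaultSession = "none+i3";
```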
This reduces the length of the gitea-test by creating a single
`makeGiteaTest` function which creates the configuration for a testcase
with a given database driver.
This PR is part of the networking.* namespace cleanup. We feel that
networking.hostConf is rarely used and provides little value compared to
using environment.etc."host.conf" directly.
Provide a sensible default: "multi on".
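Anyone needing something else can simply write the file directly, e.g.
(a sketch, equivalent to the default):
```
environment.etc."host.conf".text = ''
  multi on
'';
```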
From looking at
* https://hydra.nixos.org/build/107447356
it appears the subtest fails at this exact step.
OCR in the testing driver has been notoriously
flaky, so let's just match alice's user description.
This does have the downside of not verifying the
appearance of other user cards, which was an
issue with the greeter in the past.
This PR is part of the networking.* namespace cleanup.
ssmtp used to be configured via `networking.defaultMailServer` which is
sort of misleading since it provides options only for ssmtp. Other
dumb mail relays like nullmailer have always been living under
services.
The intent of this PR is to align ssmtp's options with those of similar
services. Specifically, two renames have been done:
* Rename `networking.defaultMailServer` to `services.ssmtp`.
* Rename `directDelivery` to `enable` because this is what it basically
  does (see the sketch below).
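The migration is therefore just (a sketch):
```
# before: networking.defaultMailServer.directDelivery = true;
services.ssmtp.enable = true;
```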
Previously systemd-networkd.service ran as systemd-network:nogroup.
The wireguard private key file is now owned by root:systemd-network with
mode 0640. It is therefore required that the systemd-network user is in the group
with the same name, so that it is able to read the key file.
osquery was marked as broken since April.
If somebody steps up to fix it, we can always revive it from the
history, but there's not much value in shipping completely broken things
in current master.
cc @ma27
Updating `systemd-networkd-wireguard` to use the python test runner.
This change was purely syntactic. This migration did not require any
semantic change.
The SLIM project is abandoned and their last release was in 2013.
Because of this it poses a security risk to systems; no one is working
on it or has picked up maintenance. It also lacks compatibility with
systemd and logind sessions. For users, there likely isn't anything like
SLiM that's as lightweight in terms of dependencies.
Slurmdbd requires a database password, which is stored in slurmdbd.conf.
A separate config file prevents the password from ending up in the Nix
store. Slurmdbd 19.5 does not support MySQL socket connections.
Adapted the slurm test to provide username and password.
https://roundcube.net/news/2019/11/09/roundcube-1.4.0-released
* The `curl` cmd in the test can fail as roundcube returns an HTTP 401 if
unauthorized (and we're explicitly requesting the login form). By
checking if the `persistent_login` plugin is loaded, the assertion is
still valid.
* Use `$argv[0]` to determine install path in the installer script. I'm
not exactly sure why, but it seems as `__DIR__` now resolves symlinks
which breaks the installer if roundcube is in a `buildEnv` with
third-party plugins.
This prevents services from being started before they're initialized,
and renders the `systemd.targets.ceph.wantedBy = lib.mkForce [];` hack
in the VM tests obsolete - the config now starts up ceph after a
reboot, too.
Let's take advantage of that, crash all VMs, and boot them up again.
Don't pass user and group to ceph and rely on it to drop privileges;
instead, let systemd handle running it as the appropriate user.
This also inlines the extraServiceConfig into the makeService function,
as we have conditionals depending on daemonType there anyways.
Use StateDirectory to create directories in
/var/lib/ceph/${daemonType}/${clusterName}-${daemonId}.
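In systemd terms that is roughly (a sketch; the unit name is
illustrative, and StateDirectory paths are relative to /var/lib):
```
systemd.services."ceph-${daemonType}-${daemonId}".serviceConfig = {
  # yields /var/lib/ceph/${daemonType}/${clusterName}-${daemonId}
  StateDirectory = "ceph/${daemonType}/${clusterName}-${daemonId}";
};
```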
There previously was a condition on daemonType being one of mds,mon,rgw
or mgr. We only instantiate makeServices with these types, and "osd" was
special.
In the osd case, test examples suggest it'd be in something like
/var/lib/ceph/osd/ceph-${cfg.osd0.name} - so it's not special at all,
but exactly like the pattern for the others.
During initialization, we also need these folders, before the unit is
started up. Move the mkdir -p commands in the vm tests to the line
immediately before they're required.
The two new options make it possible to create the interface in one namespace
and move it to a different one, as explained at https://www.wireguard.com/netns/.
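A sketch of the intended use; note that the option names below are my
assumption of per-interface options and are not spelled out in this
message:
```
networking.wireguard.interfaces.wg0 = {
  # create the UDP socket in the init namespace...
  socketNamespace = "init";
  # ...and move the wireguard interface itself into another one
  interfaceNamespace = "container";
};
```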
This is a good example of a package/module that should be distributed
externally (e.g. as a flake [1]): it's not stable yet so anybody who
seriously wants to use it will want to use the upstream repo. Also,
it's highly specialized so NixOS is not really the right place at the
moment (every NixOS module slows down NixOS evaluation for everybody).
[1] https://github.com/edolstra/jormungandr/tree/flake
The recent custom endpoint addition allows us to directly point
certbot to the custom Pebble directory endpoint.
Thanks to that, we can ditch the Pebble patch we were using so far,
making this test easier to maintain.
* lm_sensors: add fancontrol module + nixos test
fancontrol is a small script that checks temperature sensors and adapts
fan speeds accordingly. It reads a text config file that can be
auto-generated by running the pwmconfig wizard on the live system.
Rename the old ceph test to ceph-single-node and add a new test
ceph-multi-node. The ceph-single-node represents a dev cluster whereas
ceph-multi-node is closer to a prod cluster.
Let's Encrypt bumped ACME to v2. We need to update our nixos test to
be compatible with this new protocol version.
We decided to drop the Boulder ACME server in favor of the more
integration test friendly Pebble.
- overriding cacert not necessary
- this avoids rebuilding lots of packages needlessly
- nixos/tests/acme: use pebble's ca for client tests
- pebble always generates its own ca which has to be fetched
TODO: write proper commit msg :)
Default getfacl behavior is to remove leading slash on absolute
paths in its header printed to stdout.
Before the header it will also print a message about it...
Switches -p or --absolute-names can turn this off
and remove some noise from our test logs.
The networking.virtual test does not work with networkd yet, for
multiple reasons:
- network-online.target is not reached, because tun0 and tap0 are
considered as required for online but _not_ brought up or assigned
the configured addresses
- the commands later in the test rely on some units from the scripted
network setup
cc @fpletz networkd expert
cc @globin we looked at this together
The test has recently been failing due to the IPv6 address
on the server still being in the tentative state, when the
client sends its first request. The server will not start
using the IPv6 address until DAD has completed.
Scripted networking seems not to wait for DAD completion
before completing network-online.target, so let's switch
to networkd instead, which does.
During the last update, `hydra-notify` was rewritten as a daemon which
listens to postgresql notifications for each build[1]. The module
uses the `hydra-notify.service` unit from upstream's Hydra module and
the VM test ensures that email notifications are sent properly.
Also updated `hydra-init.service` to install `pg_trgm` on a local
database if needed[2].
[1] c7861b85c4
[2] 8a0a5ec3a3
We ship `https://cache.nixos.org` as a binary cache by default, which
automatically substitutes the test derivation used inside the Hydra
test. However, it needs to be built locally to confirm that
`hydra-queue-runner` works properly.
Also inherited the platform name for the test derivation from `system`
to ensure that the build can be tested on each supported platform.
ZHF #68361
It turned out that /dev/snd/* always exists even if there are no sound
drivers loaded at all. Loading `snd` and `snd_timer` fixes that
situation. It is probably fair to assume that someone who wants to use
sound also enables it in their NixOS configuration.
Add support for storing secrets in files outside the nix store, since
files in the nix store are world-readable and secrets therefore can't
be stored safely there.
The old string options are kept, since they can potentially be handy
for testing purposes, but their descriptions now state that they
shouldn't be used in production. The manual section is updated to use
the file options rather than the string options and the tests now test
both.
Always enable the UART because the VirtualBox bug that required running without the UART was fixed in 6.0.10. Stop using an old kernel version because the tests work with the default kernel.
(cherry picked from commit ae93571e8d04cebd69491a789d902d6481e05d3f)
In c814d72b51, a bunch of packages were
changed to use the pname attribute, among them were the quake3-demodata
and quake3-pointrelease which we use for the quake3 test.
Fortunately, having pname available means that we no longer need to
match using a prefix, so fixing this eval error also simplifies our
matching.
I directly pushed this to master because the change is non-controversial
and we can't break things that are already broken :-)
Signed-off-by: aszlig <aszlig@nix.build>
* remove kinetic
* release note
* add johanot as maintainer
nixos/ceph: create option for mgr_module_path
- since the upstream default is no longer correct in v14
* fix module, default location for libexec has changed
* ceph: fix test
* maintain only one version
* ceph-client: init
* include ceph-volume python tool in output
nixos/ceph: extraConfig, fix test, wait for ceph-mgr to become active
* run ceph with disk group permission
* add extraConfig option for the global section
needed per cluster
* clear up how ceph.conf is generated
* fix ceph testcase
Since https://github.com/NixOS/nixpkgs/pull/61321, local-fs.target is
part of sysinit.target again, meaning units without
DefaultDependencies=no will automatically depend on it, and the
manually set dependencies can be dropped.
It turns out that checking for the last mount time of an ext4 file
system isn't a very reliable way to check whether the file system was
properly unmounted.
When creating that test in the first place (88530e02b6),
I was reluctant to inspect the file system when the VM is down and was
searching for a way to check for a clean unmount *after* the file system
was mounted again to make sure we don't need to create a 512 MB raw
image on the host.
Fortunately however, when converting from qcow2, qemu-img actually
writes a sparse file, so for most file systems (that is, file systems
supporting sparse files) this shouldn't waste a lot of disk space.
So when investigating the flakiness, I found that whenever the test is
failing, the unmount of /test-x-initrd-mount was done *before* the final
step during which systemd remounts+unmounts all the remaining file
systems.
I haven't investigated why this is the case, but the test is a
regression test for https://github.com/NixOS/nixpkgs/issues/35268, which
actually didn't unmount the file system *at* *all*, so really all we
need to take care here is whether the unmount has happened and not
*how*.
To make sure that checking the filesystem state is enough for this, I
temporarily replaced the $machine->shutdown call with $machine->crash
and verified that the file system state is "not clean".
Signed-off-by: aszlig <aszlig@nix.build>
Fixes: https://github.com/NixOS/nixpkgs/issues/67555
* nixos/acme: Fix ordering of cert requests
When subsequent certificates would be added, they would
not wake up nginx correctly due to target units only being triggered
once. We now added more fine-grained systemd dependencies to make sure
nginx always is aware of new certificates and doesn't restart too early
resulting in a crash.
Furthermore, the ACME module has been refactored, mostly to get rid of
the deprecated PermissionsStartOnly systemd option. Below is a summary
of the changes made.
* Use SERVICE_RESULT to determine status
This was added in systemd v232, so we don't have to keep track
of the EXITCODE ourselves anymore.
* Add regression test for requesting multiple domains
* Deprecate 'directory' option
We now use systemd's StateDirectory option to manage
creation and permissions of the ACME state directory.
* The webroot is created using a systemd.tmpfiles.rules rule
instead of the preStart script.
* Depend on certs directly
By getting rid of the target units, we make sure ordering
is correct in the case that you add new certs after already
having deployed some.
Reason it broke before: acme-certificates.target would
be in active state, and if you then add a new cert, it
would still be active and hence nginx would restart
without even requesting a new cert. Not good! We
make the dependencies more fine-grained now; this should fix that.
* Remove activationDelay option
It complicated the code a lot, and is rather arbitrary. What if
your activation script takes more than activationDelay seconds?
Instead, one should use systemd dependencies to make sure some
action happens before setting the certificate live.
e.g. If you want to wait until your cert is published in DNS DANE /
TLSA, you could create a unit that blocks until it appears in DNS:
```
RequiredBy=acme-${cert}.service
After=acme-${cert}.service
ExecStart=publish-wait-for-dns-script
```