Re: meson: Make test output much more useful on failure (both in CI and locally)

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Andres Freund <[email protected]>
To: Jelte Fennema-Nio <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Subject: Re: meson: Make test output much more useful on failure (both in CI and locally)
Date: Mon, 26 Jan 2026 11:42:16 -0500
Message-ID: <luhx26gz3tiqkndldbobukt4zeaeu2ppd5qgsibf6hknhq6d7e@kngeuarhk5dg> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>

Hi,

On 2026-01-26 11:13:57 +0100, Jelte Fennema-Nio wrote:
> I've gotten totally fed up with the amount of clicking I need to do
> every time a test fails in CI to find out what exactly failed. The
> output that meson gives in its main output is completely useless
> currently.

This doesn't really seem meson specific, right? It just seems about the output
of our own test tooling?

Meson can't really output something that we hide from it...

I agree that what we currently do is pretty unhelpful.


> Subject: [PATCH v1 1/5] meson: Include pg_regress diffs in meson output

This isn't about meson, it's about adding a new argument to pg_regress.


I think we'd need to limit the size of the output here somewhat. If e.g. the
server crashes, you end up with all subsequent tests failing and printing out
a couple hundred kB of pointless diff output.


> @@ -1521,22 +1523,40 @@ results_differ(const char *testname, const char *resultsfile, const char *defaul
>  	 * append to the diffs summary file.
>  	 */
>  
> -	/* Write diff header */
> -	difffile = fopen(difffilename, "a");
> +	difffile = fopen(difffilename, "a+");
>  	if (difffile)
>  	{
> +		fseek(difffile, 0, SEEK_END);
> +		startpos = ftell(difffile);
>  		fprintf(difffile,
>  				"diff %s %s %s\n",
>  				pretty_diff_opts, best_expect_file, resultsfile);
> +		fflush(difffile);
> +
> +		/* Run diff */
> +		snprintf(cmd, sizeof(cmd),
> +				 "diff %s \"%s\" \"%s\" >> \"%s\"",
> +				 pretty_diff_opts, best_expect_file, resultsfile, difffilename);
> +		run_diff(cmd, difffilename);
> +
> +		/* Emit diff as TAP diagnostics if requested */
> +		if (print_diffs)
> +		{
> +			char		line[1024];
> +
> +			fseek(difffile, startpos, SEEK_SET);
> +			while (fgets(line, sizeof(line), difffile))
> +			{
> +				size_t		len = strlen(line);
> +
> +				if (len > 0 && line[len - 1] == '\n')
> +					line[len - 1] = '\0';
> +				diag("%s", line);
> +			}
> +		}
>  		fclose(difffile);
>  	}

I'm a bit confused. Why are we rerunning diffs and appending to the diff file?


> From c1db07a8bba54e6b903766585f1b521ef526de12 Mon Sep 17 00:00:00 2001
> From: Jelte Fennema-Nio <[email protected]>
> Date: Mon, 26 Jan 2026 09:09:11 +0100
> Subject: [PATCH v1 2/5] perl tap: Show failed command output
> 
> This adds the output of failed commands to the TAP output. Before a
> failed libpq_pipeline test would look like this:
> 
>   Failed test 'libpq_pipeline cancel'
>   at /home/jelte/work/postgres-3/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl line 55.
> 
> Now you can actually see the reason of the failure:
> 
>   Failed test 'libpq_pipeline cancel'
>   at /home/jelte/work/postgres-3/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl line 55.
> ----- command failed -----
> libpq_pipeline -r 700 cancel port=14309 host=/tmp/htMib451qD dbname='postgres' max_protocol_version=latest
> --------- stderr ---------
> test cancellations...
> libpq_pipeline:315: unexpected number of rows received: 1
> --------------------------
> ---
>  src/test/perl/PostgreSQL/Test/Utils.pm | 36 +++++++++++++++++++++++---
>  1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
> index ff843eecc6e..bd1e981c6f0 100644
> --- a/src/test/perl/PostgreSQL/Test/Utils.pm
> +++ b/src/test/perl/PostgreSQL/Test/Utils.pm
> @@ -955,8 +955,22 @@ sub command_ok
>  {
>  	local $Test::Builder::Level = $Test::Builder::Level + 1;
>  	my ($cmd, $test_name) = @_;
> -	my $result = run_log($cmd);
> -	ok($result, $test_name);
> +	# Doesn't rely on detecting end of file on the file descriptors,
> +	# which can fail, causing the process to hang, notably on Msys
> +	# when used with 'pg_ctl start'
> +	my $stdoutfile = File::Temp->new();
> +	my $stderrfile = File::Temp->new();

I'm afraid that the creation of this many additional tempfiles would slow down
the tests on some platforms...

> +	my $result = IPC::Run::run $cmd, '>' => $stdoutfile, '2>' => $stderrfile;
> +	ok($result, $test_name) or do
> +	{
> +		my $stdout = slurp_file($stdoutfile);
> +		my $stderr = slurp_file($stderrfile);
> +		diag("----- command failed -----");
> +		diag(join(" ", @$cmd));
> +		diag("--------- stdout ---------"), diag($stdout) if $stdout;
> +		diag("--------- stderr ---------"), diag($stderr) if $stderr;
> +		diag("--------------------------");
> +	};

Outputting the entire file seems to again have the danger of completely
swamping the output.

>  	return;
>  }
>  
> @@ -972,8 +986,22 @@ sub command_fails
>  {
>  	local $Test::Builder::Level = $Test::Builder::Level + 1;
>  	my ($cmd, $test_name) = @_;
> -	my $result = run_log($cmd);
> -	ok(!$result, $test_name);
> +	# Doesn't rely on detecting end of file on the file descriptors,
> +	# which can fail, causing the process to hang, notably on Msys
> +	# when used with 'pg_ctl start'

Huh? Any more details?

> +	my $stdoutfile = File::Temp->new();
> +	my $stderrfile = File::Temp->new();
> +	my $result = IPC::Run::run $cmd, '>' => $stdoutfile, '2>' => $stderrfile;
> +	ok(!$result, $test_name) or do
> +	{
> +		my $stdout = slurp_file($stdoutfile);
> +		my $stderr = slurp_file($stderrfile);
> +		diag("-- command succeeded unexpectedly --");
> +		diag(join(" ", @$cmd));
> +		diag("-------------- stdout --------------"), diag($stdout) if $stdout;
> +		diag("-------------- stderr --------------"), diag($stderr) if $stderr;
> +		diag("------------------------------------");
> +	};

I don't think it makes sense to duplicate the logic for this multiple times.


> From c88f1fbf6462da56a1f0739dd83a6284d6bdd389 Mon Sep 17 00:00:00 2001
> From: Jelte Fennema-Nio <[email protected]>
> Date: Mon, 26 Jan 2026 09:32:15 +0100
> Subject: [PATCH v1 3/5] perl tap: Show die reason in TAP output
>...
> diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
> index bd1e981c6f0..df328d0f571 100644
> --- a/src/test/perl/PostgreSQL/Test/Utils.pm
> +++ b/src/test/perl/PostgreSQL/Test/Utils.pm
> @@ -206,6 +206,14 @@ INIT
>  	# test may still fail, but it's more likely to report useful facts.
>  	$SIG{PIPE} = 'IGNORE';
>  
> +	# Emit die messages as TAP diagnostics so they appear in test output.
> +	$SIG{__DIE__} = sub {
> +		return if $^S;    # Ignore dies inside eval
> +		my $msg = shift;
> +		chomp $msg;
> +		diag("die: $msg");
> +	};
> +

This looked familiar - turns out I had complained about this in the past and
then forgotten about it :(

https://www.postgresql.org/message-id/20220222181924.eehi7o4pmneeb4hm%40alap3.anarazel.de

At least at the time I found it necessary to separately test for $^S, to avoid
triggering the logic for syntax errors.  Not sure why I concluded that though.

In a reply Andrew suggested to use done_testing(), which avoids the pointless
  "Tests were run but no plan was declared and done_testing() was not seen."
which does seem like an improvement.


> From a6b324678293d65f3a417b8f66889b3f3f6a9c29 Mon Sep 17 00:00:00 2001
> From: Jelte Fennema-Nio <[email protected]>
> Date: Mon, 26 Jan 2026 10:04:44 +0100
> Subject: [PATCH v1 4/5] perl tap: Include caller in die messages of psql
> 
> This way by looking at the error output you can actually figure out
> where it occured, instead of just knowing that is in the very often
> called psql function.

I think this is too localized a fix. Right now every die in a .pm file has
this issue.  The easiest fix would probably be to just replace all uses of die
in .pm files with croak (which we already, inconsistently, use).

stderr:
# die: error running SQL: 'psql:<stdin>:1: ERROR:  syntax error at or near "I"
# LINE 1: I am the worst
#         ^'
# while running 'psql --no-psqlrc --no-align --tuples-only --quiet --dbname port=20878 host=/tmp/K5V48mzFx9 dbname='postgres' --file - --variable ON_ERROR_STOP=1' with sql 'I am the worst' at /home/andres/src/postgresql/src/bin/scripts/t/020_createdb.pl line 19.

Note that the failure now is reported as coming from the actual test file.


We could also use Test::More->builder->croak(), but that's a heck of a lot
longer, and would require us to be more liberal about sprinkling

	local $Test::Builder::Level = $Test::Builder::Level + 1;
around. I don't see an improvement in the output either.

In contrast Carp.pm just looks for a caller outside of the current package. If
we are worried about errors across Cluster.pm/Utils.pm, we could just fix that
in one place by using @CARP_NOT.

Maybe there's a way to do that more centrally?

Greetings,

Andres Freund

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: meson: Make test output much more useful on failure (both in CI and locally)
  In-Reply-To: <luhx26gz3tiqkndldbobukt4zeaeu2ppd5qgsibf6hknhq6d7e@kngeuarhk5dg>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox