I can think of few better investments than a reliable suite of end-to-end tests running as part of your merge pipeline, even if they're difficult to set up. Sleep quality improves once you have this in place. Your tests don't have to be brittle or flaky; if they are, it's definitely a better investment to fix whatever's causing the brittleness than to accept it as a fact of life.
Having the tests run against an isolated data and infrastructure environment without additional noise from shared activity is a good first step.
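It doesn't have to be elaborate, either. A minimal sketch in Python/pytest (the orders schema is a made-up stand-in): every test gets its own throwaway database, so nothing shared can make a run flaky:

    import sqlite3

    import pytest


    @pytest.fixture
    def isolated_db(tmp_path):
        """Fresh database file per test, so no shared state between runs."""
        conn = sqlite3.connect(str(tmp_path / "test.db"))
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, body TEXT)")
        yield conn
        conn.close()


    def test_starts_empty(isolated_db):
        assert isolated_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 0

The same idea scales up to a dedicated database instance or namespace per pipeline run.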
Distributed systems inherently bring challenges that make it very hard to create reliable test suites. You can reduce the flakiness, but I don't think you can ever completely eliminate it.
Beyond that, you haven't addressed the fact that a comprehensive end-to-end test suite in a complex system is really, really slow.
Just to clarify, I'm talking about integration testing the service itself - posting a payload, saving to the database, producing messages to a mock queue, etc. Test the entire service in isolation from other services and validate that it behaves correctly - not end-to-end across services. You should be mocking out all service dependencies and testing against the contracts for those systems.
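To make that concrete, here's roughly the shape of such a test - a Python sketch in which create_order, the orders table, and FakeQueue are all hypothetical stand-ins for your real service code:

    import json
    import sqlite3


    class FakeQueue:
        """In-memory stand-in for the real message broker client."""

        def __init__(self):
            self.messages = []

        def publish(self, topic, payload):
            self.messages.append((topic, payload))


    def create_order(db, queue, payload):
        """Hypothetical handler under test: persist the order, then publish."""
        db.execute("INSERT INTO orders (body) VALUES (?)", (json.dumps(payload),))
        db.commit()
        queue.publish("orders.created", payload)
        return {"status": "created"}


    def test_create_order_persists_and_publishes():
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, body TEXT)")
        queue = FakeQueue()

        result = create_order(db, queue, {"sku": "abc-123", "qty": 2})

        assert result == {"status": "created"}
        assert db.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 1
        assert queue.messages == [("orders.created", {"sku": "abc-123", "qty": 2})]

Everything inside the service boundary is real; everything outside it is a fake that honors the contract.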
Our API tests run flawlessly every time because they write against an isolated database with well-defined endpoint and messaging contracts. They also execute all remote operations against a mock API that conforms to those contracts. This is perfectly achievable.
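For the mock API piece, the standard library is enough. A hedged sketch - the inventory dependency and its /stock/{sku} endpoint are invented examples of a published contract:

    import json
    import threading
    from http.server import BaseHTTPRequestHandler, HTTPServer


    class MockInventoryAPI(BaseHTTPRequestHandler):
        """Serves canned responses that conform to the dependency's contract."""

        def do_GET(self):
            if self.path.startswith("/stock/"):
                body = json.dumps({"sku": self.path.rsplit("/", 1)[-1], "available": 5})
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.end_headers()
                self.wfile.write(body.encode("utf-8"))
            else:
                self.send_response(404)
                self.end_headers()

        def log_message(self, *args):
            pass  # keep test output quiet


    def start_mock():
        """Bind an ephemeral port, serve in the background, return the server."""
        server = HTTPServer(("127.0.0.1", 0), MockInventoryAPI)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        return server  # server.server_address[1] is the port to inject into config

The service's HTTP client gets pointed at the mock's address through configuration, and the canned responses stay in lockstep with the dependency's published contract.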
I agree that this is achievable, but it seems to contradict your original assertion. Mocking out other services doesn't give you the same assurances as end-to-end testing the whole stack.
To me it's a classic tradeoff: the more you integrate in your tests, the more meaningful they are - but also the harder they are to write and maintain.