The US Library of Congress has a "web archive" (in beta testing as of January 2025, but available for use) that tries to capture the content of a large number of websites, not just its own.
This is similar in concept to the Internet Archive's Wayback Machine, which has been doing that for much longer.
The LoC describes its work on the archive in a 2023 blog post, with a further update in January 2025.