
Debian 64-bit-time transition

source link: https://wiki.debian.org/ReleaseGoals/64bit-time

64-bit time

Current Status

The t64 transition is ongoing in Debian (as of the end of March 2024).

Co-ordination is occurring on #debian-devel IRC

  • A fairly complete analysis of ABI changes was done from May to Oct 2023. About 495 library packages change ABI, and between 5063 and 5975 packages which depend on those will need a no-change rebuild. In addition, 600-700 perl packages which build XS modules (and depend on perl-abi-5.x.x or libperl-5.xx) are affected.

  • Packages built with gnulib released between 2022-07-02 and 2022-12-25 will automatically get 64-bit time_t if glibc >=2.34 is present, unless the macro gl_cv_type_time_t_bits_macro is set to stop it. The offending change was reverted to stop this surprise happening early.

  • Uploads to experimental were done from Feb 2nd 2024 to enable analysis of potential usrmove related issues. (e.g. 1063329 was found)

  • NMU bugs for the transition were filed from Jan 30th.

The actual transition started on 27th Feb 2024

  • gcc (gcc-13 and gcc-14) enabling -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 by default was uploaded on 26th Feb (version 13.2.0-16.1)
  • Dpkg enabling -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 in buildflags by default was uploaded in 1.22.5 on 27th Feb
  • NMUs for all the affected libraries (list) started once all the ports had built gcc and dpkg (around March 3rd)

  • Bootstrapping of various self-depending packages and loops is ongoing (March 29th)
    • (notable blockers fixed: libglib, cargo/rust, curl, protobuf, openjdk-17, git/subversion)

Notes about bootstrapping architectures, work arounds, current status: BrainDumpT64

Affected packages

Goal description

Use 64-bit time_t on 32-bit architectures to avoid the 'year 2038 problem', when the existing 32-bit signed time_t rolls over (potentially setting the time back to December 1901). Good technical details are given on this glibc page, and a very general overview at https://theyear2038problem.com/

This FOSDEM talk (PDF) also gives a good overview of the status as of Feb 2023
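The rollover itself is easy to reproduce with plain integer arithmetic. The following is a minimal sketch, not part of the original page: the helper names wrap32 and epoch_to_utc are made up for illustration, and the `date -d @N` form assumes GNU date.

```shell
#!/bin/sh
# Wrap an integer into the signed 32-bit range, as a 32-bit time_t would.
# (wrap32 is a hypothetical helper name used only in this sketch.)
wrap32() {
    # Shift into [0, 2^32), reduce modulo 2^32, then shift back.
    echo $(( ( ($1 + 2147483648) % 4294967296 + 4294967296 ) % 4294967296 - 2147483648 ))
}

# Format seconds since the epoch as UTC. The "-d @N" syntax is GNU date.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

last=2147483647                  # 2^31 - 1, the largest signed 32-bit value
next=$(wrap32 $((last + 1)))     # one second later, after wrapping

echo "last 32-bit second:        $(epoch_to_utc "$last")"
echo "one second later (32-bit): $(epoch_to_utc "$next")"
echo "one second later (64-bit): $(epoch_to_utc $((last + 1)))"
```

With GNU date the first line shows 2038-01-19 03:14:07 UTC, the wrapped value lands on 1901-12-13 20:45:52 UTC, and the 64-bit value simply continues to 03:14:08.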

This is now less than 15 years away and plenty of systems that will have problems have already been shipped. We should stop adding to the problem. Most computing, especially computing using Debian or its derivatives, is now done on 64-bit hardware where this issue does not arise. However there is quite a lot of cost-sensitive 32-bit computing still out there, and still shipping new devices (automotive, IoT, TVs, routers, plant control, building monitoring/control, cheap Android phones). Some of that hardware will probably be running Debian or its derivatives. Other binary distros are dropping 32-bit support (RedHat/Fedora have already done so, SUSE's support is unofficial), so what is left is more likely to end up in the Debian ecosystem. Most such new hardware will be running build-from-source OSes like OpenEmbedded, or Alpine, Android, or Gentoo, but the Debian-based niche is likely to remain for some years, and some stuff built with it is likely to be in use/installed for long enough to hit Jan 2038.

Debian is primarily concerned about the armhf architecture as the one 32-bit architecture most likely to still be getting significant usage in new systems over the next decade. But i386, armel, mipsel (and hppa, hurd-i386, powerpc, m68k, and sh4 ports) are also affected. Other 32-bit architectures already use 64-bit time: x32, riscv32, arc, and loong32.

64-bit architectures are not affected by the y2k38 problem, but they are affected by this transition.

Because you have to have LFS if you have 64-bit time_t (glibc enforces this), this goal is a superset of ReleaseGoals/LFS.

Background

time_t appears all over the place. 6429 of Debian's 35960 packages have time_t in the source. Packages which expose structs in their ABI which contain time_t will change their ABI. All such libraries need to migrate together, as is the case for any library ABI change.

glibc 2.34 provides support for both the existing 32-bit ABI/API and a new 64-bit ABI/API. However it does not provide a switch forcing use of the new ABI/API: each build/package chooses explicitly to use the 64-bit ABI/API (by setting _TIME_BITS=64). This is a problem for Debian, as in a normal transition we expect that simply building against the new library will get you the new ABI. Something (glibc, dpkg, gcc?) has to say 'use 64-bit time by default'. 1030159 has implemented a DEB_BUILD_OPTIONS=abi=+time64 option as a consistent Debian mechanism.

This transition is similar to the LFS (Large File Support) transition, where glibc also provided both 32-bit and 64-bit APIs, and opting in to the 64-bit one (by setting _FILE_OFFSET_BITS=64, or DEB_BUILD_OPTIONS=abi=+lfs which does the same thing) changes the ABI. And just to add to the fun, if you set 64-bit time then glibc enforces 64-bit file offsets, so software that has not yet dealt with LFS will also have to fix that in order to move to 64-bit time. (There are about 75 libraries that change ABI for LFS, but not for time_t.)
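Because 64-bit time depends on 64-bit file offsets, a build that sets only _TIME_BITS=64 is misconfigured. Here is a small sketch of a sanity check for a flag string; the function name check_time64_flags is hypothetical, and it assumes flags are separated by single spaces.

```shell
#!/bin/sh
# Return 0 if the flag string respects the rule that -D_TIME_BITS=64
# requires -D_FILE_OFFSET_BITS=64, and 1 otherwise.
# (check_time64_flags is a made-up name for this sketch.)
check_time64_flags() {
    case " $1 " in
        *" -D_TIME_BITS=64 "*)
            case " $1 " in
                *" -D_FILE_OFFSET_BITS=64 "*) return 0 ;;
                *) return 1 ;;
            esac ;;
        *) return 0 ;;   # 64-bit time not requested, nothing to enforce
    esac
}

for flags in "-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64" "-D_TIME_BITS=64" "-O2"; do
    if check_time64_flags "$flags"; then
        echo "ok:  $flags"
    else
        echo "bad: $flags (needs -D_FILE_OFFSET_BITS=64)"
    fi
done
```

This only inspects the command line. Defines placed in source files or config headers would need a different check.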

Other projects

Links to work in this area by other projects

Choices

We could either transition the ABI within the existing architecture(s), or we can bootstrap a new architecture (with a new triplet and ABI). Initial thoughts were that it was highly uncertain that a transition in place was feasible because too much stuff would break, and thus a new arch was simpler, safer and easier.

New triplet:

  • Strictly speaking a new ABI should mean a new triplet, but in fact we do ABI-changing migrations within an existing triplet all the time (most SONAME changes).
  • If Debian used a new triplet for the new ABI, but all the rest of the Linux world migrated the ABI within the existing triplet it would become very unclear what the existing 'arm-linux-gnueabihf' triplet means. It's quite important that there is cross-distro agreement about the way forward here.
  • There has been very little interest in a new triplet from other distros.

ABI transition:

  • Research to date has suggested that a standard ABI transition will be feasible.
  • We've done large ABI transitions before like LFS and libc5->libc6. So treating this as just another migration makes sense.

We also need to choose whether to do this for all our 32-bit architectures or not, for example one could decide that x32 fulfills the '32-bit x64 with 64-bit time' role and i386 should remain with 32-bit time for compatibility reasons (the ability to run ancient x86 binaries, especially proprietary ones that cannot be updated).

Decision

After a long discussion (mostly about i386) it was decided to do an in-architecture ABI transition for all 32-bit architectures except i386 (and hurd-i386).

The i386 port will be left with the existing 32-bit time_t, as a compatibility architecture for existing x86 binaries. A new 'i686' x86 ABI/architecture using 64-bit time, and potentially newer ISA features, could be created if there was sufficient enthusiasm for dragging 32-bit x86 into its now very limited future. The hurd-i386 port is not going to be switched, as its kernel lacks support, and efforts are underway instead to switch to hurd-amd64.

So, in summary:

  • amd64, arm64, mips64el, ppc64el, riscv64, s390x are all 64-bit, so they already had 64-bit time_t
    • non-release architectures in the same category: alpha hurd-amd64 ia64 loong64 ppc64 sparc64
  • i386 is 32-bit but has been excluded from the 64-bit time_t transition because its major purpose this decade is running legacy 32-bit binaries, a purpose that would no longer be possible if it broke ABI
    • non-release architectures in the same category: hurd-i386
  • There is currently no release architecture that is 32-bit but already had a 64-bit time_t prior to 2024
    • non-release architectures in this category: x32
  • armel, armhf are the two 32-bit release architectures which are changing ABI
    • non-release architectures in the same category: hppa m68k powerpc sh4

Transition in place

We have done large ABI break transitions before such as libc5 -> libc6 ('g' suffix - which still remains today in libpam0g and zlib1g!), and GCC 4.0 C++ ABI ('c2' suffix). However, those affected all architectures, not just old 32-bit ones. We have also done transitions which only affected 'minor' architectures, such as the long double migration from 64-bit to 128-bit on alpha, powerpc, sparc, s390 (2007) ('ldbl' suffix).

A large in-place transition will affect all of Debian, but only benefit the remaining 32-bit arches, so we do need to try and do this reasonably efficiently in order not to hold things up for too long. Fallout from breakage should fall almost entirely on the 32-bit arches that are changing ABIs.

How to help

  • Build your package with DEB_BUILD_OPTIONS=abi=+lfs and _FILE_OFFSET_BITS=64 and _TIME_BITS=64 set and test it on 32-bit systems.

  • Check especially if on-disk files/formats are affected or if there are any ABI/API changes your package exposes.
  • Don't upload your package with these options enabled until any libraries you depend on have been uploaded.
  • Record explicit tests for potential transition bugs on this page.
  • Report bugs on packages that have issues, with tag 'time-t'.
  • Fix bugs tagged 'time-t'.
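One rough way to see whether a 32-bit build actually picked up 64-bit time is to look at which glibc symbols the binary imports. The sketch below filters a symbol listing for names that look like glibc's 64-bit time entry points. The pattern is a heuristic and an assumption: it relies on names such as __time64 and __clock_gettime64, and it will miss time-related functions whose 64-bit variant does not contain "time" in its name. The function name list_time64_symbols is made up for illustration.

```shell
#!/bin/sh
# Read a symbol listing on stdin and print, sorted and de-duplicated,
# the names that look like 64-bit time variants (heuristic pattern).
list_time64_symbols() {
    grep -E -o '__[a-z0-9_]*time[a-z0-9_]*64' | LC_ALL=C sort -u
}

# Demo on synthetic input shaped like "nm -D --undefined-only" output.
printf '%s\n' \
    '         U __clock_gettime64@GLIBC_2.34' \
    '         U __time64@GLIBC_2.34' \
    '         U printf@GLIBC_2.0' \
    '         U time@GLIBC_2.0' | list_time64_symbols
```

On a real build the input would come from something like `nm -D --undefined-only path/to/binary`, where the path is whatever your package produces. An empty result on a 32-bit binary that calls time functions suggests it is still using the 32-bit interfaces.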

Issues

People are rightly worried about stuff that will break when time_t is changed for some but not all of the packages on a running system. However, few things that actually break have been found so far.

Some 32-bit arches have been using 64-bit time for some years, and x86 already went through the 32-bit to 64-bit transition, so things like file formats have generally been made interoperable and quite a lot of potential problems have already been dealt with.

The largest area of uncertainty is in possible issues with changing file formats, database structures, and data passed between programs over IPC mechanisms.

So far as I can determine most language runtimes do not appear to have problems here except insofar as they have C-library interfaces. Those issues resolve to the known C/C++ ABI issue.

  • Haskell will change ABI, but then it changes ABI all the time anyway, and has mechanisms to deal with it.
  • Rust is all statically linked anyway so should not expose any ABI changes, except for external C/C++ libraries.
  • Java, Perl, Python, Go and Ruby all have a consistent internal time representation so will only expose ABI changes on C/C++-library linkage.

Another source of issues is packages that fail to build with the time64 build flags on. The flags can be turned on with:

DEB_BUILD_MAINT_OPTIONS=abi=+time64 dpkg-buildpackage

And explicitly turned off with:

DEB_BUILD_MAINT_OPTIONS=abi=-time64 dpkg-buildpackage
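Before running a full build it can be useful to see what each setting does to the preprocessor flags. This sketch asks dpkg-buildflags for CPPFLAGS under both settings and reports whether -D_TIME_BITS=64 is present. The helper name time64_state is hypothetical, and the actual output depends on the dpkg version and the build architecture, so no particular result is claimed here.

```shell
#!/bin/sh
# Print "on" or "off" depending on whether a flag string defines
# _TIME_BITS=64. (time64_state is a made-up name for this sketch.)
time64_state() {
    case " $1 " in
        *" -D_TIME_BITS=64 "*) echo on ;;
        *) echo off ;;
    esac
}

if command -v dpkg-buildflags >/dev/null 2>&1; then
    for opt in abi=+time64 abi=-time64; do
        # Query the flags dpkg would hand to the build for this setting.
        flags=$(DEB_BUILD_MAINT_OPTIONS=$opt dpkg-buildflags --get CPPFLAGS) || flags=""
        echo "$opt -> time64 $(time64_state "$flags") ($flags)"
    done
else
    echo "dpkg-buildflags not found; skipping" >&2
fi
```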

Please update this section if you know of any related issues.

Note that there are two classes of breakage:

  1. Things that break now due to the transition to 64-bit time.
  2. Things that break in 2038 due to the date actually wrapping.

We only need to worry about the 1st of those for the transition itself. And that transition should fix quite a lot of the things in the 2nd class.

Known Issues

  • The utmp and wtmp files currently record timestamps as 32-bit integers even on most 64-bit architectures. However, in 32-bit programs with a 64-bit time_t these timestamps inadvertently change to 64-bit integers (1042562), creating compatibility problems (1027135).

  • NFSv3 (uses 32-bit time). Some clients may use an unsigned int, and so have another 70 years?
  • INN has time_t embedded in the disk format of its overview and history databases, which will require manual rebuilds when the size of time_t changes. This probably cannot reasonably be done by maintainer scripts and will therefore require manual intervention by the user. (It may be possible to write a migration program that avoids the need for a complete rebuild, but this has not yet been done.) The CNFS storage format does not have problems with its disk format, but the less-used timecaf storage format might (yet to be confirmed).

  • cpio - does it use signed int (31 bits) or 33-bit time (11 octal digits)? - conflicting info exists.
  • 32-bit wine (i386 only). This does not make much sense with 64-bit time. Its whole purpose is to run old i386-ABI binaries. The ABI for this arch (and thus wine-32) should not change.
  • PHP ints are the same size as DEB_HOST_ARCH_BITS, so PHP on 32-bit machines using time() functions will break. The DateTime API exists and will not break, so PHP apps still storing time() results in ints need to be updated. This does not affect the ABI transition itself.

Bug tracking

Please tag all 64-bit-time bugs 'time-t' ([email protected], tag=time-t) in the BTS (see bugs.debian.org/usertags for instructions).

Here is an example (assuming you have a patch file, and a body template file)

reportbug $package  -V $version -A $patchfile --src --subject "Use 64bit time_t" --tag patch --pseudo-header 'User: [email protected]' --pseudo-header 'Usertag: time-t' --no-tags-menu --severity normal --body-file $template

or to tag an existing bug

bts user [email protected] , usertags 12345 time-t

Tests

Please list explicit tests for things you think might break. Assume the people working on this transition know nothing of your software. They will be very grateful for commands/tests they can run (and expected results) to see if things are working correctly.

Milestones

We are already late with this transition and upstreams are already moving, so doing something has become quite urgent. A plan to do the transition early in the Trixie cycle was proposed in May 2023 (https://lists.debian.org/debian-devel/2023/05/msg00168.html). We hope this will occur in January 2024.

  1. Make a complete list of libraries whose public ABI changes and which must transition together.
  2. Change gcc-* to emit -D_FILE_OFFSET_BITS=64 and -D_TIME_BITS=64 by default.
  3. Change dpkg-buildflags to emit -D_FILE_OFFSET_BITS=64 and -D_TIME_BITS=64 on all 32-bit arches except i386 and hurd-i386 (filter this out for 100-odd packages which are sensitive to LFS but not time_t).
  4. NMU all libraries with binaries renamed from libfoo to libfoot64, removing old suffixes (c102, c2, ldbl, g…) if present, and emit a Provides/Replaces/Breaks libfoo on 64-bit arches + i386 and hurd-i386.
  5. Do unchanged source rebuilds (binNMUs on all architectures) of 5000-6000 packages which depend on those. By the magic of transitions this just works.
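The renaming rule in step 4 can be sketched as a small shell function. This is an illustration only: the function name t64_name and the package names in the demo are hypothetical, the suffix list is just the one named above (c102, c2, ldbl, g), and real packages may have been handled differently case by case.

```shell
#!/bin/sh
# Derive the t64 binary package name from an existing library package
# name: drop a known old ABI suffix that follows a digit, then append
# "t64". Names already ending in t64 are returned unchanged.
# (t64_name is a made-up helper for this sketch.)
t64_name() {
    case "$1" in
        *t64) echo "$1"; return 0 ;;
    esac
    base=$(printf '%s\n' "$1" | sed -E 's/([0-9])(c102|c2|ldbl|g)$/\1/')
    echo "${base}t64"
}

# Hypothetical package names, one per old suffix style.
for pkg in libfoo1 libfoo1g libbar2c2 libbaz3ldbl libqux4c102 libfoo1t64; do
    echo "$pkg -> $(t64_name "$pkg")"
done
```

Requiring a digit before the suffix avoids mangling names that merely end in the letter g.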

ABI transition

Packages which expose structs in their ABI which contain time_t will change their ABI. We are analysing the set of packages involved, and initial investigations (in Ubuntu) produced the following:

  • Of 4590 library packages: 1925 analyser failed, 2665 checked
  • 387 changed ABI, 2278 did not. (15%) (Ubuntu tests)
  • So maybe 500-600 libs in transition

The Debian analysis looks like this so far:

  • Total *-dev packages: 10323
  • 4963 are golang-*, librust-* and libghc-* which we can ignore, leaving 5360 packages
  • 5237 packages have a .so in them to check
  • 329 changed ABI due to time_t, 58 changed ABI due to enabling LFS, making 387 (7%)

  • 1840 did not change ABI (34%)

  • 1637 failed to run the abi-compliance-checker (31%)

  • implies 400-500 packages in the transition
