How to automatically keep track of packages installed/removed by user using dnf

I want to:

  • maintain a list of user-installed programs on my system
  • host it on the cloud (e.g. Dropbox)
  • update it when programs are installed or removed

How can I achieve the last point? Specifically, how could it be triggered after running dnf install or dnf remove?

For the first two points, I am using the following simple script, adapted from this tip:

LIST_FILE=~/Dropbox/userinstalled_$(hostname).txt

dnf --cacheonly repoquery --qf "%{name}" --userinstalled > "$LIST_FILE"

Since the motivation is for backup and restore purposes, an alternative would be to run the script before running backup software, but it might be nice to keep it more frequently updated after each program installation/removal.


You probably want to make a dnf plugin. AFAIK there’s no simple tutorial for this, but here are some links that might help:

Best of luck!

How quickly do you need it to update?

If once a day is sufficient, you could just adapt the old /etc/cron.daily/rpm script, at least conceptually. (I think it used to be “standard equipment”, like 20 Fedora releases ago. It would update a list of installed packages at /var/log/rpmpkgs once a day — I still install a modified, more verbose version on all my systems, because I like to have that record locally.)

If updated-within-24-hours is too slow, there’s even /etc/cron.hourly, which would still let you avoid writing dnf plugins and whatnot just for that tiny bit of extra immediacy.
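A cron-based version of the list updater might look something like this (a sketch, untested; the list location follows the first post’s convention, and the guard/helper structure is my own):

```shell
#!/bin/sh
# Hypothetical /etc/cron.hourly (or cron.daily) job -- a sketch, not a tested
# script. Note that cron jobs run as root, so in practice $HOME would need to
# be replaced with the actual user's home directory.

LIST_FILE="$HOME/Dropbox/userinstalled_$(hostname).txt"

update_list() {
    # --cacheonly keeps the hourly run from triggering a metadata refresh
    dnf --cacheonly repoquery --qf "%{name}" --userinstalled > "$1"
}

# Guard so the script is a harmless no-op where dnf or the Dropbox
# directory is absent
if command -v dnf >/dev/null 2>&1 && [ -d "$HOME/Dropbox" ]; then
    update_list "$LIST_FILE"
fi
```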

Hmm… a dnf plugin sounds like overkill. Also, would I really want to override transaction? I want to log only user-installed packages. Updates or system installed packages are not relevant and ideally should not trigger the script.

There is no /etc/cron.daily/rpm file on my system. I thought of a cron task, but that seems overkill too.

In the worst case I could just run the script before making backups. E.g. Vorta, a frontend to borgbackup, has a field for commands to run before and after a backup.

Yeah, like I said /etc/cron.daily/rpm was from back in like the “Fedora (single digit)” era. But here’s the one I still use:

#!/bin/sh

renice +15 -p $$ >/dev/null 2>&1

tmpfile=$(/bin/mktemp /var/log/rpmpkgs.XXXXXXXXX) || exit 1
/bin/rpm -qa --last | tac > "$tmpfile"

if [ ! -s "$tmpfile" ]; then
	rm -f "$tmpfile"
	exit 1
fi

/bin/mv "$tmpfile" /var/log/rpmpkgs
/bin/chmod 0644 /var/log/rpmpkgs

Not sure I really see how it’s overkill; cron is kind of the path of least resistance for housekeeping tasks (which this is). If you want to keep your remote list updated, something’s gotta update it: either that happens automatically in response to changes, or regularly on a schedule. And if it’s stored in Dropbox, Dropbox will even handle versioning and checking whether there are any content changes each time the list gets written.

(Actually, I’d recommend something like the tempfile trick from that cron script: Run the command redirected to /tmp/something, then mv that file over the previous one in Dropbox. Output redirection that overwrites existing files doesn’t always sit well with the Dropbox sync daemon.)
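That tempfile trick could be wrapped in a small helper, something like this (the helper name and argument convention are my invention; the empty-file check mirrors the cron script above):

```shell
#!/bin/sh
# Write-then-rename pattern: output goes to a temp file first, then a single
# mv replaces the destination, so the Dropbox sync daemon never sees a
# half-written file.

atomic_update() {
    dest=$1; shift
    # Create the temp file next to the destination so mv stays on one filesystem
    tmpfile=$(mktemp "${dest}.XXXXXXXX") || return 1
    # Run the remaining arguments as a command, output into the temp file
    "$@" > "$tmpfile" || { rm -f "$tmpfile"; return 1; }
    # Refuse to clobber the old list with an empty file (as the cron script does)
    [ -s "$tmpfile" ] || { rm -f "$tmpfile"; return 1; }
    mv "$tmpfile" "$dest"
}

# Usage, with the dnf invocation from the first post:
#   atomic_update ~/Dropbox/userinstalled_$(hostname).txt \
#       dnf --cacheonly repoquery --qf "%{name}" --userinstalled
```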


Thanks for sharing the cron script.

By overkill I mean it would run far more often than needed, since packages aren’t often installed/removed. Granted, running it literally takes less than a second…

Since the objective is for backups, I think I’ll just incorporate running the script into the backup routine, as frequent or infrequent as that may be.


Ah, that makes sense.

Though, you could focus on the side benefits. If you’re running dnf once an hour to update your installed package list, you’d probably spend a lot less time waiting for metadata refreshes at the beginning of dnf transactions, since the last run of the cron job would already have triggered a recent-enough update. :wink:

Hourly! Now that really is overkill. :wink:

I don’t see why metadata needs to be refreshed to check the installed package list. I don’t think dnf repoquery --qf "%{name}" --userinstalled does that.

On a side note, how recent is recent enough for metadata refreshes not to be triggered?

It certainly shouldn’t need to be, and you may be right. I’ve gotten used to it refreshing before all sorts of operations, if it’s been a while since the last refresh, but I don’t use it much for installed-only queries. (I’ll typically go straight to rpm, for those. However, --userinstalled is one of the reporting dimensions that rpm doesn’t know about, so that one’s a dnf repoquery exclusive.)

Your guess is as good as mine, there appears to be some sort of <jazzhands> algorithm </jazzhands> at work in making that decision. I’ve seen it refresh data that was mere minutes old, and I’ve seen it skip a refresh despite the cache being hours old.

Just now I experimentally ran a dnf repoquery --unneeded. (Which, as documented, will “Display only packages that can be removed by ‘dnf autoremove’ command.”, so in theory it shouldn’t need remote repo data either.) dnf refreshed only the updates and rpmfusion-free-tainted repos, out of the 23 I have enabled, but not any of the rest. ¯\_(ツ)_/¯


A “solution” is to always use --refresh, i.e. sudo dnf update --refresh. That way there are no surprises: it will always refresh the metadata, so updates can be enjoyed as quickly as possible. :smiley:

Sorry I’m late, and I don’t have a working example.
Going back to the original question: you can monitor a file for changes using the Linux kernel subsystem :open_mouth: called inotify.
There are various tools for interacting with it (see dnf search inotify; incron looks interesting).
In practice, when a file (or a directory) is modified (or accessed, etc.) you can trigger a command. What file should you monitor in your case? Well, I guess the file DNF uses to keep its transaction history (/var/lib/dnf/history.sqlite) could be a good candidate.
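A rough sketch of that idea with inotifywait (from the inotify-tools package; incron would be a daemonized alternative). Untested, and both paths are the guesses from this thread:

```shell
#!/bin/sh
# Sketch: regenerate the user-installed list whenever DNF's history
# database changes. Requires inotifywait from inotify-tools.

HISTORY_DB=/var/lib/dnf/history.sqlite
LIST_FILE="$HOME/Dropbox/userinstalled_$(hostname).txt"

watch_and_update() {
    # Block until the database is written and closed, then refresh the list
    while inotifywait -qq -e close_write "$HISTORY_DB"; do
        dnf --cacheonly repoquery --qf "%{name}" --userinstalled > "$LIST_FILE"
    done
}

# watch_and_update loops forever, so don't call it from cron; run it from
# something long-lived instead (e.g. a systemd user service or login script).
```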

Yeah… Is there a file that tracks user-installed packages? That distinction is important, since I see no point in tracking all (system-)installed packages.

But in what way would this user install a package?

I usually use dnf install or dnf remove.

The rationale is to remember what packages I installed manually. Whatever system or dependency packages got automatically installed don’t matter, because if I were to reinstall or restore the OS, those would get sorted out on their own.

Is the goal to have a list of installed packages in order to reinstall them in a future OS fresh install?

Yes, a list of all user-installed packages which would not be installed by default on a fresh OS install.

Well, I don’t know how to do it if you don’t have the initial package list from the freshly installed OS.
Maybe by tinkering with dnf history?

However, even if you only have the list of all packages currently installed, it is not a problem to pass such a list to dnf during a future fresh OS reinstallation: the already-installed packages will be skipped.

Someone mentioned userinstalled.
Also, dnf history userinstalled is worth a try.


As mentioned in the first post, dnf repoquery --qf "%{name}" --userinstalled works exactly as intended.

dnf history userinstalled is not ideal because its output contains version, arch, and Fedora release information instead of strictly the package name. That list won’t work as-is in the future.
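For illustration, the version/release/arch suffix could in principle be stripped after the fact, though this is fragile compared to just using repoquery (the sed pattern and helper name are my own quick hack, and the sample string is made up):

```shell
#!/bin/sh
# Quick hack: cut a NEVRA-style string ("name-epoch:version-release.arch")
# down to the bare package name by dropping everything from the first
# "-<digit>" onward. Fragile: a package name that itself ends in "-<digit>"
# would be truncated incorrectly.

strip_nevra() {
    sed 's/-[0-9].*$//'
}

# Hypothetical usage:
#   dnf history userinstalled | strip_nevra
echo 'bash-0:5.2.26-3.fc40.x86_64' | strip_nevra   # prints "bash"
```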

Keeping a list of all installed packages is also not good, because system packages and dependencies may change over time. Reinstalling from an old list may not work, or may install unnecessary packages.

I think we can consider this topic sufficiently explored. As it turns out, the first post already contains the best/simplest solution. I’m just surprised a better solution does not seem to exist.

As an answer to the specific issue here (not the general one of running a script after a command), perhaps the simplest approach: just keep a list? I have this script that I continuously modify, for example:

Automation always has an up-front cost, and if you want to do it robustly and correctly, the dnf plugin would be the way to go, as @refi64 suggested.