Improved backup handling

Small things do make a difference. This week's release brought some small changes to opsi-backup that should make handling opsi backups easier in day-to-day life.

The first change is improved automatic detection of restorable data. You no longer need to specify which backends should be restored; by default, the new mechanism restores all backend data found in your backup file. For most cases this boils a restore down to opsi-backup restore <backupfile>.

The second change: if a combination of options is given that would result in no restore being performed, the process is aborted - which is much easier to spot.

And the last change: it is now possible to get a short listing of what is in a backup (the backend types and whether configuration data is included) with the command opsi-backup list <filenames>.

Go give it a try and send us some feedback! All you need is opsi-utils 4.1.1.22 or newer, which can currently be found in the testing branch.

API: What is the signature of a method?

The opsi API does not change often, but it happens. Additions are made as part of the normal development cycle, while removals usually only appear in larger releases like opsi 4.1. Method signatures can also change, and the same rules apply to them.

If one of these changes affects a method you use, there are different ways to handle it.

Checking the version

If you know in which version a change appeared, you can use that knowledge to check for a specific version. Reading the changelog of python-opsi is usually a good starting point to find out what changed; python-opsi is the module that contains the business logic and the API.

The API method backend_info returns a JSON object that contains opsiVersion, which is essentially the version of python-opsi.

You can easily try it for yourself:

curl -X POST --user youruser --data '{"params": [], "id": 1, "method": "backend_info"}' https://localhost:4447/rpc

This approach works very well for the object-oriented API methods. These have the form objectType_action.
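
If you want to do this check from a script rather than with curl, a minimal sketch in Python might look like the following. It assumes the requests library is installed; the URL, the credentials and the 4.1 threshold are placeholders you would adapt to your setup, and certificate verification is disabled because an opsi server usually presents a self-signed certificate.

import requests

response = requests.post(
    "https://localhost:4447/rpc",
    json={"id": 1, "method": "backend_info", "params": []},
    auth=("youruser", "yourpassword"),
    verify=False,  # the default opsi certificate is self-signed
)
response.raise_for_status()
info = response.json()["result"]

opsi_version = info["opsiVersion"]
print("python-opsi version on the server:", opsi_version)

# Compare the major.minor part, e.g. to require at least opsi 4.1
if tuple(int(part) for part in opsi_version.split(".")[:2]) >= (4, 1):
    print("Server is new enough.")
else:
    print("Falling back to the old behaviour.")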

Checking method signature

In opsi we also have dynamically loaded backend extensions. The official extensions are distributed through the package python-opsi, but nothing keeps you from using extensions from different versions. There may also be custom extensions that are not distributed through python-opsi at all. Methods defined through extensions usually don't follow the objectType_action naming convention, but this is not enforced. As you can see, checking just the version may not be enough when a method originates from a backend extension.

Luckily for us, it is possible to get a description of the available API methods and their signatures through another API call: backend_getInterface.

curl -X POST --user youruser --data '{"params": [], "id": 1, "method": "backend_getInterface"}' https://localhost:4447/rpc

This lists all available methods with, among other things, their name, params and defaults. The values listed in params show how many parameters a method accepts and what their names are. A parameter with one leading asterisk (e.g. *attributes as a parameter of host_getObjects) is optional. A parameter with two leading asterisks (e.g. **filter as a parameter of host_getObjects) is also optional and expects a JSON dictionary as its value. If defaults are listed, they contain the default values of the optional parameters, excluding those with two asterisks.
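
The same JSON-RPC call can of course be scripted as well. The following sketch (again assuming the requests library and placeholder credentials) fetches the interface description and looks up the signature of host_getObjects, which is used here purely as an example:

import requests

response = requests.post(
    "https://localhost:4447/rpc",
    json={"id": 1, "method": "backend_getInterface", "params": []},
    auth=("youruser", "yourpassword"),
    verify=False,  # the default opsi certificate is self-signed
)
response.raise_for_status()
interface = response.json()["result"]

wanted = "host_getObjects"
method = next((m for m in interface if m["name"] == wanted), None)
if method is None:
    print("%s is not available on this server" % wanted)
else:
    print("%s accepts the parameters %s" % (wanted, method["params"]))
    print("Defaults of the optional parameters: %s" % method["defaults"])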

Summary

Handling multiple API versions with opsi isn't too hard, but it all depends on what was changed.

With backend_info you can easily check the version.

The powerful backend_getInterface lists the methods exposed through the API. This can be used to handle changed methods, but it is also useful to see what methods are available on the server.

opsi-linux-bootimage and new Dell devices

Within the last week we encountered a problem with our opsi-linux-bootimage in combination with new Dell devices. The problem results in a black screen and a seemingly dead machine whenever a Windows or 32-bit Linux netboot installation runs. Analysis is hard because the machine sits in this black screen of death before the opsi-linux-bootimage kernel has even started properly. However, we have a workaround for this issue: use the 64-bit bootimage.

By default, setting a Windows netboot product to setup triggers opsipxeconfd, which uses a template to generate a specific PXE configuration. This file is placed within the opsi tftpboot directory, which is /tftpboot/linux on most opsi-supported distributions (or /var/lib/tftpboot/opsi on SUSE-based distributions). Within this tftpboot directory the PXE templates are located in the pxelinux.cfg directory; the relevant template is the file install. This file has to be modified to use the 64-bit kernel and miniroot of the opsi-linux-bootimage. After the modification the template should look like this:

default opsi-install

label opsi-install
  kernel install-x64
  append initrd=miniroot-x64.bz2 video=vesa:ywrap,mtrr vga=791 quiet splash --no-log console=tty1 console=ttyS0

The downside of this modification is that after an update one has to modify the template again. We are currently working on a permanent solution.

opsiconf 2018

It has already been a week since the first ever opsiconf took place in Mainz, Germany.

I may be a bit biased, as I was involved in some smaller parts of the preparation, although my focus before the conference was getting a stable opsi 4.1 release out. However, I don't think you had to be involved to have a good time.

What I enjoyed most were the discussions with various opsi users. I got a lot of input and was able to help users with my knowledge. There was a great feeling of community - it didn't matter whether you were an opsi veteran or just starting out with opsi, everyone was welcome.

It was great for me to hear how many people actually use the API to automate their workflows with opsi. There were lightning talks from users showing how they integrated opsi into their systems, which I found very inspiring!

It certainly boosts my wish to improve the API documentation and make its usage simpler - especially for newcomers. If you have any questions now, don't hesitate to post them in the fresh opsi development subforum.

The conference was also used to announce the public repositories for opsi on ARM devices. Head over to the announcement in the forums to read more.

I am really happy with how everything went. There is still some room for improvement, but it was a good debut in my eyes. If there is another conference (and I hope there will be), I'd love to see more talks given by users next time.

PS: This wasn't the first opsi-centered conference - there was the opsi4instituts conference in 2017. Our community sure is busy :)

News from the Machine Room - January 2018

The weeks since the start of the year have been quite busy.

A major opsi release is lurking around the corner, and even though development is mostly done, there are a lot of other things to do in preparation for it. The most obvious one is probably improving the documentation for the release.

But for me these weeks have also been filled with writing a lot of small utility scripts for our internal use. The "batteries included" with Python did help me a lot with this!

For the next week I expect to work on some finishing touches in the documentation and tweak our OBS build configuration.

opsi-script: Checking for opsi 4.1

Since this week, public repositories for opsi 4.1 are available, so now is a good time to prepare for the switch to opsi 4.1. If you also manage your opsi servers with opsi, chances are that you want to distinguish between servers running opsi 4.0 and opsi 4.1.

That was the case for me while working on an opsi script that makes use of the new .repo file feature in opsi 4.1 for easy configuration of opsi-package-updater.

If the script is run on a server that does not support this feature, I wanted it to abort with a meaningful message.

This is what I came up with:

[Actions]
ShellInAnIcon_check_opsi_version
if "0" = getLastExitCode
    message "Doing some awesome opsi 4.1 stuff..."
else
    isFatalError "Only usable on opsi 4.1 or newer."
endif


[ShellInAnIcon_check_opsi_version]
set -x
export PATH=/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin
python -c "from OPSI import __version__ as v; print(v)"
exit $?

Improving our integration tests

In the last weeks I focused on our integration test environment. For opsi 4.1 we are increasing the number of tests we run: in addition to the tests described here, we added automated migration tests. All these tests should not only be reliable but also fast, so that it is possible to run them often.

One downside of our test implementation was that it spent a lot of time in sleep. If we knew a reboot would happen, we inserted a sleep 120 to wait 120 seconds and hope the machine was up afterwards. Two minutes is a lot of time, but under heavy load that sometimes still wasn't enough for the machine to be up and accessible. So in some places this got increased to 180 seconds to make really, really sure the machine could be reached. That made the tests a little more reliable, but also slower. The overall runtime of one of these tests is usually somewhere between two and four hours; three minutes do not seem like much, but there is usually more than one sleep in each test.

So I went on to replace all the sleeps with something better. As mentioned before, the sleeps are usually inserted while we wait for a port to become reachable - usually SSH or opsiclientd, which we then use for further work.

My approach was to write a small Python script that does the waiting for us. Connection attempts are made with a short delay between checks until the port becomes reachable; once it is, the script exits. If a connection cannot be made within five minutes, it is assumed that something went wrong and the script ends with a non-zero exit code. The exit code makes the script easy to use inside our Jenkins stages, as it marks the step as failed.

The script does have a few tricks though. If a port is reachable right away, it exits right away. It also understands different kinds of waiting: it can wait for a port to come up, for a port to go down, or for a reboot to happen (the port goes down and then comes up again).
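
The script itself is part of our internal tooling, but a minimal sketch of the idea could look like the one below. The command-line interface (host, port, --mode and --timeout) is made up for this illustration; the real script may differ in its details.

#!/usr/bin/env python3
# Minimal sketch of a "wait for port" helper - an illustration, not our internal script.
import argparse
import socket
import sys
import time


def port_open(host, port, timeout=3):
    "Return True if a TCP connection to host:port can be established."
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(condition, host, port, limit, delay=5):
    "Poll until condition(host, port) is true or 'limit' seconds have passed."
    end = time.time() + limit
    while time.time() < end:
        if condition(host, port):
            return True
        time.sleep(delay)
    return False


def main():
    parser = argparse.ArgumentParser(description="Wait for a port to change its state.")
    parser.add_argument("host")
    parser.add_argument("port", type=int)
    parser.add_argument("--mode", choices=["up", "down", "reboot"], default="up")
    parser.add_argument("--timeout", type=int, default=300, help="seconds to wait per state change")
    args = parser.parse_args()

    if args.mode == "up":
        ok = wait_until(port_open, args.host, args.port, args.timeout)
    elif args.mode == "down":
        ok = wait_until(lambda h, p: not port_open(h, p), args.host, args.port, args.timeout)
    else:  # reboot: the port has to go down and then come up again
        ok = (wait_until(lambda h, p: not port_open(h, p), args.host, args.port, args.timeout)
              and wait_until(port_open, args.host, args.port, args.timeout))

    sys.exit(0 if ok else 1)


if __name__ == "__main__":
    main()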

With the script in use we managed to reduce the overall runtime by a few minutes. Not overwhelming, but in the end every minute counts. It also improves the reliability of our tests. And there is another neat advantage: most of the tests are implemented as functions, so the changes made for the opsi 4.1 tests also benefit the runtime of our opsi 4.0 tests.

Using jq to work with JSON

I often see JSON being processed on the shell with tools like grep and awk. Sure, this works, but it usually depends on the output having only one item per line and on the keywords you are grepping for not appearing anywhere else.

So I was very delighted to learn that there is a better way: the command-line tool jq.

With jq you can filter, slice and alter JSON. And you get some highlighting of the output which makes it nicer to read than plain text!

Here are two small examples I tried out today to process output from opsi-admin:

opsi-admin -d method host_getHashes | jq '.[] | .id + ": " + .lastSeen'

This will list all your clients along with the time they were last seen. The output on my test machine looks like this:

...snip...
"vtest16r.uib.local: 2013-09-08 21:28:50"
"vtest18-w2k-r.uib.local: 2008-05-24 17:58:35"
"zedach.uib.local: 2011-01-19 15:47:18"
...snip...

And to see what installation status the products on your clients have, you can use this snippet:

opsi-admin -d method productOnClient_getHashes | jq '.[] | "On " + .clientId + " the product " + .productId + " is " + .installationStatus'

Again with some example output:

...snip...
"On pcbon4.uib.local the product xnview is installed"
"On fscnoteb1.uib.local the product xpknife is unknown"
"On vtest16r.uib.local the product xpknife is installed"
"On fscnoteb1.uib.local the product yed is installed"
"On hpnoteb1.uib.local the product yed is installed"
"On pcbon4.uib.local the product yed is installed"
...snip...

Creating new JSON data is also possible:

opsi-admin -d method productOnClient_getHashes | jq '.[] | {hostId: .clientId, productId: .productId, status: .installationStatus}'

This returns something like the following (note: not a JSON list, but multiple standalone dictionaries):

...snip...
{
  "hostId": "vmex12w10x64c.uib.local",
  "productId": "winscp",
  "status": "installed"
}
{
  "hostId": "vmex12w10x86.uib.local",
  "productId": "winscp",
  "status": "installed"
}
...snip...

If you want to know more, I'd suggest starting with the documentation. If you do not want to install anything on your machine, there is even an online version to play with.

Smoother transition to opsi 4.1

Work towards a public opsi 4.1 release is making progress. Today's release has brought us a small step closer to a seamless migration from 4.0 to 4.1.

The release includes an opsi-atftpd that provides opsi-tftpd, which the updated opsipxeconfd, opsi-depotserver and opsi4ucs all require. This makes it easier to switch to a different tftpd without requiring any further user interaction.

opsi 4.1 will be released in a separate repository. Our current migration path is to add the new repository and then run apt, yum or zypper with their upgrade options to migrate to the new version. To finish the migration, you run opsi-setup with the update parameter for the backends you use - and that is it.