Creating a sosreport on CoreOS

With OpenShift 4, Red Hat introduced Red Hat Enterprise Linux CoreOS. It is a very minimalistic operating system, focused on running container workloads.

This new minimalism comes with some challenges. You cannot simply install RPM packages and most of the tools we know and love are missing! Luckily, there is the Red Hat-supplied toolbox container, which contains all the necessary tools and is nicely integrated.

So to start the toolbox, first run oc debug node/<nodename>. This will start a privileged container on the node you specify, mount the host filesystem on /host and drop you into a shell:

$ oc debug node/worker-0.lab.openshift.krenger.ch
Starting pod/worker-0labopenshiftkrengerch-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# toolbox
Container started successfully. To exit, type 'exit'.
sh-4.2#

Now we are running in the toolbox container on our CoreOS host with all the tools we know at our disposal, for example sosreport:

sh-4.2# sosreport

Running sosreport will generate the report archive in /host/var/tmp/, which means it will be accessible in /var/tmp/ on the CoreOS host itself.
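
To copy the archive off the node afterwards, one possible approach is to stream it through another debug pod (a sketch only; the file name below is an example, use the one sosreport printed at the end of its run):

$ oc debug node/worker-0.lab.openshift.krenger.ch -- cat /host/var/tmp/sosreport-example.tar.xz > sosreport-example.tar.xz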

Linux Magic Reboot

If you have worked with remote Linux servers before, I am guessing you have already encountered machines that just don’t want to reboot. This is typically due to screwed-up network mounts or stuck processes, so the server hangs during shutdown. But it turns out that there are other ways to reboot a server.

One of these is the “Magic SysRq key“. To reboot a server using the SysRq trigger in the kernel, use the following two commands. First, enable the trigger:

echo 1 > /proc/sys/kernel/sysrq

Then, reboot the server the magic way by typing

echo b > /proc/sysrq-trigger

Note that this will reboot the server without unmounting or syncing the filesystems! There are also other options available via the SysRq trigger; some of them are listed in the Wikipedia article on the Magic SysRq key.
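
If the filesystems still matter to you, a slightly gentler sequence is to sync and remount read-only first, and only then trigger the reboot (s syncs all filesystems, u remounts them read-only, b reboots):

echo s > /proc/sysrq-trigger
echo u > /proc/sysrq-trigger
echo b > /proc/sysrq-trigger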

Ansible: “python2 yum module is needed for this module”

I am currently toying around with GlusterFS and I am using Ansible to deploy and configure my server.

Using the yum module, I wanted to install the Gluster server package like so:

- name: Install glusterfs-server package
  yum:
    name: glusterfs-server
    state: latest

But when executing the playbook, the module failed with the error from the title: “python2 yum module is needed for this module”.

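The yum module depends on the Python 2 yum bindings, so on hosts where Ansible runs under Python 3 a common way around this error is to use the dnf module instead (a sketch of that approach, not necessarily the fix described in the full post):

- name: Install glusterfs-server package
  dnf:
    name: glusterfs-server
    state: latest

Alternatively, ansible_python_interpreter can be pointed at a Python 2 interpreter that has the yum bindings installed.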

Fedora 25: How to make “GNOME Classic” default

As I am working more and more with Linux, I am also using a virtual machine with Fedora 25 installed to play around with some things (notably Docker and Kubernetes). On Fedora 25, the default GNOME desktop environment is GNOME 3. But I personally prefer the GNOME Classic user interface.

To change the desktop environment, select “GNOME Classic” from the session chooser on the login screen before logging in.

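GDM remembers the session you selected, so it will be used again on the next login. If you want to set the default explicitly instead, one place to do so is the AccountsService record for your user (a sketch; the path and key are assumptions for a standard Fedora installation):

# /var/lib/AccountsService/users/<username>
[User]
XSession=gnome-classic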

Workaround for WMI client over IPv6

Some years ago, I wrote some examples for the WMI client on Linux. I still get a lot of queries from people trying to use the WMI client to access Windows hosts and I am often happy to help if there are any problems.

One of the latest problems occurred when trying to access a Windows host over IPv6:

$ wmic -U 'user%password' //FD00:180::0:0:0:0:0 "Select Caption From Win32_OperatingSystem"
[..]
UNKNOWN - The WMI query had problems. The error text from wmic is:
[librpc/rpc/dcerpc_util.c:343:dcerpc_parse_binding()] Unknown dcerpc transport 'FD00'
[librpc/rpc/dcerpc_connect.c:337:dcerpc_pipe_connect_ncacn_ip_tcp_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_ncacn_ip_tcp_recv
[librpc/rpc/dcerpc_connect.c:828:dcerpc_pipe_connect_b_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_b_recv
[wmi/wmic.c:196:main()] ERROR: Login to remote object. NTSTATUS: NT_STATUS_NO_MEMORY - Memory allocation error

This was quite a funny problem, because the same query worked when accessing the host over IPv4. So we quickly suspected that the WMI client does not support IPv6. Looking at the underlying Samba code (e.g. dcerpc_util.c and binding.c), I guessed that this was a parsing issue of some kind: the message “Unknown dcerpc transport 'FD00'” suggests that everything up to the first colon of the IPv6 address is being interpreted as the transport name.


OpenShift: List all pods in cluster

I recently started working with OpenShift and needed to get a list of all pods on the cluster. I quickly glanced at the documentation but could not find what I wanted. My colleagues quickly pointed me in the right direction:

oc get pod --all-namespaces -o wide

Here is the command with some example output of what to expect:

# oc get pod --all-namespaces -o wide
NAMESPACE                                 NAME                                                       READY     STATUS               RESTARTS   AGE       IP               NODE
my-project                                my-pod-43-d9mo6                                            1/1       Running              0          1d        192.168.0.183    node3.krenger.local
yet-another-project                       another-pod-43-7g3r0                                       1/1       Running              0          2d        192.168.0.184    node4.krenger.local
[..]

If you just want to know which pods are on a certain node, use oc adm manage-node:

oc adm manage-node node3.krenger.local --list-pods
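
If oc adm manage-node is not available on your version, a plain oc get with a field selector should give the same list (a sketch; field selectors on spec.nodeName require a reasonably recent OpenShift release):

oc get pods --all-namespaces -o wide --field-selector spec.nodeName=node3.krenger.local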

DBA_INDEXES and DBA_SEGMENTS mismatch

This is a fun one.

We developed a script to process certain indexes on a database, but somehow it kept missing some of the indexes that clearly existed. We then found the problem: DBA_INDEXES had more entries than DBA_SEGMENTS. See the following example:

SQL> SELECT owner, index_name as i_name from dba_indexes WHERE owner = 'SIMON'
  2  MINUS
  3  SELECT owner, segment_name as i_name FROM dba_segments WHERE owner = 'SIMON';

OWNER                I_NAME
-------------------- --------------------------------------------------
SIMON                IDX_...
SIMON                IDX_...
SIMON                IDX_...
SIMON                PK_...
SIMON                PK_...
SIMON                UNQ_...
SIMON                UNQ_...

7 rows selected.

So here we see that there are clearly indexes for which there are no segments. We then looked at the tables on which these indexes were created and noticed one particular thing: all of the corresponding tables were empty.

The reason for this behaviour is called “Deferred Segment Creation”. This means that when a “CREATE TABLE” statement is issued and no rows are inserted, no segments are created. This behaviour can be controlled by the DEFERRED_SEGMENT_CREATION parameter.

This makes sense in large schemas, where not all tables are populated. Instead of having segments created and extents allocated, only the definition of the table is saved. As soon as the table has at least one row, the segments are automatically created.
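
To check or change this behaviour, something along the following lines should work (a sketch based on the documented parameter and the SEGMENT CREATION clause; adapt the table to your environment):

SQL> SHOW PARAMETER deferred_segment_creation
SQL> ALTER SYSTEM SET deferred_segment_creation = FALSE;
SQL> CREATE TABLE demo_table (id NUMBER) SEGMENT CREATION IMMEDIATE;

The last statement forces segment allocation for a single table, regardless of the parameter setting.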

Oracle 12c: Show PDB saved state

As of version 12.1.0.2, Oracle introduced “PDB State Management Across CDB Restart”, which will automatically put a pluggable database into a previously defined state on database startup. To show the current saved state of the PDBs, you can query the documented view cdb_pdb_saved_states:

SQL> SELECT con_name, instance_name, state FROM cdb_pdb_saved_states;

CON_NAME                       INSTANCE_NAME                  STATE
------------------------------ ------------------------------ --------------
P1                             cdb2                           OPEN
P2                             cdb2                           OPEN

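For reference, such a saved state is created (and removed again) roughly like this, assuming a PDB named P1:

SQL> ALTER PLUGGABLE DATABASE p1 OPEN;
SQL> ALTER PLUGGABLE DATABASE p1 SAVE STATE;
SQL> ALTER PLUGGABLE DATABASE p1 DISCARD STATE;
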
But beware: When you unplug and plug in the database, this saved state will be lost.

I think this is one of the more awesome improvements in 12.1.0.2, since the original startup trigger was more of a workaround than a real solution.