Soft lockup messages from Linux kernel running in an SMP-enabled virtual machine

Details:

When running a Linux kernel in a symmetric multiprocessing (SMP) enabled virtual machine, messages similar to "BUG: soft lockup detected on CPU#1!" are written to the message log file. The exact format of these messages varies from kernel to kernel, and they might be accompanied by a kernel stack backtrace.
Many Linux kernels have a soft lockup watchdog thread, and report soft lockup messages if that watchdog thread does not get scheduled for more than 10 seconds. On a physical host, a soft lockup message generally indicates a kernel bug or hardware bug. When running in a virtual machine, this might instead indicate high levels of overcommitment (especially memory overcommitment) or other virtualization overheads.

Solution:

The soft lockup messages are not kernel panics, and can be safely ignored. Some kernels allow you to adjust the soft lockup threshold by running the command:

echo time > /proc/sys/kernel/softlockup_thresh

Where time is the number of seconds after which a soft lockup is reported. The default is generally 10 seconds.
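
As a rough sketch of how the threshold could be checked from a script, the Python below looks for the tunable in a sysctl directory and reads its value. The file name softlockup_thresh comes from the command above; watchdog_thresh is the name newer kernels use for the related setting. The helper name read_lockup_threshold and its sysctl_dir parameter are illustrative assumptions, not part of any existing tool.

```python
from pathlib import Path

# Candidate tunables, in the order they are tried. softlockup_thresh is the
# name used in this article; watchdog_thresh is the name on newer kernels,
# where the soft lockup threshold is twice the value stored in the file.
CANDIDATES = ("softlockup_thresh", "watchdog_thresh")


def read_lockup_threshold(sysctl_dir="/proc/sys/kernel"):
    """Return (tunable_name, seconds) for the first tunable found, else None."""
    for name in CANDIDATES:
        path = Path(sysctl_dir) / name
        if path.is_file():
            return name, int(path.read_text().strip())
    return None
```

Calling read_lockup_threshold() with no arguments reads the live value on the running system. Passing a different directory makes it easy to try against a copy of the files.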

What the Error Looks Like:

abrt_version:   2.0.8

cgroup:

cmdline:        /usr/bin/ksh ./update_archive.ksh RMUAT

executable:     /bin/ksh93

kernel:         2.6.39-400.212.1.el6uek.x86_64

last_occurrence: 1403155682

pid:            25607

pwd:            /oracle/scripts/update_archive

time:           Thu 19 Jun 2014 12:28:02 AM CDT

uid:            502

username:       oracle

 

sosreport.tar.xz: Binary file, 3959288 bytes

 

environ:

:SHELL=/bin/sh

:OLDPWD=/home/oracle

:USER=oracle

:LD_LIBRARY_PATH=/lib:/usr/lib:/oracle/product/11.2.0/dbhome_1/lib

:PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin

:PWD=/oracle/scripts/update_archive

:JAVA_HOME=/usr/java

:LANG=en_US.UTF-8

:HOME=/home/oracle

:SHLVL=2

:LOGNAME=oracle

:'NLS_DATE_FORMAT=MM-DD-YYYY HH24:MI:SS'

:_=./update_archive.ksh

 

limits:

:Limit                     Soft Limit           Hard Limit           Units

:Max cpu time              unlimited            unlimited            seconds

:Max file size             unlimited            unlimited            bytes

:Max data size             unlimited            unlimited            bytes

:Max stack size            10485760             33554432             bytes

:Max core file size        0                    unlimited            bytes

:Max resident set          unlimited            unlimited            bytes

:Max processes             16384                16384                processes

:Max open files            1024                 65536                files

:Max locked memory         65536                65536                bytes

:Max address space         unlimited            unlimited            bytes

:Max file locks            unlimited            unlimited            locks

:Max pending signals       387052               387052               signals

:Max msgqueue size         819200               819200               bytes

:Max nice priority         0                    0

:Max realtime priority     0                    0

:Max realtime timeout      unlimited            unlimited            us

 

maps:

:00400000-0055b000 r-xp 00000000 fc:00 261718                             /bin/ksh93

:0075a000-0076d000 rw-p 0015a000 fc:00 261718                             /bin/ksh93

:0076d000-00773000 rw-p 00000000 00:00 0

:0096c000-0096e000 rw-p 0016c000 fc:00 261718                             /bin/ksh93

:3e8dc00000-3e8dc20000 r-xp 00000000 fc:00 1177356                        /lib64/ld-2.12.so

:3e8de1f000-3e8de20000 r--p 0001f000 fc:00 1177356                        /lib64/ld-2.12.so

:3e8de20000-3e8de21000 rw-p 00020000 fc:00 1177356                        /lib64/ld-2.12.so

:3e8de21000-3e8de22000 rw-p 00000000 00:00 0

:3e8e000000-3e8e002000 r-xp 00000000 fc:00 1177390                        /lib64/libdl-2.12.so

:3e8e002000-3e8e202000 ---p 00002000 fc:00 1177390                        /lib64/libdl-2.12.so

:3e8e202000-3e8e203000 r--p 00002000 fc:00 1177390                        /lib64/libdl-2.12.so

:3e8e203000-3e8e204000 rw-p 00003000 fc:00 1177390                        /lib64/libdl-2.12.so

:3e8e400000-3e8e58b000 r-xp 00000000 fc:00 1177360                        /lib64/libc-2.12.so

:3e8e58b000-3e8e78a000 ---p 0018b000 fc:00 1177360                        /lib64/libc-2.12.so

:3e8e78a000-3e8e78e000 r--p 0018a000 fc:00 1177360                        /lib64/libc-2.12.so

:3e8e78e000-3e8e78f000 rw-p 0018e000 fc:00 1177360                        /lib64/libc-2.12.so

:3e8e78f000-3e8e794000 rw-p 00000000 00:00 0

:3e8f000000-3e8f083000 r-xp 00000000 fc:00 1177398                        /lib64/libm-2.12.so

:3e8f083000-3e8f282000 ---p 00083000 fc:00 1177398                        /lib64/libm-2.12.so

:3e8f282000-3e8f283000 r--p 00082000 fc:00 1177398                        /lib64/libm-2.12.so

:3e8f283000-3e8f284000 rw-p 00083000 fc:00 1177398                        /lib64/libm-2.12.so

:3e98400000-3e98402000 r-xp 00000000 fc:00 1177423                        /lib64/libutil-2.12.so

:3e98402000-3e98601000 ---p 00002000 fc:00 1177423                        /lib64/libutil-2.12.so

:3e98601000-3e98602000 r--p 00001000 fc:00 1177423                        /lib64/libutil-2.12.so

:3e98602000-3e98603000 rw-p 00002000 fc:00 1177423                        /lib64/libutil-2.12.so

:7fea21b5d000-7fea21bcd000 rw-p 00000000 00:00 0

:7fea21bcd000-7fea27a5e000 r--p 00000000 fc:00 1441433                    /usr/lib/locale/locale-archive

:7fea27a5e000-7fea27aaa000 rw-p 00000000 00:00 0

:7fea27ab8000-7fea27ab9000 rw-p 00000000 00:00 0

:7fffdfc56000-7fffdfc77000 rw-p 00000000 00:00 0                          [stack]

:7fffdfde7000-7fffdfde8000 r-xp 00000000 00:00 0                          [vdso]

:ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

 

open_fds:

:0:pipe:[122311558]

:pos: 0

:flags:     00

:1:/oracle/log/RMUAT_update_archive_140619002801.log

:pos: 199

:flags:     0100001

:2:/oracle/log/RMUAT_update_archive_140619002801.log

:pos: 199

:flags:     0100001

Magnuson-Moss Warranty Act: Ever Heard of It?

The claim that using a replacement filter other than the original equipment brand will “void the warranty” is simply not true. If the consumer asks for the statement in writing, he will not receive it. Nevertheless, the consumer may feel uneasy about using replacement filters that are not original equipment. With the large number of do-it-yourselfers who prefer to install their own filters, this misleading claim should be corrected.

Under the Magnuson-Moss Warranty Act, 15 U.S.C. §§ 2301-2312 (1982), and general principles of the Federal Trade Commission Act, a manufacturer may not require the use of any brand of filter (or any other article) unless the manufacturer provides the item free of charge under the terms of the warranty.

So, if the consumer is told that only the original equipment filter will not void the warranty, he should request that the OE filter be supplied free of charge. If he is charged for the filter, the manufacturer will be violating the Magnuson-Moss Warranty Act and other applicable law.

By providing this information to consumers, the Filter Manufacturers Council can help to combat the erroneous claim that a brand of replacement filter other than the original equipment will “void the warranty.”

It should be noted that the Magnuson-Moss Warranty Act is a federal law that applies to consumer products. The Federal Trade Commission has authority to enforce the Magnuson-Moss Warranty Act, including obtaining injunctions and orders containing affirmative relief. In addition, a consumer can bring suit under the Magnuson-Moss Warranty Act.

Multiple Virtual CPUs are Causing Performance Issues

  1. Open a console prompt on the ESX host or initiate an SSH connection to it.
  2. Type esxtop and press Enter.
  3. In the CPU screen, check the %CSTP value. If this number is higher than 3.00, the performance issues may be caused by the vCPU count. Try lowering the vCPU count of the virtual machine by 1.
     Note: The %CSTP value represents the amount of time a virtual machine with multiple virtual CPUs is waiting to be scheduled on multiple cores on the physical host. The higher the value, the longer it waits and the worse its performance. Lowering the number of vCPUs reduces the scheduling wait time.
  4. In vCenter, you can also edit the realtime graphs to show Co-Stop.
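
If you collect %CSTP readings for several virtual machines, the 3.00 rule of thumb from step 3 can be applied in a few lines. This is only a sketch: the function name, the dictionary input and the sample readings are made up for illustration, and it does not parse real esxtop output.

```python
CSTP_THRESHOLD = 3.00  # percent; the rule-of-thumb value from step 3


def vms_over_costop_threshold(cstp_by_vm, threshold=CSTP_THRESHOLD):
    """Return the names of VMs whose %CSTP is higher than the threshold, worst first."""
    flagged = [(pct, name) for name, pct in cstp_by_vm.items() if pct > threshold]
    return [name for pct, name in sorted(flagged, reverse=True)]


# Made-up readings for illustration only, not real esxtop output.
sample = {"oracle-db": 7.42, "web01": 0.15, "fileserver": 3.00}
print(vms_over_costop_threshold(sample))
```

A reading of exactly 3.00 is not flagged, since the step says "higher than 3.00".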

As a general rule, I recommend starting with one processor and adding more only if it is really necessary. Case in point: I had a DBA who wanted 8 processors for his Oracle database. I initially gave them to him, but soon the DB began a power grab on the vHost. Once I convinced him it was his server causing the issues, we rolled the system back by several processors and the DB worked much better.

Metallic vs Ceramic Brake Pads

To heck with metallic pads.
Go with ceramic pads for your disk brake needs. They don’t make as much brake dust and they last so much longer.
Another thing to consider is that instead of turning your rotors, just replace them every other time you change the pads. It just doesn’t pay to turn them anymore.
Also, changing your pads and rotors is super easy and can be done with simple tools; the only “special” tool I used was a large C-clamp. It took about 45 minutes to do both sides, and I got a good health check of the car while I was under there.

GPG, Kleopatra, and PGP

I needed to set up a PGP encryption system to transfer some files around. Strictly for entertainment, and as a memory aid, here’s an example of the command lines you need to encrypt a file and then decrypt it.

Encrypt looks like this:

(Now sometimes you have to do this prerequisite)

gpg --edit-key email@****.com

trust
5 (select 5 if you ultimately trust the key)
save

Ultimate Encryption Command:

F:\test>gpg -r (NameOfCert) -o (NewEncryptedFile.pgp) -e (FileToBeEncrypted.pdf)

Decrypt looks like this:

F:\test>gpg --batch --yes --passphrase (your passphrase) -o (UnencryptedOutput.xml) -d (EncryptedFileInput.pgp)
gpg: encrypted with 4096-bit RSA key, ID ********, created 2014-10-20
"Certificate Name, Description, Etc. <Email Address>"
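
If you want to drive the same two commands from a script, one option is to build the argument lists in Python and hand them to subprocess. The sketch below only assembles the arguments; the function names and the loopback switch are my own illustrative choices, and the gpg flags are the ones shown above plus --pinentry-mode loopback, which GnuPG 2.1 and later need before they will accept --passphrase.

```python
def gpg_encrypt_args(recipient, plaintext_path, encrypted_path):
    """Argument list equivalent to: gpg -r RECIPIENT -o OUT -e IN"""
    return ["gpg", "-r", recipient, "-o", encrypted_path, "-e", plaintext_path]


def gpg_decrypt_args(passphrase, encrypted_path, output_path, loopback=False):
    """Argument list equivalent to the --batch decrypt command above."""
    args = ["gpg", "--batch", "--yes"]
    if loopback:
        # GnuPG 2.1 and later only honor --passphrase in loopback pinentry mode.
        args += ["--pinentry-mode", "loopback"]
    args += ["--passphrase", passphrase, "-o", output_path, "-d", encrypted_path]
    return args
```

To actually run one, pass the list to subprocess.run(args, check=True). Keep in mind that a passphrase given on the command line can be visible to other users of the machine in the process list.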

Pre-Seeding with 2012 DFSR

I built a Windows 2012 file server to upgrade from 2008 R2, and I used this robocopy command to pre-seed the file server in order to speed things up.

robocopy.exe "\\source server\d$" "d:" /b /e /copyall /r:6 /xd dfsrprivate /log:robo.log /tee /MT

We started out using DFSR to populate the machine, but it was taking too long and pulling from a remote server. The robocopy pre-seed copied the files from a local host and ran at line speed, resulting in a much faster time to completion.

Also, don’t forget to take advantage of a new feature in Windows 2012: data deduplication. I’m saving 20-30% on my file servers now that I’ve enabled dedupe.

 

I’ve had to come back and edit this because I discovered something that really helps ease the mind while performing the above operation, and that should be done as a spot check before introducing a pre-seeded system into DFS. Check the hashes of the directories and files to make sure that they are identical. This will allow a faster delta transition time between the old and new systems.

 

C:\Windows\system32>dfsrdiag filehash /filepath:\\onrfs01\d$\vacancy_monitoring

File Hash: DBCCC7FA-E523939F-835B14D5-31020191

Operation Succeeded
C:\Windows\system32>dfsrdiag filehash /filepath:\\onrfs02\d$\vacancy_monitoring

File Hash: DBCCC7FA-E523939F-835B14D5-31020191

Operation Succeeded
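
Comparing the two hashes by eye works for a spot check. For more than a handful of folders, something like the following Python sketch could do the comparison on captured dfsrdiag output. It assumes the output contains a "File Hash:" line in the format shown above; the function names are mine, and the script does not run dfsrdiag itself.

```python
import re

# Matches a line such as "File Hash: DBCCC7FA-E523939F-835B14D5-31020191".
HASH_LINE = re.compile(r"^File Hash:\s*([0-9A-Fa-f-]+)\s*$", re.MULTILINE)


def extract_file_hash(dfsrdiag_output):
    """Return the hash from captured 'dfsrdiag filehash' text, or None if absent."""
    match = HASH_LINE.search(dfsrdiag_output)
    return match.group(1).upper() if match else None


def hashes_match(source_output, target_output):
    """True only when both outputs contain a hash and the two hashes are equal."""
    source = extract_file_hash(source_output)
    target = extract_file_hash(target_output)
    return source is not None and source == target
```

A missing hash on either side counts as a mismatch, so a failed dfsrdiag run is never mistaken for a clean comparison.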

Setup High Availability with Sophos 9.x

Today I wanted to install a passive instance of Sophos UTM 9.x (we use version 9.307006 at the moment).

Our installation is entirely virtual: we only have virtual hosts (ESXi 5.5 2456374), Force10 switches and SAN gear.

The first thing to do is get your UTM set up and configured the way you want it. Put a couple of extra NICs in there for the future, get basic firewall functionality set up and “everything” working. OR, if you’ve already got a UTM set up, start by logging into your UTM shell as root and entering the following command:

cc set ha advanced virtual_mac 0

The above MUST be done for the HA system to work in the VMware environment.

Next, clone your existing system. I have an even/odd numbered vhost scheme going on so I changed the name of the existing UTM to UTM01 and cloned it from vHost01 to vHost02 as UTM02.

Once the clone snapshot completed, I logged into the UTM01 and went to:

Management, High Availability, and clicked the Configuration tab.

Here, select Hot Standby (active-passive)

Below in Configuration, select your NIC; I used the last one added to the system (eth7).

Then enter the device name (csutm01), select 1 as the device node, and set an encryption key.

 

Go ahead and apply all your settings (click both Apply buttons).

By now your clone should be done, DO NOT POWER IT ON.

Right click the VM, and disconnect all network cards except the one connected to the HA network.

Now, power up the UTM02 and open the console. Wait for the system to come to the login screen and use your root credentials to log in.

Now we will reset the configuration of the UTM02 to factory. MAKE SURE you are on the CORRECT SYSTEM!!

So, log in as root, then enter:

cc (enter)

RAW (enter)

system_factory_reset (enter)

The system will power off when complete. Once it has powered off, reconnect your internal interface. Power back up again and go through the basic setup settings. The only thing required is an internal network. Don’t configure anything else. (You may have to add a license file.)

Once the system allows you to log in, go to Management, High Availability, and click the Configuration tab.

Here, select Hot Standby (active-passive)

Below in Configuration, select your NIC; I used the last one added to the system (eth7).

Then enter the device name (csutm02), select 2 as the device node, and set an encryption key.

Go ahead and apply all your settings (click both Apply buttons).

The web interface will lock up indicating that you have lost connection to the secondary UTM02.

You should already be logged in to UTM01 and if you go to the High Availability menu, you should see the system UTM01 Active, or Master and the UTM02 status Syncing. It takes about 15 minutes for the system to stabilize so be patient.

There you have it. The above steps are exactly how I set up my three data centers and a development environment. If you have any trouble, please feel free to send me a message.

Non-Root User Permissions Oracle Linux

I’m working on a system recently migrated to Oracle Linux 6.6 from a very old Solaris system. There is a CIFS mount from a Windows 2012r2 server that existed on the old system. The raw mount point has 777 directory permissions.

[root@localhost ~]# ls -ld /datastore/
drwxrwxrwx 2 root root 4096 Jan 6 09:50 /datastore/

When the mount is active the permissions are:

[root@localhost ~]# ls -ld /datastore/
drwxr-xr-x 1 root root 634564 Jan 6 09:50 /datastore/

Users other than root cannot write to the share or create files. Looking at the old server, the permissions on files and subdirectories within the same share have the setuid bit. This is not present on the new system. The /etc/fstab looks like:

//cifshost/datastore /datastore cifs username=user,password=password,domain=mydomain.local 0 0

You’ll need to change /etc/fstab and add the file_mode=0666,dir_mode=0777 mount options.

//cifshost/datastore /datastore cifs username=user,password=password,domain=mydomain.local,file_mode=0666,dir_mode=0777 0 0
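
If you have several CIFS entries to update, the same change can be scripted. The Python below is a minimal sketch that appends the two mode options to a single fstab line. The function name is my own, and it assumes a standard whitespace-separated fstab entry with cifs in the third field.

```python
def add_cifs_modes(fstab_line, file_mode="0666", dir_mode="0777"):
    """Return the fstab entry with file_mode/dir_mode appended to its options."""
    fields = fstab_line.split()
    if len(fields) < 4 or fields[2] != "cifs":
        raise ValueError("not a CIFS fstab entry: %r" % fstab_line)
    # Drop any existing file_mode/dir_mode so the new values are the only ones.
    options = [opt for opt in fields[3].split(",")
               if not opt.startswith(("file_mode=", "dir_mode="))]
    options += ["file_mode=" + file_mode, "dir_mode=" + dir_mode]
    fields[3] = ",".join(options)
    return " ".join(fields)
```

The function only returns the new line, so you can review it before writing anything back to /etc/fstab.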

And you should be good to go!