Friday, 12 June 2015

Test Solaris Root Mirror

Here's the situation. Being the good UNIX SysAdmin that you are, one of the first things you do is mirror the root pool. You do something like:

zpool attach -f rpool c0t5000CCA03C5A7C00d0 c0t5000CCA03C5C19CCd0


...wait for the mirror to finish resilvering...

installboot -f -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t5000CCA03C5C19CCd0

(or better, use bootadm install-bootloader; see the comments below for why)

(Notice that my disk devices don't use slices; otherwise there'd be an "s0" at the end of the disk names. Older ZFS systems needed the root disk to be on a slice, but this requirement has fallen away.)
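
The attach, wait, install sequence above can be scripted. This is a minimal sketch rather than a tested procedure: it assumes that zpool status prints a scan: line containing "resilver in progress" while the mirror is syncing, and the function names and the 60-second poll interval are made up for illustration.

```shell
#!/bin/sh
# Sketch: wait for a root-pool resilver to finish before installing the boot block.
# Assumption: `zpool status` shows "resilver in progress" on its scan: line
# while resilvering. The function names are hypothetical.

# Reads `zpool status` text on stdin; exits 0 while a resilver is reported.
resilver_in_progress() {
    grep 'scan:.*resilver in progress' >/dev/null
}

# Polls the named pool until no resilver is reported any more.
wait_for_resilver() {
    pool=$1
    while zpool status "$pool" | resilver_in_progress; do
        sleep 60
    done
}

# Usage on the Solaris host (nothing below is executed by this block):
#   zpool attach -f rpool c0t5000CCA03C5A7C00d0 c0t5000CCA03C5C19CCd0
#   wait_for_resilver rpool
#   bootadm install-bootloader
```

Keeping the text check in its own function means it can be tried against saved zpool status output before trusting it on a live pool.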

So to test that you can boot off the mirror, you go to the ok prompt and try to boot off the second disk:

shutdown -y -i0 -g0
...
ok> boot disk1
Boot device: /pci@3c0/pci@1/pci@0/pci@2/scsi@0/disk@p0  File and args:
ERROR: /packages/deblocker: Last Trap: Fast Data Access MMU Miss

So that's a bit of a bitch. Luckily, this is only a test. Start up your machine normally and then shut down with an init 0. Somehow, going down with init sorts this out.

(If it wasn't a test, you can try to specify the path old school. You can figure out the path, though I've had hit-and-miss success, by running devalias and probe-scsi-all and building a path similar to /pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca02584ad19,0:a. Sidenote: If that doesn't work I've had limited success by adding a to the last number before the comma.)

Either way, once you've got a booted system, you can check which disk you booted from by running prtconf -vp | grep bootpath.
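
If you want just the path rather than the whole grep line, a small filter does it. This is a sketch that assumes the prtconf -vp line has the form bootpath:  '/pci@.../disk@w...,0:a' with the value in single quotes; the function name boot_path is made up.

```shell
#!/bin/sh
# Sketch: extract the boot device path from `prtconf -vp` output.
# Assumption: the value on the bootpath: line is wrapped in single quotes.

# Reads prtconf -vp text on stdin; prints the bootpath value without quotes.
boot_path() {
    awk -F"'" '/bootpath:/ { print $2 }'
}

# Usage on the Solaris host (not executed by this block):
#   prtconf -vp | boot_path
```

The disk@w... part of the result is what you compare against the disk you meant to boot from.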

This post is a little neither here nor there, but that's because my testing has produced varying results and was done while I was changing from a SAS root disk to an SSD root disk. I'll update it as I retest.

Thursday, 4 June 2015

VLAN tagging in Solaris

If you want to have zones in multiple subnets but using the same physical port, you have to use VLAN tagging. VLAN tagging is pretty easy to configure on the zones (point 7), less so on the global zone.

  1. The Network guys have to do a few things for you:
    • set the network ports your nic connects to as "trunked"
    • give you the vlan id of the vlans you want to connect to (digits)
    • for aggregated NICs, set LACP to active (rather than auto)
    • set the default vlan-id of the ports to 1 
  2. NOTE: Once the ports are configured as trunked, traffic that isn't VLAN tagged no longer gets through. All or nothing baby.
  3. Your aggregate needs LACP activity to be active
      • dladm modify-aggr -L active -T short aggr0
  4. I use aggregates, but I think most of the steps below apply to IPMP as well.
  5. I wish you could add a default vlan ID to the aggregate when you create it but you can't (and I get the feeling if I think really hard about it, I'll be able to see the logic in why). Instead you have to create a vnic on the aggregate that uses that vlan ID:
      • dladm create-vnic -v 10 -l aggr0 vnic10
  6. Now create an address on that vnic
      • ipadm create-ip vnic10
      • ipadm create-addr -T static -a 196.0.10.15/24 vnic10
  7. That sorts out the global zone. For the zones it's pretty easy. Just set the vlan-id attribute (under anet) on the zone config.
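
Steps 3, 5, 6 and 7 can be collected into one helper that only prints the commands, so you can read them before running anything. This is a sketch: the function name, the zone name zone1 and the anet link name net0 are made up, and the zonecfg line is my reading of "set the vlan-id attribute under anet", so check it against your own zone config.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for one VLAN on an aggregate.
# The function name and all argument values are illustrative.

vlan_commands() {
    aggr=$1     # aggregate name, e.g. aggr0
    vid=$2      # VLAN ID given to you by the network team
    addr=$3     # address/prefix for the global zone
    zone=$4     # zone that should sit in the same VLAN (hypothetical name)
    zlink=$5    # linkname of that zone's anet resource (assumed)
    vnic="vnic$vid"

    echo "dladm modify-aggr -L active -T short $aggr"
    echo "dladm create-vnic -v $vid -l $aggr $vnic"
    echo "ipadm create-ip $vnic"
    echo "ipadm create-addr -T static -a $addr $vnic"
    # Zone side: set vlan-id on the zone's anet resource.
    echo "zonecfg -z $zone 'select anet linkname=$zlink; set vlan-id=$vid; end'"
}

# Example with the values from the steps above:
vlan_commands aggr0 10 196.0.10.15/24 zone1 net0
```

Once the output looks right, pipe it to sh on the global zone or paste the lines one at a time.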

NOTES:
  • The active LACP is not something I'm sure needs to be there but it worked so I'm leaving it.
  • IPMP in zones - if I recall correctly - needs vnics created for you to do IPMP within the zone. Just make sure you assign the correct vlan ID to those vnics and you should be fine.

Wednesday, 27 May 2015

Solaris 11.2 breaks my zpool import of clone with missing cache disk

1.       Assign disks to machine
root@prodmachine:~# sanlun lun show |grep zpooltest
zamgnasvm01          /vol/unixprod_vol99_zpooltest/unixprod_vol99_zpooltest_data01  /dev/rdsk/c0t600A0980383034716124465434593156d0s2 qlc3       FCP        1g      C
zamgnasvm01          /vol/unixprod_vol99_zpooltest/unixprod_vol99_zpooltest_data02  /dev/rdsk/c0t600A0980383034716124465434593157d0s2 qlc2       FCP        1g      C
zamgnasvm01          /vol/unixprod_vol99_zpooltest/unixprod_vol99_zpooltest_log01   /dev/rdsk/c0t600A0980383034716124465434593158d0s2 qlc2       FCP        1g      C
zamgnasvm01          /vol/unixprod_vol99_zpooltest/unixprod_vol99_zpooltest_cache01 /dev/rdsk/c0t600A0980383034716124465434593159d0s2 qlc1       FCP        1g      C
root@prodmachine:~#

2.       Create a zpool
root@prodmachine:~# zpool status zpooltest
  pool: zpooltest
 state: ONLINE
  scan: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        zpooltest                                  ONLINE       0     0     0
          mirror-0                                 ONLINE       0     0     0
            c0t600A0980383034716124465434593156d0  ONLINE       0     0     0
            c0t600A0980383034716124465434593157d0  ONLINE       0     0     0
        logs
          c0t600A0980383034716124465434593158d0    ONLINE       0     0     0
        cache
          c0t600A0980383034716124465434593159d0    ONLINE       0     0     0

errors: No known data errors
root@prodmachine:~#

3.       Use SAN replication to replicate disks to remote machine
4.       Check remote machine’s BE
root@drmachine:~# beadm list
BE                 Active Mountpoint Space  Policy Created
--                 ------ ---------- -----  ------ -------
solaris-5          -      -          53.83M static 2014-07-08 12:43
solaris-5-backup-1 NR     /          11.78G static 2014-10-13 14:34
solaris-7          -      -          15.05G static 2015-02-26 14:43
solaris-8          -      -          1.77G  static 2015-04-08 10:39
root@drmachine:~# pkg list entire
NAME (PUBLISHER)                                  VERSION                    IFO
entire                                            0.5.11-0.175.1.19.0.6.0    i--
root@drmachine:~#
5.       Only assign the data lun (i.e. no mirror, no log, no cache)
root@drmachine:~# sanlun lun show |grep zpooltest
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data01 /dev/rdsk/c0t600A0980443355634D3F46644573306Fd0s2 qlc2       FCP        1g      C
root@drmachine:~#
6.       Import the pool (with -f and -m options) RESULT: SUCCESS
root@drmachine:~# zpool import -R / -f -m zpooltest
root@drmachine:~# zpool status zpooltest
  pool: zpooltest
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or 'fmadm repaired', or replace the device
        with 'zpool replace'.
        Run 'zpool status -v' to see device specific details.
  scan: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        zpooltest                                  DEGRADED     0     0     0
          mirror-0                                 DEGRADED     0     0     0
            c0t600A0980443355634D3F46644573306Fd0  ONLINE       0     0     0
            4167588681570226322                    UNAVAIL      0     0     0
        logs
          16316075685438834122                     UNAVAIL      0     0     0
        cache
          c0t600A0980383034716124465434593159d0    UNAVAIL      0     0     0

errors: No known data errors
root@drmachine:~#
7.       Change to different BE and reboot
root@drmachine:~# beadm activate solaris-7
root@drmachine:~# init 6
root@drmachine:~# beadm list
BE                 Active Mountpoint Space   Policy Created
--                 ------ ---------- -----   ------ -------
solaris-5          -      -          53.83M  static 2014-07-08 12:43
solaris-5-backup-1 -      -          1.45G   static 2014-10-13 14:34
solaris-7          NR     /          25.94G  static 2015-02-26 14:43
solaris-8          -      -          1.77G   static 2015-04-08 10:39
root@drmachine:~# pkg list entire
NAME (PUBLISHER)                                  VERSION                    IFO
entire                                            0.5.11-0.175.2.5.0.5.0     i--
root@drmachine:~#
8.       Import the pool (with -f and -m) RESULT: FAILURE
root@drmachine:~# zpool import -R / -f -m zpooltest
cannot import 'zpooltest': one or more devices is currently unavailable
root@drmachine:~#
9.       Assign the mirror disk to the machine and retry import RESULT: FAILURE
root@drmachine:~# sanlun lun show
controller(7mode)/                                                                                          device                                            host                  lun
vserver(Cmode)       lun-pathname                                                                           filename                                          adapter    protocol   size    mode
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data01 /dev/rdsk/c0t600A0980443355634D3F46644573306Fd0s2 qlc2       FCP        1g      C
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data02 /dev/rdsk/c0t600A0980443355634D3F466445733070d0s2 qlc0       FCP        1g      C
root@drmachine:~# zpool import -R / -f -m zpooltest
cannot import 'zpooltest': one or more devices is currently unavailable
root@drmachine:~#
10.   Assign the log device disk to the machine and retry import RESULT: SUCCESS
root@drmachine:~# sanlun lun show
controller(7mode)/                                                                                          device                                            host                  lun
vserver(Cmode)       lun-pathname                                                                           filename                                          adapter    protocol   size    mode
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data01 /dev/rdsk/c0t600A0980443355634D3F46644573306Fd0s2 qlc2       FCP        1g      C
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data02 /dev/rdsk/c0t600A0980443355634D3F466445733070d0s2 qlc0       FCP        1g      C
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_log01  /dev/rdsk/c0t600A0980443355634D3F466445733071d0s2 qlc0       FCP        1g      C
root@drmachine:~# zpool import -R / -f -m zpooltest
root@drmachine:~#
11.    Export the zpool, unassign the log device, assign the cache device disk to the machine and retry import. RESULT: FAILURE
root@drmachine:~# sanlun lun show
controller(7mode)/                                                                                           device                                            host                  lun
vserver(Cmode)       lun-pathname                                                                            filename                                          adapter    protocol   size    mode
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data01  /dev/rdsk/c0t600A0980443355634D3F46644573306Fd0s2 qlc2       FCP        1g      C
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_data02  /dev/rdsk/c0t600A0980443355634D3F466445733070d0s2 qlc0       FCP        1g      C
zamgnasvmdr01        /vol/zamgnasvm01_unixprod_vol99_zpooltest_mirror_CLONE/unixprod_vol99_zpooltest_cache01 /dev/rdsk/c0t600A0980443355634D3F466445733072d0s2 qlc0       FCP        1g      C
root@drmachine:~# zpool import -R / -f -m zpooltest
cannot import 'zpooltest': one or more devices is currently unavailable
root@drmachine:~#
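
When comparing runs like these, it helps to pull just the missing devices out of zpool status. This is a small sketch based on the column layout shown in step 6; the function name is made up.

```shell
#!/bin/sh
# Sketch: list the devices that `zpool status` reports as UNAVAIL.
# Relies on the config table layout: device name in column 1, state in column 2.

# Reads `zpool status` text on stdin; prints one device name or GUID per line.
unavail_devices() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Usage on the Solaris host (not executed by this block):
#   zpool status zpooltest | unavail_devices
```

Against the step 6 output this lists the missing mirror half, the log device and the cache device.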




Got a call logged with Oracle to sort this out. Fixed with Solaris 11.3.
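
To see at a glance which update a boot environment is on, the pkg list entire output can be reduced to a release number. This is a sketch that assumes the 0.175 branch numbering seen above, where the third field of the branch is the update (0.175.1.x is 11.1, 0.175.2.x is 11.2); the function name is made up and other versioning schemes are not handled.

```shell
#!/bin/sh
# Sketch: derive the Solaris 11 update from the version of the `entire` package.
# Assumption: version looks like 0.5.11-0.175.<update>.<sru>... as in the
# listings above; other schemes are not handled.

# Reads `pkg list entire` text on stdin; prints e.g. "11.2".
solaris_update() {
    awk '$1 == "entire" {
        split($2, v, "-")      # v[2] is the branch, e.g. 0.175.2.5.0.5.0
        split(v[2], b, ".")    # b[3] is the update number
        print "11." b[3]
    }'
}

# Usage on the Solaris host (not executed by this block):
#   pkg list entire | solaris_update
```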