Oracle Solaris 10 ZFS Administration

Troubleshooting ZFS Issues and Recovering ZFS Data
 
From docs.oracle.com:

ZFS Troubleshooting and Data Recovery.

zdb(1M)


From www.oracle.com:

Oracle Solaris White Paper, Configuring Oracle Solaris ZFS for an Oracle Database.


Sun Internals

ZFS Troubleshooting Guide

ZFS Best Practices Guide

ZFS Evil Tuning Guide

ZFS Configuration Guide

ZFS for Databases

ZFS forensics scrollback script


Bloggers

Neelakanth Nadgir's blog Databases and ZFS

Neelakanth Nadgir's blog Optimizing MySQL Performance with ZFS - Slides available

Roch (rhymes with Spock) Bourbonnais : Kernel Performance Engineering Weblog ZFS to UFS Performance Comparison on Day 1.

Eric Kustarz's Weblog FS perf 201 : Postmark popular benchmark - Netapp's postmark. Let's see how long it takes to do 1,000,000 transactions.

Chad Mynhier's Weblog "To timidly go where many have gone before" ZFS benchmarking and ZFS I/O reordering benchmark.

Jeff Bonwick's Weblog ZFS Block Allocation describes device selection, metaslab selection and block selection in a ZFS file system.

Eric Schrock's Weblog ZFS Hot Spares.

Bill Moore's Blog Site Flippin' off bits Ditto Blocks - The Amazing Tape Repellent.

Peerapong Kunasirirat's weblog, Solaris ZFS Performance (compared to VxFS, Linux Ext3 and W2K NTFS).

Bob Netherton's Weblog, ZFS and FMA - Two great tastes ......

Ben Rockwood, zdb: Examining ZFS At Point-Blank Range

Multithreaded Musings, Stand back - I'm a scientist! Seven Years of Good Luck: Splitting Mirrors Splitting a ZFS pool for backups.

Neil Perrin's Weblog, slog blog (or blogging on slogging) (See ZFS Logging below.)

Bill Pijewski's Blog Our ZFS I/O Throttle

Vincent Dumouchel, Monitoring zpool with email alert

Daren Sefcik, Replace a failing drive in a ZFS Zpool


OpenSolaris

ZFS Troubleshooting and Data Recovery

These are the in-core data structures that represent a single file in ZFS.


Other

ZFS Troubleshooting and Data Recovery

The Automatic Performance Tuning in the Zettabyte File System white paper, by Val Henson, Matt Ahrens and Jeff Bonwick.

The Existential QoS for Storage white paper, by Val Henson, Matt Ahrens and Jeff Bonwick.

The Management of NFS Performance With Solaris ZFS.

From Princeton University, Unix Systems, ZFS Management.

A very good tutorial by Richard Elling, ZFS Tutorial USENIX LISA09 Conference.


Examining ZFS On-Disk Format Using mdb and zdb: Max Bruning

The slides for the above presentation.


ultra20:/> kstat -m zfs
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               3213883392
        c_max                           3213883392
        c_min                           401735424
        crtime                          37.815241035
        deleted                         1758
        demand_data_hits                118362
        demand_data_misses              3231
        demand_metadata_hits            196183
        demand_metadata_misses          3875
        evict_skip                      0
        hash_chain_max                  3
        hash_chains                     636
        hash_collisions                 35430
        hash_elements                   9949
        hash_elements_max               10038
        hdr_size                        1691424
        hits                            338259
        l2_abort_lowmem                 0
        l2_cksum_bad                    0
        l2_evict_lock_retry             0
        l2_evict_reading                0
        l2_feeds                        0
        l2_free_on_write                0
        l2_hdr_size                     0
        l2_hits                         0
        l2_io_error                     0
        l2_misses                       0
        l2_rw_clash                     0
        l2_size                         0
        l2_writes_done                  0
        l2_writes_error                 0
        l2_writes_hdr_miss              0
        l2_writes_sent                  0
        memory_throttle_count           0
        mfu_ghost_hits                  24
        mfu_hits                        220931
        misses                          10531
        mru_ghost_hits                  271
        mru_hits                        93720
        mutex_miss                      0
        p                               1609673728
        prefetch_data_hits              327
        prefetch_data_misses            811
        prefetch_metadata_hits          23387
        prefetch_metadata_misses        2614
        recycle_miss                    0
        size                            311499280
        snaptime                        186510.040517787

module: zfs                             instance: 0
name:   vdev_cache_stats                class:    misc
        crtime                          37.815271531
        delegations                     2406
        hits                            2282
        misses                          3769
        snaptime                        186510.04192546
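
The hits and misses counters under arcstats above give a rough ARC hit ratio. The following is a minimal sketch rather than an official tool: the function name arc_hit_ratio is made up for illustration, it assumes the plain "kstat -m zfs" layout shown above, and it assumes a POSIX awk (on Solaris 10, nawk or /usr/xpg4/bin/awk rather than the older /usr/bin/awk).

```shell
#!/bin/sh
# arc_hit_ratio (hypothetical helper): read `kstat -m zfs` output on stdin
# and print the ARC hit ratio as a percentage with two decimals.
arc_hit_ratio() {
    awk '
        # Track which kstat we are in; vdev_cache_stats also has hits/misses.
        $1 == "name:" { in_arc = ($2 == "arcstats") }
        in_arc && $1 == "hits"   { hits = $2 }
        in_arc && $1 == "misses" { misses = $2 }
        END {
            total = hits + misses
            if (total > 0) printf "%.2f\n", 100 * hits / total
            else print "no ARC lookups recorded"
        }
    '
}

# Usage on a live system (not run here):
#   kstat -m zfs | arc_hit_ratio
```

With the counters shown above (338259 hits, 10531 misses) the function prints 96.98.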


ultra20:/> zdb -h
zdb: illegal option -- h
Usage: zdb [-udibcsvL] [-U cachefile_path] [-O order] [-B os:obj:level:blkid] [-S user:cksumalg] dataset [object...]
       zdb -C [pool]
       zdb -l dev
       zdb -R pool:vdev:offset:size:flags
       zdb [-p path_to_vdev_dir]
       zdb -e pool | GUID | devid ...
        -u uberblock
        -d datasets
        -C cached pool configuration
        -i intent logs
        -b block statistics
        -c checksum all data blocks
        -s report stats on zdb's I/O
        -S : -- dump blkptr signatures
        -v verbose (applies to all others)
        -l dump label contents
        -L live pool (allows some errors)
        -O [!] visitation order
        -U cachefile_path -- use alternate cachefile
        -B objset:object:level:blkid -- simulate bad block
        -R read and display block from a device
        -e Pool is exported/destroyed/has altroot
        -p  (use with -e)
Specify an option more than once (e.g. -bb) to make only that option verbose
Default is to dump everything non-verbosely


ultra20:/> zdb -vv
rpool
    version=10
    name='rpool'
    state=0
    txg=598222
    pool_guid=11081266947880784355
    hostid=610922745
    hostname=''
    vdev_tree
        type='root'
        id=0
        guid=11081266947880784355
        children[0]
                type='mirror'
                id=0
                guid=5256371532805648474
                whole_disk=0
                metaslab_array=15
                metaslab_shift=30
                ashift=9
                asize=159944015872
                is_log=0
                children[0]
                        type='disk'
                        id=0
                        guid=6624362033263488114
                        path='/dev/dsk/c1t0d0s0'
                        devid='id1,sd@ASEAGATE_STA7216SASUN160G_0650M9G0Q4=5LS9G0Q4/a'
                        phys_path='/pci@0,0/pci108e,534d@5/disk@0,0:a'
                        whole_disk=0
                        DTL=61
                children[1]
                        type='disk'
                        id=1
                        guid=1671266508294047211
                        path='/dev/dsk/c1t1d0s0'
                        devid='id1,sd@f2469f0f9494c3bad0007e1e50000/a'
                        phys_path='/pci@0,0/pci108e,534d@5/disk@1,0:a'
                        whole_disk=0
                        DTL=60
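
Several fields in the configuration above are base-2 exponents rather than sizes: ashift is the log2 of the smallest (sector-sized) unit the vdev is addressed in, metaslab_shift is the log2 of the metaslab size, and asize is a plain byte count. The arithmetic can be sketched as follows; the helper name shift_to_bytes is invented for illustration and the values are copied from the output above.

```shell
#!/bin/sh
# shift_to_bytes (hypothetical helper): convert a power-of-two exponent,
# such as ashift or metaslab_shift, into a byte count.
shift_to_bytes() {
    echo $((1 << $1))
}

# Values copied from the zdb output above.
ashift=9
metaslab_shift=30
asize=159944015872

echo "sector size:   $(shift_to_bytes "$ashift") bytes"
echo "metaslab size: $(shift_to_bytes "$metaslab_shift") bytes"
# Integer division: how many whole metaslabs fit in asize.
echo "whole metaslabs in asize: $((asize / (1 << metaslab_shift)))"
```

For this pool the sketch reports 512-byte sectors, 1073741824-byte (1 GiB) metaslabs, and 148 whole metaslabs within asize.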

From Richard Elling, zilstat, How do you know if a separate ZIL log device will help your ZFS performance?

Setting Up Separate ZFS Logging Devices

The ZFS intent log (ZIL) is provided to satisfy POSIX requirements for synchronous transactions. For example, databases often require their transactions to be on stable storage devices when returning from a system call. NFS and other applications can also use fsync() to ensure data stability. By default, the ZIL is allocated from blocks within the main storage pool. However, better performance might be possible by using separate intent log devices in your ZFS storage pool, such as with NVRAM or a dedicated disk.

Log devices for the ZFS intent log are not related to database log files.

You can set up a ZFS logging device when the storage pool is created or after the pool is created. For example:

# zpool create datap mirror c1t1d0 c1t2d0 mirror c1t3d0 c1t4d0 log mirror c1t5d0 c1t8d0
# zpool status
  pool: datap
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        datap       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
        logs        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c1t8d0  ONLINE       0     0     0

errors: No known data errors

The following example shows how to add a mirrored log device to a mirrored storage pool.

# zpool status newpool
  pool: newpool
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        newpool      ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0

errors: No known data errors


# zpool add newpool log mirror c1t11d0 c1t12d0
# zpool status newpool
  pool: newpool
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        newpool      ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
        logs         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0

errors: No known data errors
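
To check from a script whether a pool has separate log devices, the logs section of the status output can be parsed. This is only a sketch under assumptions: the function name log_devices is hypothetical, it relies on the "zpool status" layout shown above (which may differ between releases), and it assumes a POSIX awk such as nawk.

```shell
#!/bin/sh
# log_devices (hypothetical helper): read `zpool status` output on stdin
# and print the leaf devices listed under the "logs" section.
log_devices() {
    awk '
        $1 == "logs" { in_logs = 1; next }
        # Another top-level section or the end of the config table stops it.
        $1 == "cache" || $1 == "spares" || $1 == "errors:" { in_logs = 0 }
        # Skip grouping rows such as "mirror"; print only leaf devices.
        in_logs && NF >= 2 && $1 !~ /^(mirror|raidz)/ { print $1 }
    '
}

# Usage on a live system (not run here):
#   zpool status newpool | log_devices
```

For the second newpool listing above, the function prints c1t11d0 and c1t12d0, one per line; for the first listing, which has no logs section, it prints nothing.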

ZFS log device recovery: In the Solaris 10 10/09 release, ZFS identifies intent log failures in the zpool status command. FMA reports these errors as well. Both ZFS and FMA describe how to recover from an intent log failure.

For example, if the system shuts down abruptly before synchronous write operations are committed to a pool with a separate log device, you will see intent-log related error messages in the zpool status output. For information about resolving log device failures, see the Solaris ZFS Administration Guide.


Using cache devices in your ZFS storage pool: In the Solaris 10 10/09 release, you can create a pool and specify cache devices, which are used to cache storage pool data. Cache devices provide an additional layer of caching between main memory and disk. Using cache devices provides the greatest performance improvement for random-read workloads of mostly static content.

One or more cache devices can be specified when the pool is created. For example:


# zpool create pool mirror c0t2d0 c0t4d0 cache c0t0d0
# zpool status pool
  pool: pool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
        cache
          c0t0d0    ONLINE       0     0     0

errors: No known data errors

For information about determining whether using cache devices is appropriate for your environment, see the Solaris ZFS Administration Guide.
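
Once a cache device is in use, the l2_hits and l2_misses counters in the arcstats kstat give one rough indication of how often it serves reads. The sketch below is an illustration, not a sizing method: the function name l2arc_hit_ratio is hypothetical, it expects the parseable output of "kstat -p zfs:0:arcstats" (one "module:instance:name:statistic value" pair per line), and it assumes a POSIX awk such as nawk.

```shell
#!/bin/sh
# l2arc_hit_ratio (hypothetical helper): read `kstat -p zfs:0:arcstats`
# output on stdin and print the cache-device hit ratio as a percentage.
l2arc_hit_ratio() {
    awk '
        $1 ~ /:l2_hits$/   { hits = $2 }
        $1 ~ /:l2_misses$/ { misses = $2 }
        END {
            total = hits + misses
            # Without a cache device, or before any traffic, both are 0.
            if (total == 0) { print "no L2ARC traffic"; exit }
            printf "%.2f\n", 100 * hits / total
        }
    '
}

# Usage on a live system (not run here):
#   kstat -p zfs:0:arcstats | l2arc_hit_ratio
```

On a system with no cache device, both counters are zero and the function reports that there is no L2ARC traffic instead of dividing by zero.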