================================
Device-mapper "unstriped" target
================================

Introduction
============

The device-mapper "unstriped" target provides a transparent mechanism to
unstripe a device-mapper "striped" target to access the underlying disks
without having to touch the true backing block-device.  It can also be
used to unstripe a hardware RAID-0 to access backing disks.

Parameters:
<number of stripes> <chunk size> <stripe #> <dev_path> <offset>

<number of stripes>
        The number of stripes in the RAID 0.

<chunk size>
        The number of 512B sectors in each chunk of the stripe.

<stripe #>
        The stripe number within the device that corresponds to the
        physical drive you wish to unstripe.  This is zero-indexed.

<dev_path>
        The block device you wish to unstripe.

<offset>
        The offset, in 512B sectors, into <dev_path> at which the striped
        data begins.  This is typically 0.
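
For example, the first table line from the NVMe example later in this
document maps onto these parameters as follows::

  0 512 unstriped 2 256 0 /dev/nvme0n1 0

The leading "0 512" is the start and length of the mapping in 512B
sectors; then 2 is the number of stripes, 256 is the chunk size in 512B
sectors (128K), 0 selects the first stripe, /dev/nvme0n1 is the device
being unstriped, and the trailing 0 is the offset.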


Why use this module?
====================

An example of undoing an existing dm-stripe
-------------------------------------------

This small bash script sets up 4 loop devices and uses the existing
striped target to combine the 4 devices into one.  It then uses the
unstriped target on top of the striped device to access the individual
backing loop devices.  Finally, it writes data to the newly exposed
unstriped devices and verifies that the data written matches the correct
underlying member of the striped array::

  #!/bin/bash

  MEMBER_SIZE=$((128 * 1024 * 1024))
  NUM=4
  SEQ_END=$((${NUM}-1))
  CHUNK=256
  BS=4096

  RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512))
  DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
  COUNT=$((${MEMBER_SIZE} / ${BS}))

  for i in $(seq 0 ${SEQ_END}); do
    dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
    losetup /dev/loop${i} member-${i}
    DM_PARMS+=" /dev/loop${i} 0"
  done

  echo $DM_PARMS | dmsetup create raid0
  for i in $(seq 0 ${SEQ_END}); do
    echo "0 1 unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i}
  done;

  for i in $(seq 0 ${SEQ_END}); do
    dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
    diff /dev/mapper/set-${i} member-${i}
  done;

  for i in $(seq 0 ${SEQ_END}); do
    dmsetup remove set-${i}
  done

  dmsetup remove raid0

  for i in $(seq 0 ${SEQ_END}); do
    losetup -d /dev/loop${i}
    rm -f member-${i}
  done
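
To inspect the mappings the script creates, dmsetup's "table" command can
be run against any of the devices before they are removed, for example::

  dmsetup table raid0
  dmsetup table set-0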

Another example
---------------

Some Intel NVMe drives contain two cores on the physical device.
Each core of the drive has segregated access to its LBA range.
The drive's current LBA model is a RAID 0 with a 128k chunk on each core,
resulting in a 256k stripe across the two cores::

   Core 0:       Core 1:
  __________    __________
  | LBA 512|    | LBA 768|
  | LBA 0  |    | LBA 256|
  ----------    ----------

The purpose of this unstriping is to provide better QoS in noisy
neighbor environments. When two partitions are created on the
aggregate drive without this unstriping, reads on one partition
can affect writes on another partition.  This is because the partitions
are striped across the two cores.  When we unstripe this hardware RAID 0
and create a partition on each newly exposed device, the two partitions
are physically separated.

With the dm-unstriped target we are able to segregate an fio workload
whose read and write jobs are independent of each other; a sketch of such
a workload is shown below.  Compared to running the same test on the
combined drive with partitions, this device-mapper target gave us a 92%
reduction in read latency.
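
As an illustration only (not the exact workload used for that
measurement), an fio command line along these lines keeps the read and
write jobs on separate unstriped devices.  The device names assume the
nvmset0/nvmset1 mappings created in the dmsetup example below::

  fio --direct=1 --ioengine=libaio --bs=4k --iodepth=32 \
      --runtime=60 --time_based \
      --name=reads  --filename=/dev/mapper/nvmset0 --rw=randread \
      --name=writes --filename=/dev/mapper/nvmset1 --rw=randwrite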


Example dmsetup usage
=====================

unstriped on top of an Intel NVMe device that has 2 cores
---------------------------------------------------------

::

  dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0'
  dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0'

There will now be two devices that expose Intel NVMe cores 0 and 1,
respectively::

  /dev/mapper/nvmset0
  /dev/mapper/nvmset1
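
The length of 512 sectors above is only an example.  In practice each
unstriped device would normally be sized to half of the underlying drive,
rounded down to a whole number of chunks, as in this sketch::

  # Total size of the drive in 512B sectors.
  SECTORS=$(blockdev --getsz /dev/nvme0n1)
  # Half the drive per core, rounded down to a multiple of the 256-sector chunk.
  LEN=$((SECTORS / 2 / 256 * 256))
  dmsetup create nvmset0 --table "0 ${LEN} unstriped 2 256 0 /dev/nvme0n1 0"
  dmsetup create nvmset1 --table "0 ${LEN} unstriped 2 256 1 /dev/nvme0n1 0"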

unstriped on top of striped with 4 drives using a 128K chunk size
-----------------------------------------------------------------

::

  dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0'
  dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0'
  dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0'
  dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0'
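
As before, the length of 512 sectors in these table lines is only an
example, and the chunk size of 256 is expressed in 512B sectors:
256 * 512B = 128K, matching the chunk size of the striped device being
undone.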