2021-04-26 06:27:21 +00:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2021-09-06 14:23:14 +00:00
|
|
|
/*
|
|
|
|
* This program is free software; you can redistribute it and/or
|
|
|
|
* modify it under the terms of the GNU General Public
|
|
|
|
* License v2 as published by the Free Software Foundation.
|
|
|
|
*
|
|
|
|
* This program is distributed in the hope that it will be useful,
|
|
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
|
|
|
* General Public License for more details.
|
|
|
|
*
|
|
|
|
* You should have received a copy of the GNU General Public
|
|
|
|
* License along with this program; if not, write to the
|
|
|
|
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
|
|
|
|
* Boston, MA 021110-1307, USA.
|
|
|
|
*/
|
2021-04-26 06:27:21 +00:00
|
|
|
|
|
|
|
#ifndef __BTRFS_ZONED_H__
|
|
|
|
#define __BTRFS_ZONED_H__
|
|
|
|
|
|
|
|
#include "kerncompat.h"
|
2023-08-28 20:12:13 +00:00
|
|
|
#include <sys/types.h>
|
2021-04-26 06:27:21 +00:00
|
|
|
#include <stdbool.h>
|
2023-08-28 20:12:13 +00:00
|
|
|
#include "kernel-lib/bitops.h"
|
|
|
|
#include "kernel-lib/sizes.h"
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
#include "kernel-shared/disk-io.h"
|
2021-04-26 06:27:27 +00:00
|
|
|
#include "kernel-shared/volumes.h"
|
2023-04-19 21:17:14 +00:00
|
|
|
#include "kernel-shared/messages.h"
|
2021-04-26 06:27:21 +00:00
|
|
|
|
2023-08-28 20:12:13 +00:00
|
|
|
struct btrfs_block_group;
|
|
|
|
struct btrfs_fs_info;
|
|
|
|
|
2021-04-26 06:27:21 +00:00
|
|
|
#ifdef BTRFS_ZONED
|
|
|
|
#include <linux/blkzoned.h>
|
|
|
|
#else
|
|
|
|
|
|
|
|
struct blk_zone {
|
|
|
|
int dummy;
|
|
|
|
};
|
|
|
|
|
|
|
|
#endif /* BTRFS_ZONED */
|
|
|
|
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
/* Number of superblock log zones */
|
|
|
|
#define BTRFS_NR_SB_LOG_ZONES 2
|
|
|
|
|
2022-04-06 01:43:10 +00:00
|
|
|
/*
|
|
|
|
* Location of the first zone of superblock logging zone pairs.
|
|
|
|
*
|
|
|
|
* - primary superblock: 0B (zone 0)
|
|
|
|
* - first copy: 512G (zone starting at that offset)
|
|
|
|
* - second copy: 4T (zone starting at that offset)
|
|
|
|
*/
|
|
|
|
#define BTRFS_SB_LOG_PRIMARY_OFFSET (0ULL)
|
|
|
|
#define BTRFS_SB_LOG_FIRST_OFFSET (512ULL * SZ_1G)
|
|
|
|
#define BTRFS_SB_LOG_SECOND_OFFSET (4096ULL * SZ_1G)
|
|
|
|
|
|
|
|
#define BTRFS_SB_LOG_FIRST_SHIFT const_ilog2(BTRFS_SB_LOG_FIRST_OFFSET)
|
|
|
|
#define BTRFS_SB_LOG_SECOND_SHIFT const_ilog2(BTRFS_SB_LOG_SECOND_OFFSET)
|
|
|
|
|
2021-04-26 06:27:21 +00:00
|
|
|
/*
|
|
|
|
* Zoned block device models
|
|
|
|
*/
|
|
|
|
enum btrfs_zoned_model {
|
|
|
|
ZONED_NONE,
|
|
|
|
ZONED_HOST_AWARE,
|
|
|
|
ZONED_HOST_MANAGED,
|
|
|
|
};
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Zone information for a zoned block device.
|
|
|
|
*/
|
|
|
|
struct btrfs_zoned_device_info {
|
|
|
|
enum btrfs_zoned_model model;
|
|
|
|
u64 zone_size;
|
|
|
|
u32 nr_zones;
|
|
|
|
struct blk_zone *zones;
|
2021-09-22 07:53:43 +00:00
|
|
|
bool emulated;
|
2021-04-26 06:27:21 +00:00
|
|
|
};
|
|
|
|
|
|
|
|
enum btrfs_zoned_model zoned_model(const char *file);
|
|
|
|
u64 zone_size(const char *file);
|
|
|
|
int btrfs_get_zone_info(int fd, const char *file,
|
|
|
|
struct btrfs_zoned_device_info **zinfo);
|
|
|
|
int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info);
|
2021-04-26 06:27:22 +00:00
|
|
|
int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info);
|
2021-04-26 06:27:21 +00:00
|
|
|
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
#ifdef BTRFS_ZONED
|
|
|
|
size_t btrfs_sb_io(int fd, void *buf, off_t offset, int rw);
|
|
|
|
|
2021-05-11 04:25:01 +00:00
|
|
|
/*
|
|
|
|
* Read BTRFS_SUPER_INFO_SIZE bytes from fd to buf
|
|
|
|
*
|
|
|
|
* @fd fd of the device to be read from
|
|
|
|
* @buf: buffer contains a super block
|
|
|
|
* @offset: offset of the superblock
|
|
|
|
*
|
|
|
|
* Return count of bytes successfully read.
|
|
|
|
*/
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
static inline size_t sbread(int fd, void *buf, off_t offset)
|
|
|
|
{
|
|
|
|
return btrfs_sb_io(fd, buf, offset, READ);
|
|
|
|
}
|
|
|
|
|
2021-05-11 04:25:01 +00:00
|
|
|
/*
|
|
|
|
* Write BTRFS_SUPER_INFO_SIZE bytes from buf to fd
|
|
|
|
*
|
|
|
|
* @fd fd of the device to be written to
|
|
|
|
* @buf: buffer contains a super block
|
|
|
|
* @offset: offset of the superblock
|
|
|
|
*
|
|
|
|
* Return count of bytes successfully written.
|
|
|
|
*/
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
static inline size_t sbwrite(int fd, void *buf, off_t offset)
|
|
|
|
{
|
|
|
|
return btrfs_sb_io(fd, buf, offset, WRITE);
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:27 +00:00
|
|
|
static inline bool zone_is_sequential(struct btrfs_zoned_device_info *zinfo,
|
|
|
|
u64 bytenr)
|
|
|
|
{
|
|
|
|
unsigned int zno;
|
|
|
|
|
|
|
|
if (!zinfo || zinfo->model == ZONED_NONE)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
zno = bytenr / zinfo->zone_size;
|
|
|
|
return zinfo->zones[zno].type == BLK_ZONE_TYPE_SEQWRITE_REQ;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool btrfs_dev_is_empty_zone(struct btrfs_device *device, u64 pos)
|
|
|
|
{
|
|
|
|
struct btrfs_zoned_device_info *zinfo = device->zone_info;
|
|
|
|
unsigned int zno;
|
|
|
|
|
|
|
|
if (!zone_is_sequential(zinfo, pos))
|
|
|
|
return true;
|
|
|
|
|
|
|
|
zno = pos / zinfo->zone_size;
|
|
|
|
return zinfo->zones[zno].cond == BLK_ZONE_COND_EMPTY;
|
|
|
|
}
|
|
|
|
|
2023-09-14 16:05:35 +00:00
|
|
|
bool zoned_profile_supported(u64 map_type, bool rst);
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
int btrfs_reset_dev_zone(int fd, struct blk_zone *zone);
|
2021-04-26 06:27:27 +00:00
|
|
|
u64 btrfs_find_allocatable_zones(struct btrfs_device *device, u64 hole_start,
|
|
|
|
u64 hole_end, u64 num_bytes);
|
2021-04-26 06:27:28 +00:00
|
|
|
int btrfs_load_block_group_zone_info(struct btrfs_fs_info *fs_info,
|
|
|
|
struct btrfs_block_group *cache);
|
2021-04-26 06:27:31 +00:00
|
|
|
bool btrfs_redirty_extent_buffer_for_zoned(struct btrfs_fs_info *fs_info,
|
|
|
|
u64 start, u64 end);
|
2021-04-26 06:27:32 +00:00
|
|
|
int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info, u64 devid,
|
|
|
|
u64 offset, u64 length);
|
2024-05-29 07:13:21 +00:00
|
|
|
int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count);
|
2021-04-26 06:27:34 +00:00
|
|
|
int zero_zone_blocks(int fd, struct btrfs_zoned_device_info *zinfo, off_t start,
|
|
|
|
size_t len);
|
2021-04-26 06:27:40 +00:00
|
|
|
int btrfs_wipe_temporary_sb(struct btrfs_fs_devices *fs_devices);
|
2023-10-12 14:19:30 +00:00
|
|
|
bool btrfs_sb_zone_exists(struct btrfs_device *device, u64 bytenr);
|
2021-04-26 06:27:34 +00:00
|
|
|
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
#else
|
|
|
|
|
|
|
|
#define sbread(fd, buf, offset) \
|
btrfs-progs: stop using legacy *64 interfaces
The *64 interfaces, such as fstat64, off64_t, etc, are legacy interfaces
created at a time when 64-bit file support was still new. They are
generally exposed when defining a macro named _LARGEFILE64_SOURCE, as
e.g. the glibc docs[0] say.
The modern way to utilise largefile support, is to continue to use the
regular interfaces (off_t, fstat, ..), and define _FILE_OFFSET_BITS=64.
We already use the autoconf macro AC_SYS_LARGEFILE[1] which arranges this
and sets this macro for us. Therefore, we can utilise the non-64 names
without fear of breaking on 32-bit systems.
This fixes the build against musl libc, ever since musl dropped the
*64 compat from interfaces by default[2] just for _GNU_SOURCE, unless
_LARGEFILE64_SOURCE is defined. However, there are plans for a future
removal of the whole *64 header API, and that workaround (adding another
define) might cease to exist.
So, rename all *64 API use to the regular non-suffixed names. For
consistency, rename the internal functions that were *64 named
(lstat64_path, ..) too.
This should have no regressions on any platform.
[0]: https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html#index-_005fLARGEFILE64_005fSOURCE
[1]: https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/System-Services.html
[2]: https://github.com/bminor/musl/commit/25e6fee27f4a293728dd15b659170e7b9c7db9bc
Pull-request: #615
Signed-off-by: psykose <alice@ayaya.dev>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-15 17:15:49 +00:00
|
|
|
pread(fd, buf, BTRFS_SUPER_INFO_SIZE, offset)
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
#define sbwrite(fd, buf, offset) \
|
btrfs-progs: stop using legacy *64 interfaces
The *64 interfaces, such as fstat64, off64_t, etc, are legacy interfaces
created at a time when 64-bit file support was still new. They are
generally exposed when defining a macro named _LARGEFILE64_SOURCE, as
e.g. the glibc docs[0] say.
The modern way to utilise largefile support, is to continue to use the
regular interfaces (off_t, fstat, ..), and define _FILE_OFFSET_BITS=64.
We already use the autoconf macro AC_SYS_LARGEFILE[1] which arranges this
and sets this macro for us. Therefore, we can utilise the non-64 names
without fear of breaking on 32-bit systems.
This fixes the build against musl libc, ever since musl dropped the
*64 compat from interfaces by default[2] just for _GNU_SOURCE, unless
_LARGEFILE64_SOURCE is defined. However, there are plans for a future
removal of the whole *64 header API, and that workaround (adding another
define) might cease to exist.
So, rename all *64 API use to the regular non-suffixed names. For
consistency, rename the internal functions that were *64 named
(lstat64_path, ..) too.
This should have no regressions on any platform.
[0]: https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html#index-_005fLARGEFILE64_005fSOURCE
[1]: https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/System-Services.html
[2]: https://github.com/bminor/musl/commit/25e6fee27f4a293728dd15b659170e7b9c7db9bc
Pull-request: #615
Signed-off-by: psykose <alice@ayaya.dev>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-15 17:15:49 +00:00
|
|
|
pwrite(fd, buf, BTRFS_SUPER_INFO_SIZE, offset)
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
|
|
|
|
static inline int btrfs_reset_dev_zone(int fd, struct blk_zone *zone)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:27 +00:00
|
|
|
static inline bool zone_is_sequential(struct btrfs_zoned_device_info *zinfo,
|
|
|
|
u64 bytenr)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline u64 btrfs_find_allocatable_zones(struct btrfs_device *device,
|
|
|
|
u64 hole_start, u64 hole_end,
|
|
|
|
u64 num_bytes)
|
|
|
|
{
|
|
|
|
return hole_start;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool btrfs_dev_is_empty_zone(struct btrfs_device *device, u64 pos)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:28 +00:00
|
|
|
static inline int btrfs_load_block_group_zone_info(
|
|
|
|
struct btrfs_fs_info *fs_info, struct btrfs_block_group *cache)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:31 +00:00
|
|
|
static inline bool btrfs_redirty_extent_buffer_for_zoned(
|
|
|
|
struct btrfs_fs_info *fs_info, u64 start, u64 end)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:32 +00:00
|
|
|
static inline int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info,
|
|
|
|
u64 devid, u64 offset, u64 length)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2024-05-29 07:13:21 +00:00
|
|
|
static inline int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count)
|
2021-04-26 06:27:33 +00:00
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:34 +00:00
|
|
|
static inline int zero_zone_blocks(int fd,
|
|
|
|
struct btrfs_zoned_device_info *zinfo,
|
|
|
|
off_t start, size_t len)
|
|
|
|
{
|
|
|
|
return -EOPNOTSUPP;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:40 +00:00
|
|
|
static inline int btrfs_wipe_temporary_sb(struct btrfs_fs_devices *fs_devices)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2023-09-14 16:05:35 +00:00
|
|
|
static inline bool zoned_profile_supported(u64 map_type, bool rst)
|
2022-02-07 16:40:25 +00:00
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2023-10-12 14:19:30 +00:00
|
|
|
static inline bool btrfs_sb_zone_exists(struct btrfs_device *device, u64 bytenr)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
btrfs-progs: zoned: implement log-structured superblock
Superblock (and its copies) is the only data structure in btrfs which has a
fixed location on a device. Since we cannot overwrite in a sequential write
required zone, we cannot place superblock in the zone. One easy solution
is limiting superblock and copies to be placed only in conventional zones.
However, this method has two downsides: one is reduced number of superblock
copies. The location of the second copy of superblock is 256GB, which is in
a sequential write required zone on typical devices in the market today.
So, the number of superblock and copies is limited to be two. Second
downside is that we cannot support devices which have no conventional zones
at all.
To solve these two problems, we employ superblock log writing. It uses two
adjacent zones as a circular buffer to write updated superblocks. Once the
first zone is filled up, start writing into the second one. Then, when
both zones are filled up and before starting to write to the first zone
again, reset the first zone.
We can determine the position of the latest superblock by reading write
pointer information from a device. One corner case is when both zones are
full. For this situation, we read out the last superblock of each zone, and
compare them to determine which zone is older.
The following zones are reserved as the circular buffer on ZONED btrfs.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If these reserved zones are conventional, superblock is written fixed at
the start of the zone without logging.
Currently, superblock reading/writing is done by pread/pwrite. This
commit replace the call sites with sbread/sbwrite to wrap the functions.
For zoned btrfs, btrfs_sb_io which is called from sbread/sbwrite
reverses the IO position back to a mirror number, maps the mirror number
into the superblock logging position, and do the IO.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-26 06:27:26 +00:00
|
|
|
#endif /* BTRFS_ZONED */
|
|
|
|
|
2022-04-06 01:43:10 +00:00
|
|
|
/*
|
|
|
|
* Get the first zone number of the superblock mirror
|
|
|
|
*/
|
|
|
|
static inline u32 sb_zone_number(int shift, int mirror)
|
|
|
|
{
|
|
|
|
u64 zone = 0;
|
|
|
|
|
|
|
|
ASSERT(0 <= mirror && mirror < BTRFS_SUPER_MIRROR_MAX);
|
|
|
|
switch (mirror) {
|
|
|
|
case 0: zone = 0; break;
|
|
|
|
case 1: zone = 1ULL << (BTRFS_SB_LOG_FIRST_SHIFT - shift); break;
|
|
|
|
case 2: zone = 1ULL << (BTRFS_SB_LOG_SECOND_SHIFT - shift); break;
|
|
|
|
}
|
|
|
|
|
|
|
|
ASSERT(zone <= U32_MAX);
|
|
|
|
|
|
|
|
return (u32)zone;
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:27 +00:00
|
|
|
static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
|
|
|
|
{
|
|
|
|
return zone_is_sequential(device->zone_info, pos);
|
|
|
|
}
|
|
|
|
|
2021-04-26 06:27:21 +00:00
|
|
|
#endif /* __BTRFS_ZONED_H__ */
|