History log of /freebsd-10.3-release/sys/boot/zfs/zfsimpl.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 296373 04-Mar-2016 marius

- Copy stable/10@296371 to releng/10.3 in preparation for 10.3-RC1
builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.3.
- Update default pkg(8) configuration to use the quarterly branch.

Approved by: re (implicit)

# 294975 28-Jan-2016 smh

MFC r294040:

Prevent bogus compiler in ZFS boot code.

Sponsored by: Multiplay


# 293802 13-Jan-2016 allanjude

MFC: r293001
Introduce the ZFS Boot Environments menu to the loader menu

MFC: r293414
Add ZFS Boot Environments menu to userboot

MFC: r293454
Only call init_zfs_bootenv when the system is booted with ZFS

MFC: r293612
Fix calling init_zfs_bootenv to early, resulting in empty ZFS BE menu

Relnotes: yes
Sponsored by: ScaleEngine Inc.


# 284509 17-Jun-2015 avg

MFC r284025,284032: dnode_read: handle hole blocks in zfs boot code

PR: 199804


# 276081 22-Dec-2014 delphij

MFC r274337,r274673,274681,r275515:

ZFS large block support. The default recordsize remains at 128KB.

A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to
allow adjusting the permitted maximum record size, or
zfs_max_recordsize, with a default of 1MB. ZFS will not allow
setting recordsize greater than zfs_max_recordsize as a safety
belt, because larger recordsize means greater read and write
latency and more memory usage.

Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool).

Limited safety belt is provided for mounted root filesystem but use
caution when using a larger value.

Illumos issue:
5027 zfs large block support


# 268649 15-Jul-2014 delphij

MFC r268075: MFV r267565:

4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks


# 263397 19-Mar-2014 delphij

MFC r260150: MFV r259170:

4370 avoid transmitting holes during zfs send

4371 DMU code clean up

illumos/illumos-gate@43466aae47bfcd2ad9bf501faec8e75c08095e4f

NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 241293 06-Oct-2012 avg

zfs boot: export boot/primary pool and vdev guid all the way to kenv

This is work in progress to for znextboot and it also provides
some convenient infrastructure.

MFC after: 20 days


# 241291 06-Oct-2012 avg

zfs boot spa_status: print bootfs for each reported pool

MFC after: 9 days


# 241289 06-Oct-2012 avg

boot/zfs: call zfs_spa_init for all found pools

... and drop those for which it fails.
Also, add more sanity checking to the function.

MFC after: 16 days


# 241283 06-Oct-2012 avg

zfs boot: add code for listing child datasets of a given dataset

- only filesystem datasets are supported
- children names are printed to stdout

To do: allow to iterate over the list and fetch names programatically

MFC after: 17 days


# 240349 11-Sep-2012 avg

zfs boot: add a size check for a value in fzap_lookup

MFC after: 25 days


# 240348 11-Sep-2012 avg

zfs boot: print only an attribute name in fzap_list

... this matches mzap_list behavior

MFC after: 12 days


# 240347 11-Sep-2012 avg

zfs boot: fix/replace fzap_rlookup implementation

The previous one was totally bogus as it used hash value of
_output_ variable as an index for searching...
The only reliable way to do a reverse lookup here is to iterate
over all entries.

MFC after: 15 days


# 240346 11-Sep-2012 avg

zfs boot: bring zap_leaf_chunk field names in sync with kernel code

This change is cosmetic.

MFC after: 10 days


# 237001 13-Jun-2012 mm

Fix ZFS boot with pre-features pools (version <= 28) broken in r236884

Reported by: mav
MFC after: 1 month


# 236884 11-Jun-2012 mm

Introduce "feature flags" for ZFS pools (bump SPA version to 5000).
Add first feature "com.delphix:async_destroy" (asynchronous destroy
of ZFS datasets).
Implement features support in ZFS boot code.

Illumos revisions merged:
13700:2889e2596bd6
13701:1949b688d5fb
2619 asynchronous destruction of ZFS file systems
2747 SPA versioning with zfs feature flags

References:
https://www.illumos.org/issues/2619
https://www.illumos.org/issues/2747

Obtained from: illumos (issue #2619, #2747)
MFC after: 1 month


# 235390 13-May-2012 avg

zfs boot code: mark spa_t arguments as const where they are used as such

MFC after: 1 month


# 235364 12-May-2012 avg

sparc64/zfs boot: take advantage of new libzfsboot capabilities

Also drop the now unneeded compatibility shims.

Tested by: marius
MFC after: 1 month


# 235361 12-May-2012 avg

zfs boot code: use %j and uintmax_t instead %ll and uint64_t in printfs

This is to silence warnings that result from different definitions of
uint64_t on different architectures, specifically i386 and sparc64.

MFC after: 1 month


# 235329 12-May-2012 avg

zfsboot/zfsloader: support accessing filesystems within a pool

In zfs loader zfs device name format now is "zfs:pool/fs",
fully qualified file path is "zfs:pool/fs:/path/to/file"
loader allows accessing files from various pools and filesystems as well
as changing currdev to a different pool/filesystem.

zfsboot accepts kernel/loader name in a format pool:fs:path/to/file or,
as before, pool:path/to/file; in the latter case a default filesystem
is used (pool root or bootfs). zfsboot passes guids of the selected
pool and dataset to zfsloader to be used as its defaults.

zfs support should be architecture independent and is provided
in a separate library, but architectures wishing to use this zfs support
still have to provide some glue code and their devdesc should be
compatible with zfs_devdesc.
arch_zfs_probe method is used to discover all disk devices that may
be part of ZFS pool(s).

libi386 unconditionally includes zfs support, but some zfs-specific
functions are stubbed out as weak symbols. The strong definitions
are provided in libzfsboot.
This change mean that the size of i386_devspec becomes larger
to match zfs_devspec.

Backward-compatibility shims are provided for recently added sparc64
zfs boot support. Currently that architecture still works the old
way and does not support the new features.

TODO:
- clear up pool root filesystem vs pool bootfs filesystem distinction
- update sparc64 support
- set vfs.root.mountfrom based on currdev (for zfs)

Mid-future TODO:
- loader sub-menu for selecting alternative boot environment

Distant future TODO:
- support accessing snapshots, using a snapshot as readonly root

Reviewed by: marius (sparc64),
Gavin Mu <gavin.mu@gmail.com> (sparc64)
Tested by: Florian Wagner <florian@wagner-flo.net> (x86),
marius (sparc64)
No objections: fs@, hackers@
MFC after: 1 month


# 228266 04-Dec-2011 avg

zfs boot: allow file vdevs to be used in testing (e.g. with zfsboottest)

MFC after: 1 week


# 226568 20-Oct-2011 pjd

- Correctly read gang header from raidz.
- Decompress assembled gang block data if compressed.
- Verify checksum of a gang header.
- Verify checksum of assembled gang block data.
- Verify checksum of uber block.

Submitted by: avg
MFC after: 3 days


# 226553 19-Oct-2011 pjd

Always pass data size for checksum verification function, as using
physical block size declared in bp may not always be what we want.
For example in case of gang block header physical block size declared
in bp is much larger than SPA_GANGBLOCKSIZE (512 bytes) and checksum
calculation failed. This bug could lead to accessing unallocated
memory and resets/failures during boot.

MFC after: 3 days


# 226552 19-Oct-2011 pjd

Never pass NULL block pointer when reading. This is neither expected nor
handled by lower layers like vdev_raidz, which uses bp for checksum
verification. This bug could lead to NULL pointer reference and resets
during boot.

MFC after: 3 days


# 226551 19-Oct-2011 pjd

Don't mark vdev as healthy too soon, so we won't try to use invalid vdevs.

MFC after: 3 days


# 219089 27-Feb-2011 pjd

Finally... Import the latest open-source ZFS version - (SPA) 28.

Few new things available from now on:

- Data deduplication.
- Triple parity RAIDZ (RAIDZ3).
- zfs diff.
- zpool split.
- Snapshot holds.
- zpool import -F. Allows to rewind corrupted pool to earlier
transaction group.
- Possibility to import pool in read-only mode.

MFC after: 1 month


# 213136 24-Sep-2010 pjd

- Split code shared by almost any boot loader into separate files and
clean up most layering violations:

sys/boot/i386/common/rbx.h:

RBX_* defines
OPT_SET()
OPT_CHECK()

sys/boot/common/util.[ch]:

memcpy()
memset()
memcmp()
bcpy()
bzero()
bcmp()
strcmp()
strncmp() [new]
strcpy()
strcat()
strchr()
strlen()
printf()

sys/boot/i386/common/cons.[ch]:

ioctrl
putc()
xputc()
putchar()
getc()
xgetc()
keyhit() [now takes number of seconds as an argument]
getstr()

sys/boot/i386/common/drv.[ch]:

struct dsk
drvread()
drvwrite() [new]
drvsize() [new]

sys/boot/common/crc32.[ch] [new]

sys/boot/common/gpt.[ch] [new]

- Teach gptboot and gptzfsboot about new files. I haven't touched the
rest, but there is still a lot of code duplication to be removed.

- Implement full GPT support. Currently we just read primary header and
partition table and don't care about checksums, etc. After this change we
verify checksums of primary header and primary partition table and if
there is a problem we fall back to backup header and backup partition
table.

- Clean up most messages to use prefix of boot program, so in case of an
error we know where the error comes from, eg.:

gptboot: unable to read primary GPT header

- If we can't boot, print boot prompt only once and not every five
seconds.

- Honour newly added GPT attributes:

bootme - this is bootable partition
bootonce - try to boot from this partition only once
bootfailed - we failed to boot from this partition

- Change boot order of gptboot to the following:

1. Try to boot from all the partitions that have both 'bootme'
and 'bootonce' attributes one by one.
2. Try to boot from all the partitions that have only 'bootme'
attribute one by one.
3. If there are no partitions with 'bootme' attribute, boot from
the first UFS partition.

- The 'bootonce' functionality is implemented in the following way:

1. Walk through all the partitions and when 'bootonce'
attribute is found without 'bootme' attribute, remove
'bootonce' attribute and set 'bootfailed' attribute.
'bootonce' attribute alone means that we tried to boot from
this partition, but boot failed after leaving gptboot and
machine was restarted.
2. Find partition with both 'bootme' and 'bootonce' attributes.
3. Remove 'bootme' attribute.
4. Try to execute /boot/loader or /boot/kernel/kernel from that
partition. If succeeded we stop here.
5. If execution failed, remove 'bootonce' and set 'bootfailed'.
6. Go to 2.

If whole boot succeeded there is new /etc/rc.d/gptboot script coming
that will log all partitions that we failed to boot from (the ones with
'bootfailed' attribute) and will remove this attribute. It will also
find partition with 'bootonce' attribute - this is the partition we
booted from successfully. The script will log success and remove the
attribute.

All the GPT updates we do here goes to both primary and backup GPT if
they are valid. We don't touch headers or partition tables when
checksum doesn't match.

Reviewed by: arch (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
MFC after: 2 weeks


# 212387 09-Sep-2010 pjd

Remove empty lines committed by accident.

MFC after: 2 weeks


# 212384 09-Sep-2010 pjd

Ignore log vdevs.

MFC after: 2 weeks


# 212383 09-Sep-2010 pjd

Allow to boot from a pool within which replacing is in progress.
Before the change it wasn't possible and the following error was printed:

ZFS: can only boot from disk, mirror or raidz vdevs

Now if the original vdev (the one we are replacing) is still present we will
read from it, but if it is not present we won't read from the new vdev, as it
might not have enough valid data yet.

MFC after: 2 weeks


# 212382 09-Sep-2010 pjd

Remove duplicated code.

MFC after: 2 weeks


# 211091 09-Aug-2010 mm

Return EIO if vdev->v_phys_read is NULL.

This fixes booting from a ZFS mirror with a unavailable primary device.

PR: kern/148655
Reviewed by: avg
Approved by: delphij (mentor)
MFC after: 3 days


# 208610 28-May-2010 avg

boot/zfs: fix gang block reading code

- use correct size (512) while reading a gang block
- skip holes while reading child blocks
- advance buffer pointer while reading child blocks

PR: 144214
MFC after: 10 days


# 201690 06-Jan-2010 delphij

Space cleanup for revision 201689 committed separately for easier review.
This commit is purely space changes.

Submitted by: Matt Reimer
Sponsored by: VPOP Technologies, Inc.
MFC after: 2 weeks


# 201689 06-Jan-2010 delphij

Instead of assuming all vdevs are healthy, check the newest vdev label
for each vdev's status. Booting from a degraded vdev should now be
more robust.

Submitted by: Matt Reimer <mattjreimer at gmail.com>
Sponsored by: VPOP Technologies, Inc.
MFC after: 2 weeks


# 200309 09-Dec-2009 jhb

- Port bios_getmem() from libi386 to {gpt,}zfsboot() and use it to
safely allocate a heap region above 1MB. This enables {gpt,}zfsboot()
to allocate much larger buffers than before.
- Use a larger buffer (1MB instead of 128K) for temporary ZFS buffers. This
allows more reliable reading of compressed files in a raidz/raidz2 pool.

Submitted by: Matt Reimer mattjreimer of gmail
MFC after: 1 week


# 198420 23-Oct-2009 rnoland

Correct some issues with zfs boot.

- Teach it to read gang blocks. (essentially untested)
If you see "ZFS: gang block detected!", please let
me know, so we can either remove the printf if it
works, or fix it if it doesn't.

- If multiple partitions exist on a disk, probe them all.
We also need to reset dsk->start to 0 to read the right
sector here.

- With GPT, we can have 128 partitions.

- If the bootfs property has ever been set on a pool
it seems that it never goes away. zpool won't allow
you to add to the pool with the bootfs property set.
However, if you clear the property back to default
we end up getting 0 for the object number and read
a bogus block pointer and fail to boot.

- Fix some error printfs. The printf in the loader is
only capable of c,s and u formats.

- Teach printf how to display %llu

Reviewed by: dfr, jhb
MFC after: 2 weeks


# 192194 16-May-2009 dfr

Add support for booting from raidz1 and raidz2 pools.


# 186243 17-Dec-2008 dfr

Use full 64bit arithmetic when converting file offsets to block numbers - fixes
booting on filesystems with inode numbers with values above 4194304.

Submitted by: ps


# 185852 10-Dec-2008 dfr

Don't get confused if we encounter a device which is part of a raidz or raidz2
pool while probing for vdevs.

PR: 129539
Submitted by: Paul Wootton (paul at fletchermoorland dot co dot uk)


# 185097 19-Nov-2008 dfr

Some zfsboot fixes from Norikatsu Shigemura:

1. zfsboot2 (boot2) doesn't %d (printf), so change %d to %u.
2. chase new zpool versioning as SPA_VERSION.
Obtained from: sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h

Submitted by: nork


# 185029 17-Nov-2008 pjd

Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

This bring huge amount of changes, I'll enumerate only user-visible changes:

- Delegated Administration

Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.

- L2ARC

Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.

- slog

Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).

- vfs.zfs.super_owner

Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.

- chflags(2)

Not all the flags are supported. This still needs work.

- ZFSBoot

Support to boot off of ZFS pool. Not finished, AFAIK.

Submitted by: dfr

- Snapshot properties

- New failure modes

Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.

- Sparse volumes

ZVOLs that don't reserve space in the pool.

- External attributes

Compatible with extattr(2).

- NFSv4-ACLs

Not sure about the status, might not be complete yet.

Submitted by: trasz

- Creation-time properties

- Regression tests for zpool(8) command.

Obtained from: OpenSolaris